[Bug]: Discord agent session remains routable after timeout, causing partial-success plus generic failure #72810

### Summary

A Discord-routed agent turn can complete useful side effects, then remain stuck in `processing` until the CLI timeout fires. OpenClaw then surfaces the generic user-facing failure message even though the work may already have been posted/applied. Later routing can still appear to target the wedged session, which makes verifier/worker state ambiguous and can trigger redundant follow-up dispatches.

This may be related to the `claude-cli` regression tracked in #72434, but the problematic behavior here is the session-health/routing outcome after a terminal timeout.

### Environment

- OpenClaw: `2026.4.24` from npm stable
- Channel: Discord
- Agent model: `anthropic/claude-opus-4-7` via the Claude CLI-backed path
- OS: macOS

### Observed behavior

1. A Discord agent turn starts and performs useful side effects. In the observed case, a review/verdict message and a local state update were successfully recorded.
2. The same session remains in `processing` and is reported as stuck for several minutes.
3. At the 900s CLI timeout, OpenClaw terminates the candidate and posts/surfaces the generic failure text:

```text
Something went wrong while processing your request. Please try again, or use /new to start a fresh session.
```

4. Follow-up routing is ambiguous: the agent looks like it can still receive work, but the session is effectively dead/wedged. A later verification had to be recovered from a separate route, while a redundant follow-up dispatch was created because the original verifier path looked silent.

### Sanitized log shape

```text
[diagnostic] lane task error: lane=session:agent:<agent>:discord:channel:<redacted>:active-memory:<redacted> durationMs=<small> error="Error: Requested agent harness "claude-cli" is not registered and PI fallback is disabled."
[diagnostic] stuck session: sessionId=unknown sessionKey=agent:<agent>:discord:channel:<redacted> state=processing age=<minutes>s queueDepth=1
[model-fallback/decision] model fallback decision: decision=candidate_failed requested=anthropic/claude-opus-4-7 candidate=anthropic/claude-opus-4-7 reason=timeout next=none detail=CLI exceeded timeout (900s) and was terminated.
Embedded agent failed before reply: CLI exceeded timeout (900s) and was terminated.
```
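To make the `stuck session` line above easier to act on, a watchdog could parse it into fields and compare the age against a threshold. The following is a minimal sketch, not OpenClaw code: `StuckSessionDiagnostic`, `parseStuckSession`, and `isWedged` are hypothetical names, and it assumes the real log line carries a concrete integer in `age=` (the sanitized shape only shows a placeholder).

```typescript
// Hypothetical shape for one "stuck session" diagnostic line.
interface StuckSessionDiagnostic {
  sessionId: string;
  sessionKey: string;
  state: string;
  ageSeconds: number;
  queueDepth: number;
}

// Parse the key=value pairs that follow the "stuck session:" marker.
// Returns null when the line is not a stuck-session diagnostic or is incomplete.
function parseStuckSession(line: string): StuckSessionDiagnostic | null {
  const marker = "[diagnostic] stuck session:";
  const start = line.indexOf(marker);
  if (start === -1) return null;

  const fields = new Map<string, string>();
  for (const token of line.slice(start + marker.length).trim().split(/\s+/)) {
    const eq = token.indexOf("=");
    // Split on the first "=" only, so values containing ":" stay intact.
    if (eq > 0) fields.set(token.slice(0, eq), token.slice(eq + 1));
  }

  const sessionKey = fields.get("sessionKey");
  const state = fields.get("state");
  // parseInt tolerates the trailing "s" unit, e.g. "412s" parses as 412.
  const ageSeconds = Number.parseInt(fields.get("age") ?? "", 10);
  const queueDepth = Number.parseInt(fields.get("queueDepth") ?? "", 10);
  if (!sessionKey || !state || Number.isNaN(ageSeconds) || Number.isNaN(queueDepth)) {
    return null;
  }
  return {
    sessionId: fields.get("sessionId") ?? "unknown",
    sessionKey,
    state,
    ageSeconds,
    queueDepth,
  };
}

// Assumed policy: a session is wedged if it is still processing past the threshold.
function isWedged(diag: StuckSessionDiagnostic, thresholdSeconds: number): boolean {
  return diag.state === "processing" && diag.ageSeconds >= thresholdSeconds;
}
```

A watchdog built on this could flag the session well before the 900s CLI timeout instead of waiting for the generic failure.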

### Expected behavior

After a fatal timeout or pre-reply embedded-agent failure, OpenClaw should make the session health unambiguous. Any of these would be safer than silently continuing to route to the wedged session:

- mark the session failed/dead and require `/new`,
- automatically reset/roll the session before accepting more work,
- route the next turn to a fresh session,
- or surface a clear `session timed out; previous side effects may have completed` state instead of only the generic failure message.
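The first three options above could be sketched as a small registry that refuses to hand a failed session to the router. This is an illustration under stated assumptions, not OpenClaw's routing code: `SessionRegistry`, `markFailed`, `resolve`, and the `routeKey#generation` ID scheme are all hypothetical.

```typescript
type Health = "healthy" | "failed";

// Hypothetical registry: one current session per route key, rolled on failure.
class SessionRegistry {
  private health = new Map<string, Health>();
  private generation = new Map<string, number>();

  // Session ID for the current generation of this route.
  currentSessionId(routeKey: string): string {
    return `${routeKey}#${this.generation.get(routeKey) ?? 0}`;
  }

  // Called after a fatal timeout or pre-reply embedded-agent failure.
  markFailed(routeKey: string): void {
    this.health.set(this.currentSessionId(routeKey), "failed");
  }

  // Returns the session that should receive the next turn.
  // A failed session is never returned; the route rolls to a fresh one instead.
  resolve(routeKey: string): string {
    const current = this.currentSessionId(routeKey);
    if (this.health.get(current) === "failed") {
      this.generation.set(routeKey, (this.generation.get(routeKey) ?? 0) + 1);
      return this.currentSessionId(routeKey);
    }
    return current;
  }
}
```

With a guard like this, a verifier dispatched after the timeout would land on a fresh session rather than on the wedged one. Whether to roll automatically or require `/new` is a policy choice the sketch does not settle.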

If side effects completed before the final timeout, the user-facing state should distinguish partial-success/late-failure from total failure.
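One way to express that distinction is a tagged outcome type that separates a late failure from a total failure. The sketch below is hypothetical: `TurnRecord`, `TurnOutcome`, `classifyTurn`, and `userMessage` are illustrative names, and it assumes the runtime can list which side effects were recorded before the terminal error.

```typescript
// Hypothetical record of what a turn did before it ended.
interface TurnRecord {
  sideEffects: string[];   // side effects recorded before any terminal error
  terminalError?: string;  // e.g. "CLI exceeded timeout (900s) and was terminated."
}

type TurnOutcome =
  | { kind: "success" }
  | { kind: "late_failure"; completedSideEffects: string[]; reason: string }
  | { kind: "total_failure"; reason: string };

// Separate "work happened, then the turn failed" from "nothing happened".
function classifyTurn(turn: TurnRecord): TurnOutcome {
  if (turn.terminalError === undefined) return { kind: "success" };
  if (turn.sideEffects.length > 0) {
    return {
      kind: "late_failure",
      completedSideEffects: [...turn.sideEffects],
      reason: turn.terminalError,
    };
  }
  return { kind: "total_failure", reason: turn.terminalError };
}

// User-facing text; only the total-failure case keeps the generic message.
function userMessage(outcome: TurnOutcome): string {
  switch (outcome.kind) {
    case "success":
      return "Done.";
    case "late_failure":
      return (
        `Session failed after partial work (${outcome.reason}) ` +
        `Previous side effects may have completed: ${outcome.completedSideEffects.join(", ")}. ` +
        "Use /new to start a fresh session."
      );
    case "total_failure":
      return "Something went wrong while processing your request. Please try again, or use /new to start a fresh session.";
  }
}
```

In the observed case, the posted verdict message and the local state update would put the turn in `late_failure`, so the user would see what already happened instead of only the generic text.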

### Impact

- Users cannot tell whether the work failed or succeeded.
- Verifier/worker workflows can create duplicate dispatches because the original route appears silent.
- A watchdog sees `processing` for many minutes, but the user-facing chat gets only a generic failure at the end.
- The recovery path becomes manual: inspect logs/state, identify whether side effects completed, and route a fresh verifier/session by hand.

### Redaction note

This report intentionally redacts Discord IDs, session IDs, dispatch IDs, local paths, project names, internal agent nicknames, and exact local timestamps. The included log snippets preserve only the error shape needed to diagnose the runtime behavior.

