fix(gateway): broadcast idle timeout errors to clients after agent run started #85176
JulyanXu wants to merge 1 commit
…n started When an LLM idle timeout occurs after the agent has started (e.g., after tool calls), the error is returned as a normal payload with isError:true rather than thrown. The .then() handler only called emitUserTranscriptUpdate() in the agent-started branch, never checking deliveredReplies for error payloads and never calling broadcastChatError. Connected clients received no error feedback. Now checks deliveredReplies for isError payloads when agentRunStarted=true and calls broadcastChatError when found. Closes openclaw#84945
Codex review: needs real behavior proof before merge.
Workflow note: Future ClawSweeper reviews update this same comment in place.
Reproducibility: yes, for source-level reproduction: make
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.
Review details
Best possible solution: Merge a fix that broadcasts the agent-started returned error payload and records the matching terminal chat dedupe state as an error, with focused gateway coverage and redacted real runtime proof.
Do we have a high-confidence way to reproduce the issue? Yes for source-level reproduction: make
Is this the best way to solve the issue? No; the direction is right, but the current patch should also write the terminal chat dedupe entry as an error and add focused coverage. Source-only proof is not enough for this external PR before merge.
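One way to picture the review's suggestion of recording the terminal chat dedupe state as an error is sketched below. This is only an illustration under stated assumptions: `TerminalDedupe`, `TerminalEntry`, `finishRunWithError`, the `runId` key, and the one-argument `broadcastChatError` callback are all hypothetical names and shapes, not the gateway's actual dedupe structure or API.

```typescript
// Hypothetical model of a terminal chat dedupe store; every name here is illustrative.
type TerminalState = "final" | "error";

interface TerminalEntry {
  state: TerminalState;
  message?: string;
}

class TerminalDedupe {
  private readonly entries = new Map<string, TerminalEntry>();

  // Records the first terminal outcome for a run; later writes are ignored.
  record(runId: string, entry: TerminalEntry): boolean {
    if (this.entries.has(runId)) {
      return false;
    }
    this.entries.set(runId, entry);
    return true;
  }

  get(runId: string): TerminalEntry | undefined {
    return this.entries.get(runId);
  }
}

// Writes the terminal entry as an error and broadcasts only on the first write,
// so the stored state and what clients were told stay consistent.
function finishRunWithError(
  dedupe: TerminalDedupe,
  runId: string,
  message: string,
  broadcastChatError: (message: string) => void, // assumed signature
): boolean {
  const isFirstTerminalWrite = dedupe.record(runId, { state: "error", message });
  if (isFirstTerminalWrite) {
    broadcastChatError(message);
  }
  return isFirstTerminalWrite;
}
```

In this sketch a second terminal write for the same run is a no-op, which is the property a dedupe entry would need for the stored state to match the broadcast error.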
Overall correctness: patch is incorrect
Codex review notes: model gpt-5.5, reasoning high; reviewed against c8a35c4645dc.
Summary
When an LLM idle timeout occurs after the agent has started (e.g., after tool calls), the error is returned as a normal `{ text, isError: true }` payload via the `deliver` callback; it is NOT thrown. The `.then()` handler at line 2882 only called `emitUserTranscriptUpdate()` in the `agentRunStarted=true` branch, never checking `deliveredReplies` for error payloads and never calling `broadcastChatError`. Connected clients received no error feedback: the response silently stopped.

This fix checks `deliveredReplies` for payloads with `isError: true` when `agentRunStarted=true` and `!hasBeforeAgentRunGate`. If error payloads are found, `broadcastChatError` is called to notify clients. If not, the existing `emitUserTranscriptUpdate()` path is preserved.

Closes #84945
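The branch described in this summary can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in `src/gateway/server-methods/chat.ts`: the `ReplyPayload` and `CompletionHooks` shapes, the `handleAgentRunCompletion` function, the one-argument `broadcastChatError` signature, the `Outcome` return value, and the fallback message are all assumptions made for the example.

```typescript
// Hypothetical shapes; the real types in the gateway may differ.
interface ReplyPayload {
  text?: string;
  isError?: boolean;
}

interface CompletionHooks {
  broadcastChatError: (message: string) => void; // assumed signature
  emitUserTranscriptUpdate: () => void;
}

type Outcome = "error-broadcast" | "transcript-update" | "skipped";

function handleAgentRunCompletion(
  agentRunStarted: boolean,
  hasBeforeAgentRunGate: boolean,
  deliveredReplies: ReplyPayload[],
  hooks: CompletionHooks,
): Outcome {
  // Only the agent-started, ungated branch is modelled in this sketch.
  if (!agentRunStarted || hasBeforeAgentRunGate) {
    return "skipped";
  }

  // Errors such as an idle timeout arrive as delivered payloads, not exceptions.
  const errorReplies = deliveredReplies.filter((reply) => reply.isError === true);
  if (errorReplies.length > 0) {
    const message =
      errorReplies
        .map((reply) => reply.text ?? "")
        .filter((text) => text.length > 0)
        .join("\n") || "Agent run failed"; // hypothetical fallback text
    hooks.broadcastChatError(message);
    return "error-broadcast";
  }

  // No error payloads: keep the pre-existing transcript update path.
  hooks.emitUserTranscriptUpdate();
  return "transcript-update";
}
```

The point of the sketch is the ordering: the error check runs before the transcript update, so a returned `isError` payload is surfaced to clients instead of being treated as a normal completion.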
Verification
- `src/gateway/server-methods/chat.ts` (20 lines added, 5 removed)
- `broadcastChatError` is already used in the `.catch()` handler at line 2938 with the same signature
- `deliveredReplies` is already filtered for `isError` in the `!agentRunStarted` branch (`btwReplies`)
- `emitUserTranscriptUpdate()` as before

Real behavior proof
- `state: "error"` chat event with the timeout message
- `broadcastChatError` fires for timeout errors in the agent-started path