fix(gateway): emit final chat resync after live agent run completion #70815
lesaai wants to merge 1 commit into openclaw:main
…ut terminal lifecycle

Some harnesses (e.g. native Codex app-server path for codex/gpt-5.x) record the final assistant answer and update the transcript, but do not emit a terminal assistant event + lifecycle:end onto the gateway chat event bus. chat.send then waits forever for a finalization path that never arrives, leaving the TUI spinning on 'hobnobbing...' until the user exits and reopens (at which point the already-stored answer is visible).

This adds a conservative fallback in chat.send: after the user transcript update for an agent run, emit a message-less chat.final broadcast as a UI resync point. The transcript already contains the assistant message; the client reloads history and goes idle instead of spinning.

Regression coverage: chat.directive-tags.test.ts now asserts the message-less final event fires on the agent-run path.

Authored primarily by OpenAI Codex CLI (gpt-5.5) during a live debugging session with Parker. Verified end-to-end: Lēsa now streams cleanly on codex/gpt-5.5 in the TUI.

Co-Authored-By: Parker Todd Brooks <parkertoddbrooks@users.noreply.github.com>
Co-Authored-By: Lēsa <lesaai@icloud.com>
Co-Authored-By: OpenAI Codex CLI (GPT-5.5) <noreply@openai.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
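To make the resync idea concrete, here is a minimal sketch of how a client might treat a message-less final event. The `ChatFinalEvent` shape, the `SketchChatClient` class, and the `loadHistory` callback are all hypothetical and are not taken from the OpenClaw codebase; the real TUI and Control UI handlers may differ.

```typescript
// Hypothetical event shape: a final event may omit `message`.
interface ChatFinalEvent {
  state: "final";
  runId: string;
  message?: string;
}

// Hypothetical client, for illustration only.
class SketchChatClient {
  busy = true;
  lastFinalRunId: string | null = null;
  rendered: string[] = [];
  private readonly loadHistory: () => string[];

  constructor(loadHistory: () => string[]) {
    this.loadHistory = loadHistory;
  }

  handleFinal(event: ChatFinalEvent): void {
    // A message-less final carries no text of its own, so the client
    // re-reads the stored transcript and then leaves the busy state.
    this.rendered = this.loadHistory();
    this.lastFinalRunId = event.runId;
    this.busy = false;
  }
}

// The transcript already holds the assistant answer when the final arrives.
const storedTranscript = ["user: hello", "assistant: hi there"];
const sketchClient = new SketchChatClient(() => [...storedTranscript]);
sketchClient.handleFinal({ state: "final", runId: "run-1" });
console.log(sketchClient.busy, sketchClient.rendered);
```

The sketch relies on the same assumption the commit message states: the transcript is already complete by the time the final event is delivered.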
Greptile Summary
This PR adds a message-less …

Confidence Score: 4/5
Safe to merge with one minor ordering concern worth addressing before landing. The fix is targeted and the regression test is solid. The only concern is src/gateway/server-methods/chat.ts: ordering of …

Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/gateway/server-methods/chat.ts
Line: 2667-2676
Comment:
**`broadcastChatFinal` fires before `emitUserTranscriptUpdate` resolves**
`void emitUserTranscriptUpdate()` is fire-and-forget, so `broadcastChatFinal` is dispatched synchronously after — potentially before the user's turn has been flushed to the transcript. The `!agentRunStarted` path at line 2505 properly `await`s before broadcasting.
In practice the promise is almost always resolved (it was already called in `onAgentRunStart` and image persistence finishes long before the agent run completes), but if `persistedImagesPromise` is still in-flight on a fast run, the client will reload history and may not see the user turn yet. Consider awaiting, or removing the now-redundant call since `onAgentRunStart` already enqueued it:
```suggestion
await emitUserTranscriptUpdate();
// Some harnesses emit live item/tool activity but do not mirror a
// terminal assistant/lifecycle event onto the gateway chat stream.
// The run still completed and the transcript has been updated, so
// send a message-less final event as a UI resync point.
broadcastChatFinal({
context,
runId: clientRunId,
sessionKey,
});
```
How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "patch(gateway): emit final resync event ..."
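The ordering concern in this comment can be reproduced in isolation. The sketch below reuses the names `emitUserTranscriptUpdate` and `broadcastChatFinal` from the review, but their bodies are hypothetical stand-ins that only append to a log; the single `await Promise.resolve()` is an assumed placeholder for pending image persistence.

```typescript
// Runs the two calls in either fire-and-forget or awaited form and
// returns the order in which their effects were observed.
async function runScenario(awaitUpdate: boolean): Promise<string[]> {
  const log: string[] = [];

  // Stand-in: the transcript write completes one microtask later.
  const emitUserTranscriptUpdate = async (): Promise<void> => {
    await Promise.resolve();
    log.push("transcript-updated");
  };

  // Stand-in: the broadcast is synchronous.
  const broadcastChatFinal = (): void => {
    log.push("final-broadcast");
  };

  if (awaitUpdate) {
    await emitUserTranscriptUpdate();
  } else {
    void emitUserTranscriptUpdate(); // fire-and-forget
  }
  broadcastChatFinal();

  // Let any still-pending transcript write settle before returning.
  await Promise.resolve();
  await Promise.resolve();
  return log;
}

runScenario(false).then((log) => console.log("void: ", log));
runScenario(true).then((log) => console.log("await:", log));
```

With `void`, the broadcast is logged before the transcript update; with `await`, the transcript update is logged first, which is the order the reviewer asks for.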
void emitUserTranscriptUpdate();
// Some harnesses emit live item/tool activity but do not mirror a
// terminal assistant/lifecycle event onto the gateway chat stream.
// The run still completed and the transcript has been updated, so
// send a message-less final event as a UI resync point.
broadcastChatFinal({
  context,
  runId: clientRunId,
  sessionKey,
});
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 634b197e41
broadcastChatFinal({
  context,
  runId: clientRunId,
  sessionKey,
});
Guard fallback final to avoid duplicate terminal chat events
This unconditional broadcastChatFinal(...) runs for every agentRunStarted flow, including runs that already emit a terminal chat state through the agent-event bridge, so one run can now produce two terminal chat events. In the Control UI, each state:"final" triggers a history reload (ui/src/ui/app-gateway.ts:472), so this adds an extra full reload per normal run. In harnesses where the real terminal event arrives later, the message-less fallback can finalize first and cause the later real final message to be ignored (src/tui/tui-event-handlers.ts:297-303). The fallback should be emitted only when no terminal chat event was observed for that run.
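One way to picture the guard this comment asks for is a per-run record of terminal events. The sketch below is an assumption-laden illustration: `TerminalEventTracker` and `emitFallbackFinalIfNeeded` are hypothetical names, not part of the OpenClaw gateway, and the broadcast is reduced to a callback.

```typescript
// Hypothetical tracker of which runs have already produced a terminal event.
class TerminalEventTracker {
  private readonly finalized = new Set<string>();

  // Records a terminal event; returns true only for the first one per run.
  markFinal(runId: string): boolean {
    if (this.finalized.has(runId)) {
      return false;
    }
    this.finalized.add(runId);
    return true;
  }
}

// Emits the message-less fallback only when no terminal event was seen.
function emitFallbackFinalIfNeeded(
  tracker: TerminalEventTracker,
  runId: string,
  broadcast: (runId: string) => void,
): boolean {
  if (!tracker.markFinal(runId)) {
    return false; // a real terminal event already closed this run
  }
  broadcast(runId);
  return true;
}

const tracker = new TerminalEventTracker();
const fallbackBroadcasts: string[] = [];
const record = (runId: string): void => {
  fallbackBroadcasts.push(runId);
};

// "run-a" finished through the normal bridge, so its fallback is skipped.
tracker.markFinal("run-a");
emitFallbackFinalIfNeeded(tracker, "run-a", record);

// "run-b" never produced a terminal event, so the fallback fires once.
emitFallbackFinalIfNeeded(tracker, "run-b", record);
emitFallbackFinalIfNeeded(tracker, "run-b", record);
console.log(fallbackBroadcasts);
```

This only covers runs whose real terminal event has already been observed; a real final that arrives after the fallback would still need separate handling.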
steipete left a comment
Codex review: this is the right symptom-level fix for #71183, but I would make the small ordering cleanup before landing.
The fallback chat.final resync is conservative and useful: Codex app-server runs can complete and persist the transcript without emitting the terminal lifecycle event that the Control UI waits on. The regression test covers the message-less final event.
One fix before merge: in the new branch, void emitUserTranscriptUpdate() is immediately followed by broadcastChatFinal. That can let the client reload before the user transcript update promise settles. Please either await emitUserTranscriptUpdate() before broadcastChatFinal, or remove the redundant call if the eager/onAgentRunStart path is the intended owner. After that, I would land it.
Thanks @lesaai. Codex review found the bug was real, but the best fix belongs in the Codex app-server harness rather than a broad message-less I couldn't push the rewritten fix back to this fork branch (
Contributor credit is preserved in the changelog and commit co-author trailer.
Fix live webchat finalization for Codex app-server runs by emitting standard assistant and lifecycle completion events on the global agent event bus, instead of relying on a message-less chat.final fallback. Replaces #70815. Closes #71183. Co-authored-by: Lēsa <260982214+lesaai@users.noreply.github.com>
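The approach described in this commit message, where the harness itself emits the standard completion events, might look roughly like the sketch below. Every name here (`AgentEvent`, `AgentEventBus`, `bridgeToChat`, `completeRun`) and the event fields are hypothetical; the actual OpenClaw event bus and bridge are not shown in this conversation.

```typescript
// Hypothetical event shapes for an agent event bus.
type AgentEvent =
  | { stream: "assistant"; runId: string; text: string }
  | { stream: "lifecycle"; runId: string; phase: "start" | "end" };

type AgentListener = (event: AgentEvent) => void;

// Minimal synchronous bus, for illustration only.
class AgentEventBus {
  private readonly listeners: AgentListener[] = [];

  subscribe(listener: AgentListener): void {
    this.listeners.push(listener);
  }

  emit(event: AgentEvent): void {
    for (const listener of this.listeners) {
      listener(event);
    }
  }
}

// Bridge: remembers the latest assistant text per run and sends one
// chat final, with the message attached, when the lifecycle ends.
function bridgeToChat(
  bus: AgentEventBus,
  sendFinal: (runId: string, message: string) => void,
): void {
  const latestText = new Map<string, string>();
  bus.subscribe((event) => {
    if (event.stream === "assistant") {
      latestText.set(event.runId, event.text);
    } else if (event.phase === "end") {
      sendFinal(event.runId, latestText.get(event.runId) ?? "");
      latestText.delete(event.runId);
    }
  });
}

// Harness side: on completion, emit the assistant event and lifecycle end.
function completeRun(bus: AgentEventBus, runId: string, finalText: string): void {
  bus.emit({ stream: "assistant", runId, text: finalText });
  bus.emit({ stream: "lifecycle", runId, phase: "end" });
}

const bus = new AgentEventBus();
const finals: Array<[string, string]> = [];
bridgeToChat(bus, (runId, message) => {
  finals.push([runId, message]);
});
completeRun(bus, "run-1", "done");
console.log(finals);
```

Under these assumptions the client receives a single final event that carries the message, so no message-less fallback is needed.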
Summary
On `codex/gpt-5.5` (and any harness using the native Codex app-server path), the TUI spins forever after submitting a prompt. Exit + reopen shows the response was actually produced and persisted in the transcript ... it's purely a live-stream finalization gap.

Root cause
The Codex app-server path records the final assistant answer and updates the transcript, but unlike the Codex CLI path, it does not emit a terminal assistant event plus `lifecycle:end` onto OpenClaw's agent event bus. `chat.send` is still waiting for that finalization signal, so the live WebSocket stream never closes out and the TUI never renders the final event.

Fix
Conservative fallback in `chat.send`: after the user-transcript update on an agent run, emit a message-less `chat.final` broadcast as a UI resync point. The transcript already contains the assistant message, so the client reloads history and goes idle instead of spinning.

Two files, +19 / -1 lines:
- `src/gateway/server-methods/chat.ts` ... emit the resync event
- `src/gateway/server-methods/chat.directive-tags.test.ts` ... regression assertion for the message-less final

What this does NOT try to do

Reproduction
- `codex/gpt-5.5` (or `codex/gpt-5.4`) as primary, `think: high`.

Validation
Result: 54/54 tests pass.
End-to-end: verified live on a reproducing install ... TUI now renders responses cleanly on `codex/gpt-5.5` without the exit+reopen dance.

Related

Credits
Patch authored primarily by OpenAI Codex CLI (GPT-5.5) during a live debugging session. Co-authored / verified by Parker Todd Brooks, Lēsa, and Claude Opus 4.7.