test: guard websocket stale final turn lineage#78147
Conversation
|
Codex review: needs real behavior proof before merge. Summary Reproducibility: unclear. The review failed before ClawSweeper could establish a reproduction path. Real behavior proof Next step before merge Review detailsBest possible solution: Retry the Codex review after fixing the execution failure. Do we have a high-confidence way to reproduce the issue? Unclear. The review failed before ClawSweeper could establish a reproduction path. Is this the best way to solve the issue? Unclear. Retry the review first so ClawSweeper can evaluate the actual issue and fix direction. What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against 3b626e4e36fe. |
|
Heads up — this isn't actually test-only. The PR's runtime change introduces a buffer-loss bug that the new test doesn't surface. The bug. In …then So if turn B has accumulated any deltas before a stale The added test ( Smallest fix — six lines deleted, no other changes: Delete the three Add the missing interleaved-delta regression test: Today this test fails on this PR head; after the six-line removal it passes. Plus: Alternative shape if you'd rather have cleaner history: split into a runtime-fix PR (clear-removal + interleaved-delta test) and keep this one as test-only against the post-fix tree. |
…inal events Stale response.completed / response.failed events arriving mid-turn were clearing outputItemPhaseById, outputTextByPart, and emittedTextByPart before returning, which silently dropped subsequent text deltas for the active turn (phase metadata gone) or duplicated already-emitted text (emitted-text tracker gone). 'Ignore stale' now leaves the active-turn state intact so live deltas continue streaming uninterrupted. Adds a regression test that emits deltas for turn B, sends a stale response.completed carrying turn-A metadata mid-stream, then asserts all turn-B text_delta events reach the consumer in order with no drops or duplicates. The existing stale-completed test (no interleaved deltas) still passes.
|
Thanks @100yenadmin. Closing this as superseded by #79726. This PR was plausible while OpenClaw owned the OpenAI Responses WebSocket event loop, but the better fix is to delete that owner boundary instead of hardening it. #79726 removes the custom That means the stale Proof in #79726:
#78055 is still referenced from #79726, but I am not treating this PR as the canonical issue fix because that issue also contains non-WS subagent announce/session-history contamination evidence. |
Summary
response.completedcarrying turn A metadata arrives before the real turn B completionresponse.completed/response.failedevents instead of finalizing the active turnResponseObject.metadataso tests and runtime can inspect echoed Responses metadataWhy
This ties directly to #78055 and the duplicate/stale final-answer after tool-chain report. It also overlaps the incremental / previous_response_id surfaces discussed around #76905, #76888, #77642, #78060, and the broader runtime/finalization issues in #76990 / #77445.
Verification
./node_modules/.bin/tsc --noEmit --pretty false --project tsconfig.core.projects.json✅vitest run src/agents/openai-ws-stream.test.ts -t "stale completed" --maxWorkers=1 --no-fileParallelism --reporter=dotstarted, the targeted test emitted a passing dot, but the local process was SIGKILLed afterward on this Mac with/System/Volumes/Dataat 100% / ~121MiB free. Treat full test run as not completed locally.Fixes #78055.
Refs #76905, #76888, #77642, #78060, #76990, #77445.