fix(codex): synthesize failed tool.result for orphan tool.call (#86808) #87228
Sanjays2402 wants to merge 1 commit
… no terminal result (openclaw#86808)

When the Codex app-server runtime drops a turn after persisting a `tool.call` (denied auto-approval, interrupted sandbox, crashed runtime), no matching `tool.result` was emitted into the mirrored transcript or the trajectory recorder. The downstream invariant 'every persisted tool.call has exactly one terminal tool.result' broke, the OpenClaw session resumed with an orphan tool call, and the next prompt could be rejected with SYSTEM_RUN_DENIED.

Track tool transcript call ids by name and trajectory call ids explicitly, and on `buildResult` synthesize a terminal `status: failed, reason: missing_tool_result` entry plus a matching mirrored toolResult message for every orphan id. Bubble the synthetic error into `promptError` so the attempt is reported as failed instead of silently swallowed.

Adds a regression test reproducing the orphan tool.call path from the issue report (commandExecution with no completion) and asserts the mirrored toolResult, synthesized trajectory event, and propagated promptError.
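The invariant named above can be checked mechanically. The following is a minimal sketch, not code from this PR: the `ToolEvent` shape and the `findInvariantViolations` helper are hypothetical names chosen for illustration, and the real trajectory types may differ.

```typescript
// Hypothetical event shape for illustration; the real trajectory types may differ.
type ToolEvent =
  | { type: "tool.call"; callId: string }
  | { type: "tool.result"; callId: string };

// Reports call ids that break "every tool.call has exactly one terminal tool.result".
function findInvariantViolations(events: ToolEvent[]): {
  orphans: string[];
  duplicates: string[];
} {
  const callIds = new Set<string>();
  const resultCounts = new Map<string, number>();

  for (const event of events) {
    if (event.type === "tool.call") {
      callIds.add(event.callId);
    } else {
      resultCounts.set(event.callId, (resultCounts.get(event.callId) ?? 0) + 1);
    }
  }

  const ids = Array.from(callIds);
  return {
    // A call with no result at all is an orphan.
    orphans: ids.filter((id) => !resultCounts.has(id)),
    // A call with more than one result also violates the invariant.
    duplicates: ids.filter((id) => (resultCounts.get(id) ?? 0) > 1),
  };
}
```

A transcript that ends right after a `tool.call` would show up in `orphans`, which is the state this change is meant to prevent.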
Codex review: needs real behavior proof before merge.
Reviewed May 29, 2026, 1:12 AM ET / 05:12 UTC.

Summary
PR surface: Source +50, Tests +75. Total +125 across 2 files.
Reproducibility: unclear. The review failed before ClawSweeper could establish a reproduction path.
Review metrics: none identified.

Merge readiness
Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Review details
Best possible solution: Retry the Codex review after fixing the execution failure.
Do we have a high-confidence way to reproduce the issue? Unclear. The review failed before ClawSweeper could establish a reproduction path.
Is this the best way to solve the issue? Unclear. Retry the review first so ClawSweeper can evaluate the actual issue and fix direction.
AGENTS.md: unclear because the file could not be read completely.
Codex review notes: model gpt-5.5, reasoning high; reviewed against 1188aa3b81ef.
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.
Closes #86808.
Problem
When a Codex app-server turn ended after persisting a `tool.call` but before the matching `tool.result` (denied auto-approval, interrupted sandbox, runtime crash), the projector mirrored the call into the OpenClaw transcript and trajectory but never emitted a terminal result. The downstream invariant "every persisted tool.call has exactly one terminal tool.result" broke; the session resumed with an orphan tool call and the next prompt was rejected with `SYSTEM_RUN_DENIED`.

Fix
`CodexAppServerEventProjector` now:
- Tracks tool transcript call ids by name (`toolTranscriptNamesById`).
- Tracks trajectory call and result ids explicitly (`toolTrajectoryCallIds` / `toolTrajectoryResultIds`).
- On `buildResult`, synthesizes a `status: "failed", reason: "missing_tool_result"` trajectory event and a mirrored `toolResult` message for every call id without a terminal result.
- Bubbles the synthetic `SYSTEM_RUN_DENIED` error into `promptError` so the attempt is classified as failed instead of silently swallowed.

Regression test
`extensions/codex/src/app-server/event-projector.test.ts` (fails closed and synthesizes a result when a native tool call never completes):
- Emits `item/started` for a `commandExecution` (no completion event).
- Emits `turn/completed` and an `agentMessage`.
- Asserts `promptError`, the synthesized `toolResult` message shape (`toolCallId`, `toolName=bash`, `isError=true`, error text), and the synthesized trajectory `tool.result` event.

Verified the test fails on `main` (`expected 'null' to contain 'SYSTEM_RUN_DENIED'`) and passes with the fix.

Notes
- No behavior change when every `tool.call` already has a result (synthesize loop is a no-op).
- The synthesized event uses `status: "failed"`, `result: { status: "failed", reason: "missing_tool_result" }`, mirroring how the runtime would describe a denied tool today.
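The tracking-and-synthesis approach described in this PR can be sketched roughly as follows. This is an illustration under stated assumptions, not the projector's actual code: `OrphanTracker`, its methods, the `TrajectoryEvent` and `ToolResultMessage` shapes, and the exact error text are all hypothetical. Only the `status: "failed"` / `reason: "missing_tool_result"` payload, the `toolResult` fields, and the presence of `SYSTEM_RUN_DENIED` in `promptError` come from the description.

```typescript
// Hypothetical shapes for illustration only; not the projector's real types.
interface TrajectoryEvent {
  type: "tool.result";
  callId: string;
  status: "failed";
  result: { status: "failed"; reason: "missing_tool_result" };
}

interface ToolResultMessage {
  role: "toolResult";
  toolCallId: string;
  toolName: string;
  isError: boolean;
  text: string;
}

class OrphanTracker {
  // Call id -> tool name, recorded when a tool.call is persisted.
  private namesById = new Map<string, string>();
  // Call ids that already have a terminal result.
  private resultIds = new Set<string>();

  recordCall(callId: string, toolName: string): void {
    this.namesById.set(callId, toolName);
  }

  recordResult(callId: string): void {
    this.resultIds.add(callId);
  }

  // Synthesizes one failed result per call id that never got a terminal result.
  finalize(): {
    events: TrajectoryEvent[];
    messages: ToolResultMessage[];
    promptError: string | null;
  } {
    const orphanIds = Array.from(this.namesById.keys()).filter(
      (id) => !this.resultIds.has(id),
    );

    const events: TrajectoryEvent[] = orphanIds.map((callId) => ({
      type: "tool.result",
      callId,
      status: "failed",
      result: { status: "failed", reason: "missing_tool_result" },
    }));

    const messages: ToolResultMessage[] = orphanIds.map((callId) => ({
      role: "toolResult",
      toolCallId: callId,
      toolName: this.namesById.get(callId) ?? "unknown",
      isError: true,
      // Assumed wording; the real error text is not shown in this PR page.
      text: `Tool call ${callId} ended without a result (missing_tool_result).`,
    }));

    // Mark synthesized ids as resolved so a second finalize() is a no-op.
    orphanIds.forEach((id) => this.resultIds.add(id));

    const promptError =
      orphanIds.length > 0
        ? `SYSTEM_RUN_DENIED: missing tool.result for ${orphanIds.join(", ")}`
        : null;

    return { events, messages, promptError };
  }
}
```

When every recorded call already has a result, `finalize()` returns empty arrays and a `null` error, which matches the no-op behavior noted above.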