fix(codex): prevent false completion stalls during native streams#87781
Conversation
|
Codex review: needs real behavior proof before merge. Reviewed May 29, 2026, 6:37 PM ET / 22:37 UTC. Summary PR surface: Source +154, Tests +481, Docs +10. Total +645 across 21 files. Reproducibility: yes. The related issue and PR discussion give a high-confidence log/live-flow path: Codex-backed Telegram turns reach raw assistant or item completion, then idle waiting for Review metrics: 1 noteworthy metric.
Merge readiness Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch. Rank-up moves:
Proof guidance:
Mantis proof suggestion Risk before merge
Maintainer options:
Next step before merge
Security Review detailsBest possible solution: Land only after current head has redacted live Telegram/gateway proof and maintainers explicitly accept the longer timeout default plus retry/session cleanup semantics; keep narrower unsafe-stall diagnostics in #87793 if still needed. Do we have a high-confidence way to reproduce the issue? Yes. The related issue and PR discussion give a high-confidence log/live-flow path: Codex-backed Telegram turns reach raw assistant or item completion, then idle waiting for Is this the best way to solve the issue? Unclear until current-head proof is posted. The implementation follows the right Codex app-server boundary and has focused tests, but the changed default, retry, and session cleanup semantics need maintainer acceptance before this is the best merge path. AGENTS.md: found and applied where relevant. Codex review notes: model gpt-5.5, reasoning high; reviewed against 63a06e312dfe. Label changesLabel justifications:
Evidence reviewedPR surface: Source +154, Tests +481, Docs +10. Total +645 across 21 files. View PR surface stats
Acceptance criteria:
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
|
2575552 to
a59d953
Compare
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a59d953faf
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
Post-fix live proof has been added to the PR body. Redacted summary:
Exact chat/session/thread/turn/run/tool-call IDs, auth profile details, local usernames, and local paths are intentionally omitted. |
|
Post-fix recurrence update, redacted: After rebuild/restart on
This confirms the current PR scope: it keeps completion semantics safe and handles replay-safe recovery, but it does not solve non-replay-safe post-tool assistant-progress stalls with side effects. Those need a separate recovery strategy or an upstream app-server fix. Exact chat/session/thread/turn/run/tool-call IDs, usernames, and local paths omitted. |
|
Implemented the next fix after the post-restart recurrence. The recurrence was a side-effecting/progress timeout, so auto-replay remains intentionally blocked. New commit
Validation run:
Remaining live validation: rebuild/restart on head |
|
@clawsweeper re-review |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
|
Codex Review: Didn't find any major issues. Breezy! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
Codex Review: Didn't find any major issues. 👍 ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
@clawsweeper hatch |
|
@clawsweeper review |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
|
Codex Review: Didn't find any major issues. Swish! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
Dependency graph guard clearedThis PR no longer has blocked dependency graph changes. A future dependency graph change requires a fresh
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c005ba9b70
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| shouldSurfaceCodexCompletionTimeout = | ||
| attempt.codexAppServerFailure?.kind === "turn_completion_idle_timeout" && | ||
| attempt.timedOut; |
There was a problem hiding this comment.
Gate timeout surfacing on completion watchdogs
When the Codex app-server watch fires for the progress or terminal watchdog, run-attempt.ts still reports the broad kind: "turn_completion_idle_timeout" and only distinguishes the actual watchdog in turnWatchTimeoutKind. This condition ignores that narrower field, so a turn_progress_idle_timeout or turn_terminal_idle_timeout now bypasses the normal prompt-error/failover path and is returned as the recoverable completion-timeout payload even though the retry resolver explicitly rejects non-completion watchdogs. Please also require turnWatchTimeoutKind === "completion" here (or handle the legacy-absent case deliberately).
Useful? React with 👍 / 👎.
|
Behavior addressed: Codex app-server replay-safe completion-idle stalls retry once; non-completion turn-watch timeouts no longer replay; retry-exhausted/non-stdio completion timeouts surface user timeout payloads; native response deltas no longer disarm the post-tool completion guard incorrectly.
|
Summary
turn/completedterminal guard; progress text is still not treated as a final answer.response.custom_tool_call_input.delta, as active turn activity and attempt progress while a turn is already open.response.*.deltanotifications so they only refresh a turn when the shared app-server client has one active lease; concurrent leased turns no longer keep each other's watchdogs alive.5 * 60_000/300000ms for app-server/native paths that do not forward stream deltas, and align runtime docs plus plugin metadata.turn_completion_idle_timeoutnow falls through to the normal timeout payload path instead of rejecting the embedded run.Why this shape
The failure is not that raw assistant progress needs to become deliverable final text. The failure is that native Codex can spend a long time actively streaming a large tool input/patch after a raw assistant progress item has completed, before any completed tool-call notification exists. Without counting those
response.*.deltaevents as activity, OpenClaw can falsely fireturn.completion_idle_timeoutwhile the model is still working.This keeps the lifecycle semantics strict: final delivery still requires
turn/completed; active native stream deltas only keep the watchdog from killing a live turn. The latest review fixes also avoid cross-talk for unscoped deltas and keep retry-exhausted completion-idle failures on the timeout-result path.Real behavior proof
Codex stopped before confirming the turn was complete. Some work may already have been performed; verify the current state before retrying.2aa61b24eais rebased ontoorigin/main040eba1cdc; the final commit on top is test-only stabilization after the runtime proof.turn.completion_idle_timeout/Codex stopped before confirming the turn was completeerror after the gateway was built and restarted from the PR branch.Review feedback addressed
Latest head:
2aa61b24eaonorigin/mainbase040eba1cdc.postToolRawAssistantCompletionIdleTimeoutMsdefaults to300000ms.turn_completion_idle_timeoutfailures now surface the timeout payload instead of throwing the prompt error after the one retry has already been used.Tests / validation
git diff --check origin/main...HEADpassed on head2aa61b24ea.oxfmt --checkpassed on changed TypeScript/plugin files on head2aa61b24ea.2aa61b24ea:6files /82tests.Focused Vitest command:
Related issue