fix(codex): make post-tool raw assistant timeout configurable#84135
fix(codex): make post-tool raw assistant timeout configurable#84135rozmiarD wants to merge 2 commits into
Conversation
|
Codex review: needs maintainer review before merge. Workflow note: Future ClawSweeper reviews update this same comment in place. How this review workflow works
Summary Reproducibility: yes. for source-level behavior, though I did not run mutating tests in this read-only review. Current main uses PR rating Rank-up moves:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. Real behavior proof Risk before merge
Maintainer options:
Next step before merge Security Review detailsBest possible solution: Ship the compatible, config-gated timeout if maintainers accept the opt-in delay tradeoff, and leave any default-budget semantic change to #84137. Do we have a high-confidence way to reproduce the issue? Yes for source-level behavior, though I did not run mutating tests in this read-only review. Current main uses Is this the best way to solve the issue? Yes for the conservative path: the PR preserves current defaults and scopes the change to an opt-in config value. The broader default-budget semantics are better handled in the linked follow-up issue rather than folded into this PR. Label changes:
Label justifications:
What I checked:
Likely related people:
Codex review notes: model gpt-5.5, reasoning high; reviewed against 5d775122c1f6. |
|
Opened follow-up design issue #84137 for the semantic/default-budget question that was intentionally left out of this conservative PR. This PR keeps the current default behavior and adds an operator-configurable post-tool budget. The follow-up issue tracks the higher-risk alternative: changing the default semantics so post-tool raw assistant completion/progress gets a distinct longer budget by default while preserving short pre-tool final assistant release. |
|
Addressed the docs alignment finding in 9d40415. The post-tool raw assistant paragraph now documents that the turn still waits for I also updated the PR body proof section to remove the stale local @clawsweeper re-review |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
9d40415 to
9a410e3
Compare
|
Updated the PR to exact head 9a410e3 rebased on upstream main 5d77512 and replaced the old proof section with Crabbox static-SSH runtime output. Runtime proof exercised the production OpenClaw Codex app-server runCodexAppServerAttempt path on a Linux lab target. Observed result: post-tool raw assistant progress did not settle after the 5ms assistant-release window, then used the configured 100ms post-tool completion guard with lastActivityReason=notification:rawResponseItem/completed and interrupted the stuck turn through turn/interrupt. Focused regression also passed locally and through Crabbox: 2 files passed, 46 tests passed, 183 skipped. @clawsweeper re-review |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
|
ClawSweeper PR egg ✨ Hatched: 🥚 common Moonlit Proofling Hatch commandComment Hatchability rules:
Rarity: 🥚 common. What is this egg doing here?
|
|
Maintainer stack note: this PR should stay narrow. Relationship to the current Codex app-server stack:
Best maintainer framing: merge this only as the conservative config knob. Keep default semantics, subagent handoff, and long-reply truncation as related but separate follow-up work. |
|
maintainer update: I created #84974 as the replacement PR because this source branch has No functional disagreement with the patch here. #84974 keeps the contributor authorship, adds the docs/changelog alignment, and now has focused Vitest + Crabbox + Blacksmith Testbox changed-gate proof:
I would leave this PR as the contributor source/credit trail and use #84974 for the mergeable maintainer branch. |
|
Maintainer closeout: closing this source PR as superseded by #84974, now squash-merged at 60d200f. This is not a functional rejection of the patch. #84974 was the writable maintainer replacement because this branch could not be updated directly; it kept the same conservative timeout behavior, fixed the post-release changelog placement under Unreleased, rebased onto current main, and passed the focused Codex app-server checks plus required GitHub checks. Keeping #84137 open for the separate default-semantics / subagent handoff decision. |
Motivation
Codex app-server currently uses the assistant completion idle timeout for the post-tool raw assistant completion guard. That keeps stuck post-tool turns fail-fast, but couples two different behaviors: final assistant release and post-tool completion wait. Heavy or trusted local workloads may need a longer post-tool wait without weakening final assistant release.
Root cause / diagnosis
Pre-tool raw assistant completion can represent final assistant output and should still release the session after the assistant idle budget.
Post-tool raw assistant completion or progress happens after a tool handoff and can represent post-tool synthesis or terminal-event wait.
Current code uses
turnAssistantCompletionIdleTimeoutMsfor the post-tool guard. That timeout defaults to 10s. Existing tests show this was intentional fail-fast behavior, so this PR adds a separate configurable budget rather than globally changing assistant release semantics.Implementation
Added
appServer.postToolRawAssistantCompletionIdleTimeoutMs.Added runtime option and config resolution for the post-tool raw assistant completion guard.
Post-tool raw assistant guard now uses the dedicated timeout. If unset, behavior remains compatible by falling back to the assistant completion idle timeout.
Pre-tool raw assistant release remains unchanged.
Updated both Codex harness docs pages so the post-tool raw assistant paragraph mentions the dedicated completion-idle guard instead of only
turn/completedor the terminal watchdog.What did not change
turn/completedhandling.requestTimeoutMs.Tests
git diff --check origin/main..HEADpassed after rebasing ontoorigin/mainat5d775122c1f6.CI=1 timeout 300s node scripts/run-vitest.mjs run --config test/vitest/vitest.extensions.config.ts extensions/codex/src/app-server/config.test.ts extensions/codex/src/app-server/run-attempt.test.ts -t "post-tool raw assistant|Codex app-server config|raw assistant progress".Regression coverage
Existing pre-tool raw assistant release behavior remains covered.
New coverage proves post-tool raw assistant completion can use a longer dedicated timeout than assistant release.
Config coverage proves the
appServerfield is accepted and resolved, and unknown fields remain rejected.Existing post-tool timeout coverage remains valid; the default behavior still falls back to the assistant completion idle timeout.
Risk
Longer post-tool timeout may delay stuck-turn detection when operators configure it too high.
Too-short timeout can interrupt legitimate post-tool synthesis.
Keeping a separate config avoids weakening final assistant release and avoids global timeout changes.
Proof / evidence
git diff --stat origin/main..HEADafter the rebase:Before: post-tool raw assistant completion armed
turn.completion_idle_timeoutwithturnAssistantCompletionIdleTimeoutMs, so the guard was tied to the final assistant release budget.After: post-tool raw assistant completion arms the same completion-idle guard with
postToolRawAssistantCompletionIdleTimeoutMs, falling back toturnAssistantCompletionIdleTimeoutMswhen unset. Pre-tool raw assistant release still uses the assistant release path.Docs now state that post-tool raw assistant progress keeps waiting for
turn/completedwhile a completion-idle guard stays armed, usingappServer.postToolRawAssistantCompletionIdleTimeoutMswhen configured and falling back to the assistant completion idle timeout otherwise.Real behavior proof
Behavior addressed: Post-tool raw assistant completion after a tool handoff uses the dedicated completion-idle timeout when configured instead of finishing on the shorter final assistant release timeout.
Real environment tested: OpenClaw PR head
9a410e3a176crebased on upstreammain5d775122c1f6, executed through Crabbox static SSH on a Linux lab target. The runtime entrypoint was the productionrunCodexAppServerAttemptCodex app-server path from this PR head.Exact steps or command run after this patch:
crabbox run --provider ssh --target linux --static-host <redacted> --static-user <redacted> --static-work-root <redacted> --shell -- 'timeout 120s node --import tsx - <<EOF ... EOF'.Evidence after fix: Terminal output from the Crabbox run:
Observed result after fix: The turn did not settle after the 5ms assistant-release budget (
settledAfter20ms: false). After the post-tool raw assistant event, the completion watchdog used the configured 100ms budget (warningTimeoutMs: 100) and retainedlastActivityReason: "notification:rawResponseItem/completed"; the stuck turn then aborted throughturn/interruptwith the existing completion-idle error.What was not tested: A live external model request was outside this proof. The proof exercised the OpenClaw app-server runtime path that owns this timeout decision, plus focused local and Crabbox regression runs for the changed files.
AI assistance
This PR was AI-assisted. I reviewed the diff, tests, and behavior before submitting.