Fix Codex native thread overflow rotation#88207
Conversation
|
Codex review: needs maintainer review before merge. Reviewed May 29, 2026, 10:49 PM ET / 02:49 UTC. Summary PR surface: Source +169, Tests +379. Total +548 across 7 files. Reproducibility: yes. Current main source skips the native token-pressure scan under default config, and the PR body provides observed logs plus a seeded installed-package repro for the overflow state. Review metrics: 1 noteworthy metric.
Merge readiness Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch. Rank-up moves:
Risk before merge
Maintainer options:
Next step before merge
Security Review detailsBest possible solution: Land the scoped Codex fix after maintainers accept the default fresh-thread rotation tradeoff and focused Codex checks are green. Do we have a high-confidence way to reproduce the issue? Yes. Current main source skips the native token-pressure scan under default config, and the PR body provides observed logs plus a seeded installed-package repro for the overflow state. Is this the best way to solve the issue? Yes. The patch keeps the fix in the Codex app-server startup/run-attempt path, preserves byte truncation as opt-in, and adds focused regression tests; the main remaining question is accepting the intentional default fresh-thread behavior. AGENTS.md: found and applied where relevant. Codex review notes: model gpt-5.5, reasoning high; reviewed against e9dee8dfe158. Label changesLabel changes:
Label justifications:
Evidence reviewedPR surface: Source +169, Tests +379. Total +548 across 7 files. View PR surface stats
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
|
ac61677 to
444d102
Compare
2a87c21 to
466bfbe
Compare
Summary
truncateAfterCompactionThis PR is intentionally scoped to the Codex terminal-overflow path:
Behavioral Proof
Before, a budget-triggered Codex app-server session entered the bad path because OpenClaw skipped non-manual Codex app-server compaction and then failed the native compaction fallback:
The matching runtime state had OpenClaw session totals below the budget but Codex native rollout pressure already near the native window:
After installing this branch locally as the packaged OpenClaw build from this PR and restarting the managed Gateway onto the same build, I recreated that state with redacted synthetic identifiers,
trigger=budget, modelopenai-codex/gpt-5.5, default compaction config, stale Codex binding, and the native rollout token snapshot above.Verification
CI=true pnpm buildpnpm pack --pack-destination <temp-dir>pnpm add -g <packed-openclaw-tarball>openclaw --version-> packaged OpenClaw build from this PRopenclaw gateway install --force --port <local-port> --wrapper <local-openclaw-wrapper>openclaw gateway status --deep-> CLI and Gateway both on the packaged PR buildnode --no-maglev <vitest> run --config test/vitest/vitest.extension-codex.config.ts extensions/codex/src/app-server/startup-binding.test.ts --reporter=verbose-> 12 passednode --no-maglev <vitest> run --config test/vitest/vitest.extension-codex.config.ts extensions/codex/src/app-server/run-attempt.test.ts -t "starts a fresh Codex thread before turn/start when the next prompt would exhaust native headroom" --reporter=verbose-> 1 passednode --no-maglev <vitest> run --config test/vitest/vitest.extension-codex.config.ts extensions/codex/src/app-server/run-attempt.context-engine.test.ts -t "starts a fresh thread instead of resuming a token-pressured thread-bootstrap binding|resumes a matching thread-bootstrap binding even when the bootstrap turn exceeded the opt-in native byte guard" --reporter=verbose-> 2 passedgit diff --checkReal behavior proof
Behavior addressed: Budget-triggered Codex app-server sessions no longer resume a native Codex thread whose persisted rollout token usage is already too close to the native context window; OpenClaw starts a fresh Codex thread before
turn/startinstead of entering the previous overflow/failed-compaction state.Real environment tested: Local packaged OpenClaw install from this PR, with the managed Gateway reinstalled/restarted onto the same package on a local loopback port.
Exact steps or command run after this patch: Built and packed the branch, installed the tarball globally, reinstalled the Gateway LaunchAgent using a local
openclawwrapper, then executed an installed-package run-attempt repro with redacted session identifiers, budget trigger, default compaction config, stale Codex binding,sessionTotalTokens=87792,contextTokens=272000, native rollouttotal_tokens=241198, andmodel_context_window=258400.Evidence after fix: Redacted terminal transcript from the installed packaged build:
Observed result after fix: Copied live output showed
thread/startbeforeturn/start, nothread/resumefor the oversized stale thread, and a rewritten binding value ofthread-fresh-before-overflow.What was not tested: A live OpenAI Codex model call was not made; the proof uses the installed OpenClaw package and a local app-server harness seeded with the persisted session/rollout token state from the observed failure.