Fix context-engine compaction ownership for Codex sessions #91590
Codex review: needs maintainer review before merge.
Reviewed June 9, 2026, 10:14 PM ET / 02:14 UTC.
Summary
PR surface: Source +520, Tests +819. Total +1339 across 14 files.
Reproducibility: yes, by source inspection: current main runs Codex native harness compaction before the owning context engine, and the Codex native path skips budget triggers, matching the PR body's live current-main failure proof.
Review metrics: 1 noteworthy metric.
Merge readiness
Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
Rank-up moves:
Risk before merge
Maintainer options:
Next step before merge
Security
Review details
Best possible solution: Land this only after a Codex/session-state owner accepts the binding/projection sequencing and keep the remaining symptoms tracked in #90496.
Do we have a high-confidence way to reproduce the issue? Yes, by source inspection: current main runs Codex native harness compaction before the owning context engine and the Codex native path skips budget triggers, matching the PR body's live current-main failure proof.
Is this the best way to solve the issue? Yes, likely: keeping the context engine primary and making Codex native compaction a private bounded follow-up is the narrowest owner-boundary fix I found, with maintainer review still needed for session-state sequencing.
AGENTS.md: found and applied where relevant.
Codex review notes: model gpt-5.5, reasoning high; reviewed against 9a1f2022b127.
Label changes
Label justifications:
Evidence reviewed
PR surface: Source +520, Tests +819. Total +1339 across 14 files.
Acceptance criteria:
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.
How this review workflow works
@clawsweeper re-review I updated the PR body with final-head proof on
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
Follow-up hardening pushed in
What changed:
Verification after this commit:
…91590)
* fix(agents): keep context-engine compaction primary
* fix(codex): request native compaction after context engines
* test: cover 90496 compaction and reset edge cases
* fix(codex): guard secondary native compaction binding
* fix(codex): keep native compaction hint internal
* fix(codex): wait for active native turns before resume
…26.6.6) (#1040) This PR contains the following updates: | Package | Update | Change | |---|---|---| | [ghcr.io/openclaw/openclaw](https://openclaw.ai) ([source](https://github.com/openclaw/openclaw)) | patch | `2026.6.5` → `2026.6.6` | --- ### Release Notes <details> <summary>openclaw/openclaw (ghcr.io/openclaw/openclaw)</summary> ### [`v2026.6.6`](https://github.com/openclaw/openclaw/blob/HEAD/CHANGELOG.md#202666) [Compare Source](openclaw/openclaw@v2026.6.5...v2026.6.6) ##### Highlights - Security boundaries are substantially tighter across transcripts, sandbox binds, host environment inheritance, MCP stdio, Codex HTTP access, native search policy, elevated sender checks, deleted-agent ACP bypasses, loopback tools, Discord moderation, and Teams group actions; exec approvals now fail closed on timeout. ([#​91529](openclaw/openclaw#91529), [#​91618](openclaw/openclaw#91618), [#​91615](openclaw/openclaw#91615), [#​91619](openclaw/openclaw#91619), [#​91741](openclaw/openclaw#91741), [#​91745](openclaw/openclaw#91745), [#​91746](openclaw/openclaw#91746), [#​91748](openclaw/openclaw#91748), [#​91749](openclaw/openclaw#91749), [#​91750](openclaw/openclaw#91750), [#​91751](openclaw/openclaw#91751), [#​91752](openclaw/openclaw#91752), [#​91763](openclaw/openclaw#91763), [#​89938](openclaw/openclaw#89938)) Thanks [@​joshavant](https://github.com/joshavant), [@​pgondhi987](https://github.com/pgondhi987), [@​mmaps](https://github.com/mmaps), [@​eleqtrizit](https://github.com/eleqtrizit), [@​shakkernerd](https://github.com/shakkernerd), and [@​drobison00](https://github.com/drobison00). - Telegram delivery is safer and more coherent: account-scoped topics route to the right agent, streamed text survives tool calls, `/compact` works on generic ingress, callback handling uses concrete APIs, draft chunking is shared, durable dispatch dedupe moved into the SDK, and unauthorized DM text stays out of cache and prompt context. 
([#​91189](openclaw/openclaw#91189), [#​88682](openclaw/openclaw#88682), [#​89588](openclaw/openclaw#89588), [#​90212](openclaw/openclaw#90212), [#​91876](openclaw/openclaw#91876), [#​91874](openclaw/openclaw#91874), [#​91904](openclaw/openclaw#91904), [#​91478](openclaw/openclaw#91478), [#​91915](openclaw/openclaw#91915)) Thanks [@​codysai001](https://github.com/codysai001), [@​alexzhu0](https://github.com/alexzhu0), [@​joelnishanth](https://github.com/joelnishanth), [@​snowzlm](https://github.com/snowzlm), [@​obviyus](https://github.com/obviyus), and [@​sallyom](https://github.com/sallyom). - iMessage recovery and delivery now cover always-on inbound restart, durable echo markers, block streaming, idle approval discovery, hardened outbound transport, and actionable inbound startup diagnostics. ([#​91335](openclaw/openclaw#91335), [#​91449](openclaw/openclaw#91449), [#​88969](openclaw/openclaw#88969), [#​88530](openclaw/openclaw#88530), [#​91783](openclaw/openclaw#91783), [#​91785](openclaw/openclaw#91785)) Thanks [@​omarshahine](https://github.com/omarshahine), [@​jmissig](https://github.com/jmissig), and [@​colmbrogan](https://github.com/colmbrogan). - Browser and MCP connectivity gained existing-session CDP support, discovered WebSocket validation, default-profile `cdpUrl` handling, safer browser-output boundaries, Streamable HTTP loopback transport, corrected OAuth/SSE authorization handling, and broader schema compatibility. ([#​91422](openclaw/openclaw#91422), [#​89851](openclaw/openclaw#89851), [#​91736](openclaw/openclaw#91736), [#​91747](openclaw/openclaw#91747), [#​91451](openclaw/openclaw#91451), [#​80143](openclaw/openclaw#80143)) Thanks [@​pgondhi987](https://github.com/pgondhi987), [@​anagnorisis2peripeteia](https://github.com/anagnorisis2peripeteia), [@​lifuyue](https://github.com/lifuyue), [@​eleqtrizit](https://github.com/eleqtrizit), [@​LiuwqGit](https://github.com/LiuwqGit), and [@​HemantSudarshan](https://github.com/HemantSudarshan). 
- Control UI startup and first-reply latency are lower through cached model metadata, removal of the startup catalog wait, lazy slash-command loading, and first-event tracing with slow-reply diagnostics. ([#​91531](openclaw/openclaw#91531), [#​91538](openclaw/openclaw#91538), [#​91568](openclaw/openclaw#91568), [#​91583](openclaw/openclaw#91583), [#​91598](openclaw/openclaw#91598)) - Provider support expands with OpenRouter OAuth onboarding and Claude Fable 5 adaptive thinking, while Codex sessions keep correct compaction ownership, local models skip guardian review, dynamic tool progress normalizes cleanly, and Gemma 4 reasoning replay is preserved. ([#​91830](openclaw/openclaw#91830), [#​91882](openclaw/openclaw#91882), [#​91590](openclaw/openclaw#91590), [#​88630](openclaw/openclaw#88630), [#​88768](openclaw/openclaw#88768), [#​91696](openclaw/openclaw#91696)) Thanks [@​Patrick-Erichsen](https://github.com/Patrick-Erichsen), [@​joshavant](https://github.com/joshavant), [@​bdjben](https://github.com/bdjben), and [@​Coder-Wangyankun](https://github.com/Coder-Wangyankun). ##### Changes - CLI progress: emit Claude CLI commentary progress events and bridge inter-tool commentary into channel progress without exposing internal protocol scaffolding. ([#​89834](openclaw/openclaw#89834), [#​90883](openclaw/openclaw#90883)) Thanks [@​anagnorisis2peripeteia](https://github.com/anagnorisis2peripeteia). - Observability: allow trusted diagnostics channels to capture tool input/output content, add first-assistant-event traces, and warn on slow initial replies. ([#​91256](openclaw/openclaw#91256), [#​91568](openclaw/openclaw#91568), [#​91583](openclaw/openclaw#91583)) Thanks [@​amknight](https://github.com/amknight). - Plugins/ClawHub: dogfood reusable package publishing, let dry runs skip publish approval, allow declared installed trusted hooks, report managed plugin version drift, and warn instead of failing on retired Skill Workshop configuration. 
([#​91574](openclaw/openclaw#91574), [#​91591](openclaw/openclaw#91591), [#​90004](openclaw/openclaw#90004), [#​90927](openclaw/openclaw#90927), [#​90838](openclaw/openclaw#90838)) Thanks [@​Patrick-Erichsen](https://github.com/Patrick-Erichsen), [@​brokemac79](https://github.com/brokemac79), and [@​lonexreb](https://github.com/lonexreb). - Memory/providers: move the local llama.cpp runtime into its provider plugin, batch embeddings across files, persist the agent model catalog cache, and keep QMD JSON search one-shot while filtering stale REM recall previews. ([#​91324](openclaw/openclaw#91324), [#​89138](openclaw/openclaw#89138), [#​90457](openclaw/openclaw#90457), [#​91837](openclaw/openclaw#91837), [#​91851](openclaw/openclaw#91851)) Thanks [@​osolmaz](https://github.com/osolmaz), [@​mushuiyu886](https://github.com/mushuiyu886), [@​ai-hpc](https://github.com/ai-hpc), and [@​TurboTheTurtle](https://github.com/TurboTheTurtle). - Channels/mobile: add the QQBot group mention toggle, improve iPad and iPhone control surfaces, and expose the active connection host in the TUI footer. ([#​91423](openclaw/openclaw#91423), [#​91557](openclaw/openclaw#91557), [#​89909](openclaw/openclaw#89909)) Thanks [@​cxyhhhhh](https://github.com/cxyhhhhh), [@​Solvely-Colin](https://github.com/Solvely-Colin), and [@​baskduf](https://github.com/baskduf). - Performance: prewarm TUI runtime plugins, deduplicate plugin auto-enable fanout, trim dense text-delta snapshots, and reuse prepared startup model metadata. ([#​90782](openclaw/openclaw#90782), [#​89978](openclaw/openclaw#89978), [#​91580](openclaw/openclaw#91580), [#​91531](openclaw/openclaw#91531)) Thanks [@​RomneyDa](https://github.com/RomneyDa) and [@​ai-hpc](https://github.com/ai-hpc). 
##### Fixes - Agent/session recovery: drop stale approval follow-ups after session rebind, remove drained reply-queue items by identity, recover stale main and visible replies, preserve Codex context-engine compaction ownership, lower the default compaction timeout to 180 seconds while respecting explicit configuration, and keep provider-failure terminal lifecycle state correct. ([#​85679](openclaw/openclaw#85679), [#​91450](openclaw/openclaw#91450), [#​91566](openclaw/openclaw#91566), [#​91840](openclaw/openclaw#91840), [#​91590](openclaw/openclaw#91590), [#​91361](openclaw/openclaw#91361), [#​91895](openclaw/openclaw#91895)) Thanks [@​openperf](https://github.com/openperf), [@​yetval](https://github.com/yetval), [@​joshavant](https://github.com/joshavant), [@​wangmiao0668000666](https://github.com/wangmiao0668000666), and [@​TurboTheTurtle](https://github.com/TurboTheTurtle). - User-visible content boundaries: suppress Codex/Harmony protocol artifacts, neutralize browser and LanceDB memory media directives, redact transcript images, and preserve native `/compact` replies through source suppression. ([#​89151](openclaw/openclaw#89151), [#​91422](openclaw/openclaw#91422), [#​91425](openclaw/openclaw#91425), [#​91529](openclaw/openclaw#91529), [#​90212](openclaw/openclaw#90212)) Thanks [@​joelnishanth](https://github.com/joelnishanth), [@​pgondhi987](https://github.com/pgondhi987), [@​joshavant](https://github.com/joshavant), and [@​snowzlm](https://github.com/snowzlm). - Channel delivery: keep WhatsApp captured replies attached to the successor controller after restart, retry Feishu rate limits, preserve Mattermost thread replies, canonicalize LINE webhook paths, restore Discord reply hydration and runtime timeout exports, and show OpenAI Realtime WebRTC assistant transcripts. 
([#​85823](openclaw/openclaw#85823), [#​89659](openclaw/openclaw#89659), [#​91684](openclaw/openclaw#91684), [#​91649](openclaw/openclaw#91649), [#​90263](openclaw/openclaw#90263), [#​91686](openclaw/openclaw#91686), [#​90426](openclaw/openclaw#90426)) Thanks [@​itsuzef](https://github.com/itsuzef), [@​ladygege](https://github.com/ladygege), [@​jacobtomlinson](https://github.com/jacobtomlinson), [@​fuller-stack-dev](https://github.com/fuller-stack-dev), and [@​shushushv](https://github.com/shushushv). - Cron: cancel active task runs cleanly, preserve terminal timeout/cancel state, and recover no-deliver tool warnings instead of silently losing the outcome. ([#​90666](openclaw/openclaw#90666), [#​90678](openclaw/openclaw#90678)) Thanks [@​ai-hpc](https://github.com/ai-hpc). - Gateway/config/auth: share the approval runtime socket token, replace arrays explicitly in `config.patch`, skip the deleted-agent guard only for valid ACP harness sessions, surface headless LaunchAgent state, verify SQLite auth migration before cleanup, and arm QMD startup maintenance. ([#​87105](openclaw/openclaw#87105), [#​91551](openclaw/openclaw#91551), [#​91219](openclaw/openclaw#91219), [#​91614](openclaw/openclaw#91614), [#​91740](openclaw/openclaw#91740), [#​91978](openclaw/openclaw#91978)) Thanks [@​fuller-stack-dev](https://github.com/fuller-stack-dev) and [@​scotthuang](https://github.com/scotthuang). - Providers/Codex: clarify quota errors, restore the Codex synthetic usage line, canonicalize Codex protocol assets, require API-key auth for realtime voice, normalize ACP model refs, preserve Gemma 4 `reasoning_content`, and avoid guardian review for local models. 
([#​91390](openclaw/openclaw#91390), [#​91709](openclaw/openclaw#91709), [#​91507](openclaw/openclaw#91507), [#​91567](openclaw/openclaw#91567), [#​88630](openclaw/openclaw#88630), [#​91696](openclaw/openclaw#91696)) Thanks [@​hxy91819](https://github.com/hxy91819), [@​brokemac79](https://github.com/brokemac79), [@​RomneyDa](https://github.com/RomneyDa), [@​joshavant](https://github.com/joshavant), and [@​Coder-Wangyankun](https://github.com/Coder-Wangyankun). - Updates/builds: recover package Gateway restarts after refresh failure, expose plugin convergence repair, fall back to Corepack in PATH-less pnpm environments, seed the correct Docker store packages, and keep ClawHub dry-run and publish paths reusable. ([#​91581](openclaw/openclaw#91581), [#​91599](openclaw/openclaw#91599), [#​91547](openclaw/openclaw#91547), [#​91591](openclaw/openclaw#91591)) Thanks [@​fuller-stack-dev](https://github.com/fuller-stack-dev), [@​sallyom](https://github.com/sallyom), and [@​Patrick-Erichsen](https://github.com/Patrick-Erichsen). - UI: require explicit user intent before opening chat sessions and drain restored chat queues after session switches. ([#​91480](openclaw/openclaw#91480)) Thanks [@​TurboTheTurtle](https://github.com/TurboTheTurtle). - Android: avoid the `dataSync` foreground-service type for persistent nodes. ([#​80082](openclaw/openclaw#80082)) Thanks [@​davelutztx](https://github.com/davelutztx). - Native hooks: bound relay lifetimes so abandoned native hook connections cannot linger indefinitely. ([#​91550](openclaw/openclaw#91550)) Thanks [@​joshavant](https://github.com/joshavant). </details> --- ### Configuration 📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 
🔕 **Ignore**: Close this PR and you won't be reminded about these updates again. --- This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate). Reviewed-on: https://git.erwanleboucher.dev/eleboucher/homelab/pulls/1040
Summary
- `thread/compact/start` only as a bounded secondary follow-up after successful owning context-engine compaction
- `/new` stale model cleanup, and the real Codex app-server bridge path

Refs #90496. This addresses the live-reproduced context-engine/native-compaction breakage from the issue and adds direct regression coverage for two adjacent issue symptoms. It does not claim to close every reported symptom around real production Codex provider 4xxs, scheduled-session metadata recreation, or automatic transcript rotation policy.
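As a rough illustration of the ordering described above, the following sketch runs an owning compactor first and requests a native follow-up only after the owner succeeds. It is not the code in this PR: `CompactionResult`, `Compactor`, and `compactWithOwnership` are hypothetical names, the real path is asynchronous and bounded by a timeout, and the flat `details.nativeHarnessCompaction` nesting only approximates the shape seen in the proof output.

```typescript
// Hypothetical shapes for illustration; not the actual OpenClaw types.
type CompactionResult = {
  ok: boolean;
  compacted: boolean;
  reason?: string;
  details?: Record<string, unknown>;
};

type Compactor = () => CompactionResult;

/**
 * Runs the owning context engine first. The native follow-up is requested
 * only after the owner reports a successful compaction, and its outcome is
 * nested under `details` so it cannot replace the primary result.
 */
function compactWithOwnership(
  contextEngine: Compactor,
  requestNativeFollowUp: Compactor,
): CompactionResult {
  const primary = contextEngine();
  if (!primary.ok || !primary.compacted) {
    // The owner did not compact, so no secondary request is sent.
    return primary;
  }

  let secondary: CompactionResult;
  try {
    secondary = requestNativeFollowUp();
  } catch (err) {
    // A failing follow-up is recorded, not propagated.
    secondary = { ok: false, compacted: false, reason: String(err) };
  }

  return {
    ...primary,
    details: { ...primary.details, nativeHarnessCompaction: secondary },
  };
}

// Usage: a secondary 4xx stays nested and the primary result is unchanged.
const outcome = compactWithOwnership(
  () => ({ ok: true, compacted: true, reason: "context-engine-compacted" }),
  () => ({ ok: false, compacted: false, reason: "provider_error_4xx" }),
);
console.log(outcome.ok, outcome.compacted, outcome.reason);
```

Keeping the follow-up's outcome nested is what lets a caller treat the context-engine result as authoritative while still seeing what the native path reported.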
Real behavior proof
Behavior addressed: Budget auto-compaction can be short-circuited by the Codex app-server native path before an owning context engine compacts the session.
Real environment tested: Final-head local runtime proof on `b5db02fd5a1cd673a17b86cc92eef10792a2b82a`, using the queued compaction entry point plus a proof harness that calls the real Codex app-server compact bridge with a fake client at the socket boundary. Live Discord lane was also tested with a Convex-leased `discord` credential and gateway started from this branch.
Exact steps or command run after this patch: `pnpm tsx /private/tmp/openclaw-90496-final-head-proof.ts`. Live Discord proof was run with 1Password-injected Convex broker credentials: `op-with-service-account run --env-file /private/tmp/openclaw-90496-op.env -- pnpm tsx /private/tmp/openclaw-90496-discord-proof.ts`.
Evidence after fix: Final-head proof output showed queued context-engine compaction succeeded and the secondary Codex bridge sent `thread/compact/start` with `request:"after_context_engine"`:

    {
      "sha": "b5db02fd5a1cd673a17b86cc92eef10792a2b82a",
      "compactResult": {
        "ok": true,
        "compacted": true,
        "reason": "proof-context-engine-compacted",
        "result": {
          "details": {
            "engine": "proof-context-engine",
            "nativeHarnessCompaction": {
              "ok": true,
              "compacted": false,
              "result": {
                "details": {
                  "backend": "codex-app-server",
                  "signal": "thread/compact/start",
                  "pending": true,
                  "request": "after_context_engine",
                  "trigger": "budget"
                }
              }
            }
          }
        }
      },
      "bridgeRequests": [
        { "method": "thread/compact/start", "params": { "threadId": "thread-proof-final-head" } }
      ],
      "bindingAfter": {
        "contextEngine": {
          "schemaVersion": 1,
          "engineId": "proof-context-engine",
          "policyFingerprint": "proof-policy"
        }
      },
      "assertions": {
        "contextEngineCompacted": true,
        "contextEngineMarkerWritten": true,
        "secondaryCodexResultOk": true,
        "secondaryCodexRequestMarked": true,
        "secondaryCodexBridgeSignal": true,
        "outboundBridgeRequest": true,
        "projectionCleared": true,
        "didNotTakeOldNonManualSkip": true
      }
    }

Observed result after fix: The owning context engine compacts first. Only after that succeeds, the secondary Codex bridge sends `thread/compact/start` with `request:"after_context_engine"`, clears the stale context-engine projection marker, and does not take the old `reason:"non_manual_trigger"` skip path.
What was not tested live: A real production Codex provider `provider_error_4xx`, scheduled/reminder stale model metadata recreation, and a new automatic transcript-rotation product policy for genuinely unrecoverable compaction.
Added Issue-Shape Regression Proof
- `src/agents/embedded-agent-runner/compact.hooks.test.ts` now covers a secondary Codex/native `provider_error_4xx` after successful context-engine compaction. The primary result remains `ok:true`, `compacted:true`, and the 4xx is nested under `codexNativeCompaction` instead of overriding the context-engine result.
- `src/auto-reply/reply/session.test.ts` now covers the exact Discord channel shape from #90496 ("Discord channel remains trapped in oversized session after /new; compaction fails provider_error_4xx and model drifts from codex/gpt-5.5 to gpt-5.4"): a stale `codex/gpt-5.4` auto fallback from `openai/gpt-5.5`, oversized token counters, and `/new`. The reset creates a new session and clears provider/model fallback provenance plus stale token counters so the configured `codex/gpt-5.5` default can apply.
Live Discord proof
The live Discord proof on this branch leased a `discord` credential from Convex, sent a real marker mention, observed the SUT bot reply, reset a seeded stale session entry, and invoked queued budget compaction for the Discord channel session.
Redacted live output:

    {
      "credential": { "kind": "discord", "source": "convex", "id": "<redacted>" },
      "discord": {
        "guildId": "<redacted>",
        "channelId": "<redacted>",
        "sentMessageId": "<redacted>",
        "replyMessageId": "<redacted>",
        "markerObserved": true
      },
      "reset": {
        "rpcOk": true,
        "before": {
          "providerOverride": "codex",
          "modelOverride": "gpt-5.4",
          "fallbackOriginProvider": "openai",
          "fallbackOriginModel": "gpt-5.5"
        },
        "after": {}
      },
      "compaction": {
        "result": { "ok": true, "compacted": true, "reason": "proof-context-engine-compacted" },
        "markerWritten": true,
        "markerEvents": [
          { "sessionKey": "agent:qa:discord:channel:<redacted>", "target": "budget", "force": false }
        ]
      }
    }

Regression proof
On current `origin/main` at `80f1ae6ffe`, the same live Discord proof reproduced the narrowed failure: Discord marker observed, reset clean, but budget compaction returned `compacted:false`, `reason:"codex app-server owns automatic compaction"`, and `markerWritten:false`.
Verification
- `.agents/skills/autoreview/scripts/autoreview --mode local` passed with no accepted/actionable findings after the added 90496 edge-case tests.
- `node scripts/run-vitest.mjs src/agents/embedded-agent-runner/compact.hooks.test.ts src/auto-reply/reply/session.test.ts` passed, 69/69 agent tests and 104/104 auto-reply session tests.
- `node scripts/run-vitest.mjs extensions/codex/src/app-server/compact.test.ts` passed, 19/19 Codex bridge tests.
- `pnpm tsx /private/tmp/openclaw-90496-final-head-proof.ts` passed on `b5db02fd5a1cd673a17b86cc92eef10792a2b82a`.
- `discord` credential.
- `git diff --check` passed.
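
For readers who want a concrete picture of the `/new` reset behavior covered by the session regression test, here is a minimal sketch. The override and fallback field names are taken from the redacted proof output; everything else is an assumption: `SessionEntry`, `resetSessionEntry`, `sessionId`, and `totalTokens` are hypothetical, and the rule that an override without fallback provenance is kept as an explicit user choice is this sketch's own simplification, not something the PR states.

```typescript
// Field names mirror the redacted proof output; `sessionId` and
// `totalTokens` are hypothetical placeholders for this sketch.
type SessionEntry = {
  sessionId: string;
  providerOverride?: string;
  modelOverride?: string;
  fallbackOriginProvider?: string;
  fallbackOriginModel?: string;
  totalTokens?: number;
};

/**
 * Builds the entry for a fresh session after `/new`.
 * Assumption: an override that carries fallback provenance was set by an
 * automatic fallback and is dropped; an override without provenance is
 * treated as an explicit user choice and kept.
 */
function resetSessionEntry(prev: SessionEntry, newSessionId: string): SessionEntry {
  const cameFromFallback =
    prev.fallbackOriginProvider !== undefined ||
    prev.fallbackOriginModel !== undefined;

  const next: SessionEntry = { sessionId: newSessionId };
  if (!cameFromFallback) {
    if (prev.providerOverride !== undefined) next.providerOverride = prev.providerOverride;
    if (prev.modelOverride !== undefined) next.modelOverride = prev.modelOverride;
  }
  // Token counters and fallback provenance are never carried over.
  return next;
}

// Usage: the stale entry from the issue shape resets to a clean entry,
// so the configured default model can apply again.
const stale: SessionEntry = {
  sessionId: "session-old",
  providerOverride: "codex",
  modelOverride: "gpt-5.4",
  fallbackOriginProvider: "openai",
  fallbackOriginModel: "gpt-5.5",
  totalTokens: 999_999,
};
const fresh = resetSessionEntry(stale, "session-new");
console.log(JSON.stringify(fresh));
```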