fix: recover Codex binding after stale preflight compaction #86216
pfrederiksen wants to merge 3 commits into
Codex review: needs real behavior proof before merge. Reviewed May 25, 2026, 9:02 AM ET / 13:02 UTC.
Summary
PR surface: Source +15, Tests +205. Total +220 across 6 files.
Reproducibility: yes, from source inspection: current main returns structured missing or stale binding failures from Codex compaction, but auto-reply preflight still throws before dispatch can run. I did not run a live Telegram reproduction in this read-only review.
Review metrics: 2 noteworthy metrics.
Merge readiness
Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
Rank-up moves:
Proof guidance: Mantis proof suggestion Risk before merge
Maintainer options:
Next step before merge
Security
Review details
Best possible solution: Rebase the PR onto current
Do we have a high-confidence way to reproduce the issue? Yes, from source inspection: current main returns structured missing or stale binding failures from Codex compaction, but auto-reply preflight still throws before dispatch can run. I did not run a live Telegram reproduction in this read-only review.
Is this the best way to solve the issue? Yes, the proposed shape is the right narrow fix direction: treat only structured binding failures as recoverable, keep non-binding failures fail-closed, and let Codex binding recovery replace the stale thread. It still needs rebase and live transport proof before merge.
AGENTS.md: found and applied where relevant.
Codex review notes: model gpt-5.5, reasoning high; reviewed against 5d018034f665.
Label changes:
Label justifications:
Evidence reviewed
PR surface: Source +15, Tests +205. Total +220 across 6 files.
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
@clawsweeper re-review
Updated after your feedback:
Mantis note: I requested Telegram proof in #86216 (comment), but direct workflow_dispatch is blocked for this contributor account by repository admin rights. A maintainer can dispatch the privileged Mantis proof workflow for PR 86216.
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
/openclaw-mantis telegram-visible-proof
Please capture redacted live Telegram/Codex visible proof for PR #86216. Target behavior:
Contributor note: direct workflow_dispatch is blocked for this account with HTTP 403 ("Must have admin rights to Repository"), so this comment is the maintainer-runnable Mantis request ClawSweeper asked for.
OpenClaw proof gate has now looked again on the current head and passes:
I updated the PR body to reflect that this is no longer a contributor-side Mantis dispatch blocker. Please re-review against the current PR state and the passing OpenClaw proof gate.
@clawsweeper re-review
🦞👀 Command router queued. I will update this comment with the next step. Re-review progress:
@openclaw-mantis telegram live proof: verify PR head recovers from stale or missing Codex thread binding during preflight compaction and posts a visible Telegram reply with redacted diagnostics.
PR: #86216
Please publish redacted runtime evidence/artifacts showing:
@clawsweeper re-review
🦞👀 Command router queued. I will update this comment with the next step. Re-review progress:
@clawsweeper re-review
I added the proof directly as plaintext in the PR summary under
🦞👀 Command router queued. I will update this comment with the next step. Re-review progress:
Closing per maintainer request.
@pfrederiksen pls add which maintainer asked you there. |
Summary
Fixes #86211.
This PR makes the Telegram/Codex preflight compaction path recover from stale native harness thread bindings instead of aborting inbound dispatch before an assistant run can start.
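The recovery rule can be illustrated with a minimal sketch. It assumes a hypothetical `CompactionResult` shape and a hypothetical `resolvePreflightCompaction` helper; neither name is taken from the OpenClaw source, and the real types may differ. Only the two structured reason codes and the error prefix come from this PR. The point of the sketch is that recovery keys off the structured reason, never off message text, so every other failure stays fail-closed.

```typescript
// Hypothetical shapes for illustration; the real OpenClaw types may differ.
interface CompactionResult {
  ok: boolean;
  reason?: string; // structured failure code, when the compactor provides one
  message?: string; // free-form diagnostic text, never used for classification
}

interface SessionEntry {
  sessionId: string;
}

// The two structured binding failures this PR treats as recoverable.
const RECOVERABLE_REASONS = new Set<string>([
  "stale_thread_binding",
  "missing_thread_binding",
]);

function isRecoverableBindingFailure(result: CompactionResult): boolean {
  return (
    !result.ok &&
    result.reason !== undefined &&
    RECOVERABLE_REASONS.has(result.reason)
  );
}

// Returns the session entry that dispatch should continue with, or throws
// for any failure that is not a structured binding failure (fail-closed).
// Simplification: a successful compaction also returns the current entry here.
function resolvePreflightCompaction(
  result: CompactionResult,
  current: SessionEntry,
): SessionEntry {
  if (result.ok || isRecoverableBindingFailure(result)) {
    return current;
  }
  throw new Error(
    `Preflight compaction required but failed: ${result.message ?? "unknown error"}`,
  );
}
```

Under these assumptions, a result carrying only a "thread not found" message and no structured reason still throws, which matches the stated goal of keeping non-binding failures fail-closed.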
Changes:
Plaintext Proof Summary
Updated: 2026-05-25T03:44:30Z
This PR now includes the code repair ClawSweeper requested for the current head a9ad8a918eba2b33d2766a8fda90a59fa0c30110.
Code repair made after ClawSweeper review:
authProfileId, model, approvalPolicy, sandbox, and serviceTier when creating the replacement thread.
thread not found.
Validation passed after rebasing on current origin/main:
node scripts/run-vitest.mjs run --config test/vitest/vitest.extensions.config.ts extensions/codex/src/app-server/compact.test.ts extensions/codex/src/conversation-binding.test.ts
node scripts/run-vitest.mjs run --config test/vitest/vitest.auto-reply-reply.config.ts src/auto-reply/reply/agent-runner-memory.test.ts
node scripts/run-vitest.mjs run --config test/vitest/vitest.agents-support.config.ts src/agents/command/cli-compaction.test.ts
npx oxlint --tsconfig config/tsconfig/oxlint.extensions.json extensions/codex/src/app-server/compact.ts extensions/codex/src/app-server/compact.test.ts extensions/codex/src/conversation-binding.ts extensions/codex/src/conversation-binding.test.ts
npx oxlint --tsconfig config/tsconfig/oxlint.core.json src/auto-reply/reply/agent-runner-memory.ts src/auto-reply/reply/agent-runner-memory.test.ts
OpenClaw proof evidence on this repaired head:
a9ad8a918eba2b33d2766a8fda90a59fa0c30110
Real behavior proof workflow: https://github.com/openclaw/openclaw/actions/runs/26381456712
External PR includes after-fix real behavior proof.
Known limitation:
Real behavior proof
Behavior or issue addressed: Inbound Telegram/Codex dispatch no longer aborts when preflight compaction sees a stale or missing Codex app-server thread binding. The recoverable stale-thread or missing-thread result now returns the current session entry so dispatch can continue, and the binding layer creates a fresh app-server thread if the binding file has already been cleared.
Real environment tested: Local OpenClaw checkout at /root/.openclaw/workspace/openclaw-upstream on Linux with Node 22.22.2, branch fix-86211-missing-thread-preflight, commit a9ad8a9, rebased on current origin/main.
Exact steps or command run after this patch: Ran targeted OpenClaw runtime checks in the local checkout: node scripts/run-vitest.mjs run --config test/vitest/vitest.auto-reply-reply.config.ts src/auto-reply/reply/agent-runner-memory.test.ts; node scripts/run-vitest.mjs run --config test/vitest/vitest.extensions.config.ts extensions/codex/src/conversation-binding.test.ts; node scripts/run-vitest.mjs run --config test/vitest/vitest.extensions.config.ts extensions/codex/src/app-server/compact.test.ts; node scripts/run-vitest.mjs run --config test/vitest/vitest.agents-support.config.ts src/agents/command/cli-compaction.test.ts; focused npx oxlint on the changed core and extension files.
Evidence after fix: Local terminal output showed agent-runner-memory.test.ts passed with 27 tests, conversation-binding.test.ts passed with 13 tests, compact.test.ts passed with 26 tests, cli-compaction.test.ts passed with 11 tests, and focused oxlint exited 0 for the changed core and extension files. The command output was captured after rebasing on origin/main; PR text uses only redacted placeholders such as and no chat IDs.
Observed result after fix: The stale-binding and missing-binding preflight regressions no longer throw "Preflight compaction required but failed: thread not found: " or the equivalent missing-binding failure for structured stale_thread_binding or missing_thread_binding results. Non-binding failures still throw. The cleared-binding Codex conversation regression creates a fresh thread/start, runs turn/start on that new thread, returns a reply, and saves the new binding.
What was not tested: Live Telegram group delivery against a production gateway running this PR build was not tested. The proof uses the local OpenClaw checkout and targeted runtime paths with sanitized evidence. I requested the repository Mantis Telegram proof path in PR comment #86216 (comment), but direct workflow dispatch requires maintainer/admin rights.
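The cleared-binding path described under "Observed result after fix" can be sketched as follows. This is an illustration under stated assumptions, not the extension's code: `AppServerClient`, `BindingStore`, `runTurnWithBindingRecovery`, and the `threadId`, `text`, and `reply` payload fields are all hypothetical. Only the `thread/start` and `turn/start` method names come from this PR.

```typescript
// Hypothetical interfaces; the real Codex extension types and payloads may differ.
interface ThreadBinding {
  threadId: string;
}

interface AppServerClient {
  request(
    method: string,
    params: Record<string, unknown>,
  ): Promise<Record<string, unknown>>;
}

interface BindingStore {
  load(): ThreadBinding | undefined; // undefined when the binding was cleared
  save(binding: ThreadBinding): void;
}

async function runTurnWithBindingRecovery(
  client: AppServerClient,
  store: BindingStore,
  text: string,
): Promise<string> {
  let binding = store.load();
  let createdThread = false;

  if (binding === undefined) {
    // Binding already cleared: start a fresh thread instead of failing.
    const started = await client.request("thread/start", {});
    binding = { threadId: String(started.threadId) }; // assumed response field
    createdThread = true;
  }

  // Run the turn on whichever thread is now bound.
  const turn = await client.request("turn/start", {
    threadId: binding.threadId,
    text,
  });

  // Persist the new binding only after the turn has succeeded.
  if (createdThread) {
    store.save(binding);
  }
  return String(turn.reply); // assumed response field
}
```

Saving the binding after the turn completes is a design choice of this sketch: a thread that fails on its first turn is never persisted as the session's binding.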
Validation
Passed locally:
Also ran node scripts/check-changed.mjs. Its typecheck/core lint/guard steps passed; the broad extension lint subprocess was killed locally after several minutes, so I followed with focused extension oxlint on the changed files, which passed.
OpenClaw proof gate and live proof status
The repository OpenClaw Real behavior proof workflow has re-evaluated the current PR head and passed its parser gate.
a9ad8a918eba2b33d2766a8fda90a59fa0c30110
External PR includes after-fix real behavior proof.
ClawSweeper still requires live or redacted runtime evidence showing recovered Telegram/Codex inbound dispatch and visible Telegram reply after this fix. I posted the requested Mantis command in the format ClawSweeper asked for:
Both Mantis runs stopped at the repository authorization gate because the commenter account has read permission, while Mantis requires write, maintain, or admin for issue-comment requests. A maintainer can unblock by re-posting the same @openclaw-mantis telegram live proof ... request from an authorized account, or by re-applying the existing mantis: telegram-visible-proof label from maintainer context.
Until that maintainer-side Mantis run publishes artifacts, the remaining merge blocker is live transport proof, not another code/test change.
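For reference, the authorization gate described above amounts to an allow-list check on the commenter's repository permission. The sketch below is an assumption about its shape, not the Mantis workflow's actual code; `RepoPermission` and `canRequestMantisProof` are hypothetical names. The allowed levels (write, maintain, admin) are the ones stated in this PR.

```typescript
// Hypothetical model of the gate; the real Mantis workflow may differ.
type RepoPermission = "none" | "read" | "triage" | "write" | "maintain" | "admin";

// Permission levels the PR text says Mantis accepts for issue-comment requests.
const MANTIS_ALLOWED = new Set<RepoPermission>(["write", "maintain", "admin"]);

function canRequestMantisProof(permission: RepoPermission): boolean {
  return MANTIS_ALLOWED.has(permission);
}
```

Under this model the contributor account, which has read permission, is rejected, while the same request posted from a maintainer account passes.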
Latest OpenClaw proof refresh
Updated: 2026-05-25T03:44:30Z
Real behavior proof: https://github.com/openclaw/openclaw/actions/runs/26381456712
a9ad8a918eba2b33d2766a8fda90a59fa0c30110
External PR includes after-fix real behavior proof.
Privacy
All issue and PR evidence is sanitized. Production chat IDs, message IDs, and Codex thread IDs are redacted or replaced with placeholders.
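One way to do this kind of sanitization is to replace known sensitive values with fixed placeholders before pasting output. The sketch below is illustrative only: `RedactionRule`, `redactKnownIds`, and the placeholder strings are hypothetical and do not describe the tooling actually used for this PR.

```typescript
// Hypothetical helper: replace known sensitive values with placeholders.
interface RedactionRule {
  placeholder: string; // text that replaces every listed value
  values: string[]; // exact sensitive strings to remove
}

function redactKnownIds(text: string, rules: RedactionRule[]): string {
  let out = text;
  for (const rule of rules) {
    for (const value of rule.values) {
      if (value.length === 0) continue; // skip empty values so the text is not corrupted
      // split/join replaces every literal occurrence without regex escaping.
      out = out.split(value).join(rule.placeholder);
    }
  }
  return out;
}
```

Because the helper matches exact strings rather than patterns, it only removes values it has been told about; any ID not listed in a rule passes through unchanged and still needs manual review.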