Regression: preflight compaction still surfaces missing Codex thread failure after #86602 #87736
Closed
Labels
- P1: High-priority user-facing bug, regression, or broken workflow.
- clawsweeper:linked-pr-open: ClawSweeper found an open linked pull request for this issue.
- clawsweeper:no-new-fix-pr: ClawSweeper does not recommend queueing a new automated fix PR for this issue.
- clawsweeper:source-repro: ClawSweeper found a high-confidence source-level issue reproduction.
- impact:message-loss: Channel message delivery can be lost, duplicated, or misrouted.
- impact:session-state: Session, memory, transcript, context, or agent state can drift or corrupt.
- issue-rating: 🦞 diamond lobster: Very strong issue quality with high-confidence source-level or clear reproduction.
Summary
The missing Codex thread preflight-compaction failure from #86211 reproduced again on OpenClaw 2026.5.27, after #86211 was closed as fixed by #86602.
A Telegram group inbound was accepted, but dispatch failed before a normal assistant turn. Telegram surfaced the generic user-facing fallback:
No private chat ids, message ids, session ids, bot handles, or raw Codex thread ids are included here.
Environment
OpenClaw 2026.5.27
Expected Behavior
When a Codex app-server preflight compaction thread is stale/missing, OpenClaw should classify it as recoverable missing/stale binding state and continue into the existing recovery/fresh-thread path. Telegram users should not see the generic failure for this condition.
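The expected classification can be sketched as follows. This is a minimal illustration, not the OpenClaw implementation: the `CompactionResult` shape, `isRecoverableMissingThread`, and `runPreflight` are hypothetical names, and matching the raw text `thread not found` in an error string is an assumption about how the unstructured failure arrives. Only the two `failure.reason` values come from this issue.

```typescript
// Hypothetical result shape; not taken from the OpenClaw source.
type CompactionResult =
  | { ok: true }
  | { ok: false; failure?: { reason?: string }; error?: string };

// Structured reasons this issue names as already recoverable.
const RECOVERABLE_REASONS = new Set<string>([
  "stale_thread_binding",
  "missing_thread_binding",
]);

// Assumption: the raw failure carries "thread not found" in an error string.
function isRecoverableMissingThread(result: CompactionResult): boolean {
  if (result.ok) return false;
  const reason = result.failure?.reason;
  if (reason !== undefined && RECOVERABLE_REASONS.has(reason)) return true;
  return /thread not found/i.test(result.error ?? "");
}

type PreflightOutcome = "compacted" | "fresh-thread";

// Recoverable failures fall through to the fresh-thread path; anything
// else still throws, which is what surfaces the generic fallback.
function runPreflight(result: CompactionResult): PreflightOutcome {
  if (result.ok) return "compacted";
  if (isRecoverableMissingThread(result)) return "fresh-thread";
  throw new Error(result.error ?? "preflight compaction failed");
}
```

In this sketch, a structured `stale_thread_binding` result and a raw `thread not found` result both route to the fresh-thread path, while an unrelated failure still throws.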
Actual Behavior
Gateway logs showed dispatch failing before a normal assistant turn. First, preflight compaction timed out waiting on the Codex app-server compaction thread; immediately afterward, a subsequent inbound failed with raw `thread not found` for the same redacted Codex thread id.
Redacted Runtime Log Excerpt
This is the relevant sequence from the gateway journal. Private chat id, message ids, host, and Codex thread id are redacted.
Why this still looks related to #86211 / #86602
#86211 and #86602 are the direct lineage for missing/stale Codex thread recovery during preflight compaction.
Current source recovers structured missing/stale binding results such as:
- `failure.reason=stale_thread_binding`
- `failure.reason=missing_thread_binding`
This recurrence suggests there is still a path where a raw `thread not found` compaction result reaches the preflight failure throw without being classified as recoverable, or where recovery happens only after one or more user-visible failed dispatches.
The raw `thread not found` appears immediately after a `timed out waiting for codex app-server compaction` failure for the same redacted Codex thread id. That suggests the remaining producer may be the timeout/retry boundary around Codex app-server compaction, not the direct Codex app-server `thread not found` path already normalized by #86602.
After-Patch Local Proof for PR #87738
I ran the patched runtime helper locally against a redacted Telegram-group session entry and a raw compaction result shaped as:
Observed output:
{
  "proof": "raw thread-not-found preflight compaction is recoverable",
  "returnedSameSessionEntry": true,
  "compactCalls": 1,
  "incrementCalls": 0,
  "threw": false
}
What is still not proven: a live Telegram run against a production gateway with PR #87738 installed.
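For readers who want to reproduce the shape of that check, a harness along these lines would report the same four fields. It is a sketch under stated assumptions, not the patched runtime helper: `preflightCompact`, `runProof`, `SessionEntry`, and the raw result shape `{ ok: false, error: "thread not found" }` are all hypothetical placeholders, and reading `incrementCalls` as "the compaction counter was bumped" is a guess.

```typescript
// All names are hypothetical stand-ins, not the patched OpenClaw helper.
interface SessionEntry { sessionKey: string; compactionCount: number }
interface RawCompactionResult { ok: boolean; error?: string }

interface ProofReport {
  returnedSameSessionEntry: boolean;
  compactCalls: number;
  incrementCalls: number;
  threw: boolean;
}

// Stand-in for the helper under test: a raw thread-not-found failure returns
// the entry unchanged; the counter is bumped only after a successful compaction.
function preflightCompact(
  entry: SessionEntry,
  compact: () => RawCompactionResult,
  increment: (e: SessionEntry) => SessionEntry,
): SessionEntry {
  const result = compact();
  if (result.ok) return increment(entry);
  if (/thread not found/i.test(result.error ?? "")) return entry; // recoverable
  throw new Error(result.error ?? "preflight compaction failed");
}

// Runs the helper once with counting spies and records what happened.
function runProof(raw: RawCompactionResult): ProofReport {
  const entry: SessionEntry = { sessionKey: "redacted", compactionCount: 0 };
  let compactCalls = 0;
  let incrementCalls = 0;
  let threw = false;
  let returned: SessionEntry | undefined;
  try {
    returned = preflightCompact(
      entry,
      () => { compactCalls += 1; return raw; },
      (e) => { incrementCalls += 1; return { ...e, compactionCount: e.compactionCount + 1 }; },
    );
  } catch {
    threw = true;
  }
  return { returnedSameSessionEntry: returned === entry, compactCalls, incrementCalls, threw };
}

// Assumed raw shape; the actual shape used in the local proof is not shown here.
const report = runProof({ ok: false, error: "thread not found" });
```

With the assumed raw result, `report` has `returnedSameSessionEntry: true`, `compactCalls: 1`, `incrementCalls: 0`, and `threw: false`, matching the fields in the observed output. Like the original proof, this exercises the helper in isolation and says nothing about a live gateway.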
Related