[Bug] Orphaned/oversized native Codex thread wedges a session permanently — chat.send returns started but no run executes, silently dropping messages #86963
OpenClaw version: 2026.5.22 (a374c3a)
Component: Agents/Codex · Codex app-server native threads · context compaction
Severity: High (silent, permanent message loss on an affected session; no error surfaced)
Summary
A Codex (openai-codex / gpt-5.5) WebChat session became permanently wedged: it reports status=done, chat.send returns "started", lastInteractionAt advances — but no agent run ever executes, nothing is written to the transcript, and the user gets no reply. Every subsequent message is silently dropped. The session sits at ~212k tokens with compactionCount=0.
The existing Codex compaction-recovery mechanisms do not clear it:
The native Codex app-server restarts on dispatch (observed: a fresh app-server process spawned at the exact moment of a new send) yet the very next turn still stalls — so the "restart the native app-server and retry once when server-side compaction times out" path (Recover stuck Codex compaction #85500) loads the same oversized thread and stalls again.
A full gateway restart does not help (the oversized native thread is persisted in codex-home and restored on resume).
Timeline / how it got into this state
The session's last successful turn ended while the agent was mid-work during a gateway reload (the assistant's final transcript line was literally "I got stuck at the gateway reload step … recovered"). After that point the native Codex thread appears orphaned/oversized, and the session never runs another turn.
Reproduction signature (from gateway log)
chat.send "started" (runId …) # accepted in ~200-340ms
codex plugin thread config eligibility # thread check runs
<then nothing — no run, no compaction completion, no transcript write, no error>
webchat disconnect (user reloads)
openclaw tasks audit does not flag this (there is no stuck task — it's a thread/compaction-layer stall). The only visible signature is in the session registry: status=done with lastInteractionAt advancing far past endedAt while compactionCount stays 0.
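Until tasks audit covers this, the registry signature above could be checked by a small script. The following is a minimal sketch, not OpenClaw code: the dict layout, the ISO-8601 timestamp format, and the 5-minute threshold are assumptions for illustration; only the field names status, lastInteractionAt, endedAt and compactionCount come from this report, and the sample timestamps are made up.

```python
from datetime import datetime, timedelta

# Hypothetical threshold: how far lastInteractionAt may run past endedAt
# before the session is treated as suspicious.
STALL_GAP = timedelta(minutes=5)


def looks_wedged(session: dict, gap: timedelta = STALL_GAP) -> bool:
    """Return True if a registry record matches the wedge signature.

    Signature: status is "done", compactionCount is 0, and
    lastInteractionAt is later than endedAt by more than `gap`.
    The record shape is assumed; timestamps are parsed as ISO-8601 strings.
    """
    if session.get("status") != "done":
        return False
    if session.get("compactionCount", 0) != 0:
        return False
    ended = session.get("endedAt")
    last = session.get("lastInteractionAt")
    if ended is None or last is None:
        return False
    return datetime.fromisoformat(last) - datetime.fromisoformat(ended) > gap


# Made-up sample records, for illustration only.
wedged = {
    "status": "done",
    "endedAt": "2026-01-01T10:00:00+00:00",
    "lastInteractionAt": "2026-01-01T14:30:00+00:00",
    "compactionCount": 0,
}
healthy = dict(wedged, lastInteractionAt="2026-01-01T10:00:05+00:00")

print(looks_wedged(wedged), looks_wedged(healthy))
```

A session that was compacted at least once, or whose last interaction is close to endedAt, is not flagged, which keeps the check narrow to the state described here.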
Expected
A session must never silently drop messages. Either:
preflight compaction on an oversized native thread must succeed or force-rotate the thread (start a fresh native thread seeded from a summary) so the turn can proceed, or
if the thread cannot be made runnable, surface a visible error to the user (and ideally to tasks audit / health) rather than accepting chat.send and silently running nothing.
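The two options above can be read as one preflight decision made before a turn is dispatched. This is a rough sketch under stated assumptions, not the actual gateway logic: prepare_thread, ThreadNotRunnable, the compact/rotate callables and the 200k limit are all hypothetical names and values; only the ~212k token figure comes from this report.

```python
class ThreadNotRunnable(Exception):
    """Raised so the caller can show the user a visible error."""


def prepare_thread(tokens, limit, compact, rotate):
    """Decide how to make a thread runnable before dispatching a turn.

    compact() and rotate() are hypothetical callables that return the token
    count of the resulting thread, or raise on failure (e.g. a timeout).
    Returns (action, tokens), or raises ThreadNotRunnable if nothing worked.
    """
    if tokens <= limit:
        return ("run", tokens)
    # Try compaction first, then rotation to a fresh thread seeded from a summary.
    for action, step in (("compacted", compact), ("rotated", rotate)):
        try:
            new_tokens = step()
        except Exception:
            continue  # this recovery option failed; try the next one
        if new_tokens <= limit:
            return (action, new_tokens)
    # Neither option produced a runnable thread: fail loudly, never silently.
    raise ThreadNotRunnable(
        f"thread has {tokens} tokens (limit {limit}); "
        "compaction and rotation both failed"
    )


def failing_compact():
    # Stand-in for a server-side compaction that times out.
    raise TimeoutError("compaction timed out")


# Compaction fails, rotation yields a small fresh thread.
result = prepare_thread(212_000, 200_000, failing_compact, lambda: 4_000)
print(result)
```

The point of the sketch is the last branch: if both recovery paths fail, chat.send should get an error back rather than "started".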
Notes
There is no targeted CLI to reset a single session's native Codex thread (openclaw sessions offers only cleanup/export-trajectory/list); openclaw doctor --fix only restarts the gateway and does not clear the thread. The only practical recovery is abandoning the session.