Bug type
Crash (process/app exits or hangs)
Beta release blocker
Yes
Summary
The in-process codex app-server plugin silently drops turns after 1–4 successful interactions. The OpenAI Responses API sends a notification:item/completed event but the turn never resolves. The gateway's Node.js event loop reaches 100% utilization with P99 delays exceeding 5 seconds (worst case 95 seconds) during codex turns, consistent with accumulated unclosed I/O (likely SSE response streams) choking the event loop until the 60-second idle timeout fires. Reproduced 9 times across 7 sessions on 2026.5.26 and 2026.5.25-beta.1. Non-codex models (PI runtime) through the same gateway never exhibit this. cc @Peetiegonzalez
Steps to reproduce
- Start the gateway normally (openclaw gateway) with 5 plugins loaded (brave, codex, discord, lossless-claw, memory-core)
- Select openai/gpt-5.5 (codex runtime, agentRuntime.id: "codex")
- Run /reset, /new to start a fresh session (any channel: Discord, GUI, TUI, or cron)
- Send a warm-up message: it succeeds, and the model responds in 2–5s server-side
- Send a task requiring tool calls (e.g. multi-step file reads, or any multi-tool task)
- Model begins working (tool calls execute, item/started and item/completed notifications arrive)
- Model goes silent mid-turn: no further SSE notifications arrive
- After exactly 60 seconds, the idle timeout fires: codex app-server turn idle timed out waiting for completion
- Auth profile is incorrectly marked as failed (auth_profile_failure_state_updated, reason: timeout)
Reliability: 9/9 unique codex threads timed out in a single day. Some completed 1–3 turns before failing; none survived more than ~4 turns of tool-call-heavy work.
Expected behavior
Codex turns should complete reliably. The standalone Codex CLI using the same binary (0.130.0), same OAuth credentials, and same model (GPT-5.5) responds in 2.5–3s with zero drops. The in-process codex plugin should match this reliability.
Actual behavior
Every codex session eventually times out with the same signature:
{
  "idleMs": 60001,
  "timeoutMs": 60000,
  "lastActivityReason": "notification:item/completed",
  "lastNotificationMethod": "item/completed"
}
The plugin receives item/completed (partial — e.g. for a tool call item) then goes completely idle. No response.completed or turn-level completion event arrives. After 60s silence, the idle timeout fires and the turn is abandoned.
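To triage logs for this failure, the signature above can be checked mechanically. The following is a minimal sketch, assuming the timeout payload is available as a parsed object with the four fields shown. The names `IdleTimeoutRecord` and `matchesDropSignature` are hypothetical and are not part of OpenClaw.

```typescript
// Shape of the idle-timeout payload as shown in the report (field names taken from the log).
interface IdleTimeoutRecord {
  idleMs: number;
  timeoutMs: number;
  lastActivityReason: string;
  lastNotificationMethod: string;
}

// Hypothetical helper: true when a record matches the dropped-turn signature,
// i.e. the idle timer expired and the last activity seen was item/completed.
function matchesDropSignature(r: IdleTimeoutRecord): boolean {
  return (
    r.idleMs >= r.timeoutMs &&
    r.lastNotificationMethod === "item/completed" &&
    r.lastActivityReason === "notification:item/completed"
  );
}

// The record reported in this issue.
const observed: IdleTimeoutRecord = {
  idleMs: 60001,
  timeoutMs: 60000,
  lastActivityReason: "notification:item/completed",
  lastNotificationMethod: "item/completed",
};

console.log(matchesDropSignature(observed));
```

Filtering a day's timeout entries through a check like this is one way to confirm that every failure shares the same last-activity reason.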
OpenClaw version
2026.5.26 (6f57286) — also reproduced on 2026.5.25-beta.1 (bbf4117)
Operating system
macOS Darwin 24.6.0, x86_64, Intel, 16GB RAM
Install method
openclaw update --channel dev (git clone + build), then sudo npm install -g /Users/admin/openclaw
Model
openai/gpt-5.5
Provider / routing chain
openclaw → codex plugin (in-process app-server) → OpenAI Responses API (Codex subscription OAuth)
Additional provider/model setup details
- agentRuntime.id: "codex" on the gpt-5.5 model entry (stock default config)
- Provider shows as openai-codex in session logs, API as openai-codex-responses
- Context engine: lossless-claw (LCM)
- Tested with both @openai/codex binary 0.133.0 (bundled) and 0.130.0 (from standalone CLI): same failure on both, ruling out a binary version regression
- Non-OpenAI models use PI runtime and never exhibit this issue through the same gateway
Logs, screenshots, and evidence
1. Event loop saturation — gateway liveness diagnostics during codex turns:
These are the gateway's own liveness warning entries. Every entry below was logged during an active codex turn. PI runtime turns on the same gateway never trigger liveness warnings.
| Timestamp (GMT+8) | P99 Delay | Max Delay | Utilization | Active Work |
| --- | --- | --- | --- | --- |
| 06:08:05 | 7,432ms | 7,432ms | 99.7% | codex embedded_run (age=432s) |
| 06:10:28 | 10,981ms | 10,981ms | 100% | codex embedded_run (age=574s) |
| 06:12:38 | 12,147ms | 12,147ms | 100% | codex tool_call |
| 06:14:50 | 17,767ms | 17,767ms | 99.9% | codex model_call |
| 09:00:13 | 95,429ms | 95,429ms | 100% | codex embedded_run (age=120s) |
| 09:02:37 | 16,945ms | 16,945ms | 100% | codex embedded_run (age=264s) |
| 09:11:40 | 34,393ms | 34,393ms | 100% | codex embedded_run (age=186s) |
| 09:30:39 | 16,853ms | 75,967ms | 98.2% | codex embedded_run |
| 13:45:23 | 2,027ms | 4,182ms | 77.2% | codex turn:start |
| 15:54:44 | 5,516ms | 5,516ms | 100% | codex item/started |
Worst case: event loop blocked for 95 seconds straight (09:00:13). At that point no I/O can be processed, no SSE events can be read, and the idle timeout is the only escape.
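A stall at least as long as the idle timeout can by itself let the idle timer expire, regardless of what the upstream API sent in the meantime. The sketch below checks the Max Delay values from the liveness entries above against the 60000 ms timeoutMs from the timeout signature. The names `LivenessSample` and `stallsAtOrAboveTimeout` are hypothetical, introduced only for illustration.

```typescript
// One liveness warning entry; values are copied from the diagnostics above.
interface LivenessSample {
  time: string;       // timestamp (GMT+8)
  maxDelayMs: number; // "Max Delay" column
}

const livenessSamples: LivenessSample[] = [
  { time: "06:08:05", maxDelayMs: 7432 },
  { time: "06:10:28", maxDelayMs: 10981 },
  { time: "06:12:38", maxDelayMs: 12147 },
  { time: "06:14:50", maxDelayMs: 17767 },
  { time: "09:00:13", maxDelayMs: 95429 },
  { time: "09:02:37", maxDelayMs: 16945 },
  { time: "09:11:40", maxDelayMs: 34393 },
  { time: "09:30:39", maxDelayMs: 75967 },
  { time: "13:45:23", maxDelayMs: 4182 },
  { time: "15:54:44", maxDelayMs: 5516 },
];

const IDLE_TIMEOUT_MS = 60000; // timeoutMs from the timeout signature

// Return the timestamps whose single longest stall met or exceeded the idle timeout.
function stallsAtOrAboveTimeout(samples: LivenessSample[], timeoutMs: number): string[] {
  return samples.filter((s) => s.maxDelayMs >= timeoutMs).map((s) => s.time);
}

console.log(stallsAtOrAboveTimeout(livenessSamples, IDLE_TIMEOUT_MS));
```

For the data in this report, the check flags the 09:00:13 (95,429ms) and 09:30:39 (75,967ms) entries. The remaining entries are shorter than the timeout, so a single stall alone would not explain those drops.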
2. TCP/fd accumulation — external process monitoring (5-second sample interval):
Gateway process (PID 13064) monitored during a GPT-5.5 tool-bench-2 run:
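| Time | RSS(MB) | CPU% | FDs | TCP | Established | Unix | Phase |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 23:03:35 | 656 | 1 | 85 | 5 | 3 | 1 | idle baseline |
| 23:04:07 | 656 | 104 | 85 | 5 | 3 | 1 | turn starts |
| 23:04:14 | 711 | 54 | 91 | 7 | 4 | 4 | +6 fds, +3 unix |
| 23:05:07 | 648 | 101 | 92 | 9 | 6 | 4 | tcp climbing |
| 23:05:13 | 705 | 8 | 93 | 10 | 7 | 4 | PEAK: 10 TCP, 7 established |
| 23:05:38 | 714 | 7 | 93 | 10 | 7 | 4 | peak again |
| 23:06:23 | 665 | 187 | 91 | 8 | 5 | 4 | CPU 187% |
| 23:07:40 | 661 | 103 | 86 | 6 | 4 | 1 | timeout fires, cleanup |
| 23:07:53 | 680 | 0.6 | 84 | 4 | 2 | 1 | back to baseline |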
TCP established connections climb from baseline 3 → peak 7 during turns. Unix fds jump 1 → 4. After the timeout fires and cleanup runs, everything drops back to baseline. Connections accumulate during active turns and don't fully drain between tool call rounds.
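The growth over baseline can be computed directly from the monitoring samples. This is a small sketch over the nine rows reported above. The names `ProcSample` and `peakDelta` are hypothetical, and the first row is assumed to be the idle baseline, as labelled in the Phase column.

```typescript
// One external monitoring sample; values are copied from the process monitor rows above.
interface ProcSample {
  time: string;
  fds: number;
  tcp: number;
  established: number;
  unix: number;
}

const procSamples: ProcSample[] = [
  { time: "23:03:35", fds: 85, tcp: 5, established: 3, unix: 1 }, // idle baseline
  { time: "23:04:07", fds: 85, tcp: 5, established: 3, unix: 1 },
  { time: "23:04:14", fds: 91, tcp: 7, established: 4, unix: 4 },
  { time: "23:05:07", fds: 92, tcp: 9, established: 6, unix: 4 },
  { time: "23:05:13", fds: 93, tcp: 10, established: 7, unix: 4 },
  { time: "23:05:38", fds: 93, tcp: 10, established: 7, unix: 4 },
  { time: "23:06:23", fds: 91, tcp: 8, established: 5, unix: 4 },
  { time: "23:07:40", fds: 86, tcp: 6, established: 4, unix: 1 },
  { time: "23:07:53", fds: 84, tcp: 4, established: 2, unix: 1 },
];

// Largest increase over the first (baseline) sample for each counter.
function peakDelta(rows: ProcSample[]): { fds: number; tcp: number; established: number; unix: number } {
  const base = rows[0];
  const peak = { fds: 0, tcp: 0, established: 0, unix: 0 };
  for (let i = 0; i < rows.length; i++) {
    const r = rows[i];
    peak.fds = Math.max(peak.fds, r.fds - base.fds);
    peak.tcp = Math.max(peak.tcp, r.tcp - base.tcp);
    peak.established = Math.max(peak.established, r.established - base.established);
    peak.unix = Math.max(peak.unix, r.unix - base.unix);
  }
  return peak;
}

console.log(peakDelta(procSamples)); // { fds: 8, tcp: 5, established: 4, unix: 3 }
```

The peak is +8 file descriptors, +5 TCP sockets (+4 established), and +3 Unix sockets over baseline. The final sample sits at or below baseline on every counter, which matches the observation that cleanup after the timeout releases everything.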
3. All 9 timeouts on 2026-05-26:
| # | Time(GMT+8) | Thread(prefix) | Session(prefix) | Channel | Turns before timeout |
| --- | --- | --- | --- | --- | --- |
| 1 | 12:05:29 | 019e626b | 36e08c7c | Discord | 3 successful |
| 2 | 13:46:53 | 019e62d0 | 3c7d931e | GUI dashboard | ~5 successful |
| 3 | 14:10:00 | 019e62df | 3c7d931e | GUI dashboard | ~7 successful |
| 4 | 16:01:47 | 019e634c | 98276c33 | Cron job | 0 (first turn) |
| 5 | 16:09:57 | 019e6346 | a793f936 | Discord | multiple |
| 6 | 19:21:40 | 019e6400 | 0fba2bb4 | Discord | 3 successful |
| 7 | 21:58:39 | 019e648f | 57306d96 | Discord | ~8 successful |
| 8 | 22:46:34 | 019e64be | 571f2479 | Discord | 1 successful |
| 9 | 23:07:37 | 019e64d1 | 0e065b71 | Discord | 2 successful |
All 9 have identical lastActivityReason: "notification:item/completed" signature.
4. Codex app-server runs in-process (no child process):
$ pgrep -P <gateway_pid> # no codex child
$ ps -p <gateway_pid> # only the node gateway process
Resource leaks in the codex plugin directly affect the gateway's event loop.
5. Binary version ruled out:
Swapped the bundled @openai/codex 0.133.0 → 0.130.0 (from standalone CLI, known working) and restarted the gateway. The same timeouts occurred on 0.130.0 (timeouts #8 and #9 above happened after the swap). The standalone Codex CLI at 0.130.0 works perfectly: 2.5–3s responses, zero drops.
Impact and severity
- Affected: All users relying on OpenAI models via the codex runtime (the default for openai/* models). All channels affected: Discord, GUI webchat/dashboard, TUI, cron jobs.
- Severity: High — blocks any sustained work with GPT-5.5. Sessions fail within 1–4 tool-call turns. The agent goes silent with no user-visible error (silent failure).
- Frequency: 9/9 codex threads timed out in one day of testing. 52 thread bindings total, 9 timed out = ~17% failure rate per thread binding, but since failures tend to cluster after a few successful turns, the per-session failure rate is effectively 100%.
- Consequence: GPT-5.5 is unusable for any multi-turn or tool-call-heavy work. Users must switch to non-codex models as a workaround. Auth profile is falsely marked as failed after each timeout, potentially cascading into failover routing issues.
Additional information
Last known good version: 2026.5.12 (bundled @openai/codex 0.130.0) — had codex overhead but no turn drops.
First known bad version: 2026.5.25-beta.1 — also present on 2026.5.26 (main).
Hypothesis — SSE stream leak:
The codex plugin opens SSE connections to the OpenAI Responses API for each turn. Evidence suggests these streams are not properly closed after turn completion: TCP established connections climb during turns (3→7), event loop utilization hits 100%, and accumulated pending I/O handlers prevent new SSE events from being processed. The item/completed notification gets through (last thing processed before saturation) but response.completed never arrives. After the 60s idle timeout fires and cleanup runs, all metrics return to baseline — confirming the cleanup works but is triggered too late.
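If the hypothesis holds, one mitigation pattern is to tie every stream opened during a turn to that turn and close whatever is still open when the turn ends, whether it completes, fails, or times out. The following is a minimal sketch of that bookkeeping, not the codex plugin's actual code. `TurnStreamTracker` and its methods are hypothetical, and the close callback stands in for whatever really tears a stream down (for example aborting the request or cancelling its reader).

```typescript
type CloseFn = () => void;

// Hypothetical registry of streams that are still open, grouped by turn.
class TurnStreamTracker {
  private open = new Map<string, Set<CloseFn>>();

  // Register a stream's close callback under its turn.
  // Returns a release function to call when the stream ends normally.
  track(turnId: string, close: CloseFn): () => void {
    let bucket = this.open.get(turnId);
    if (!bucket) {
      bucket = new Set<CloseFn>();
      this.open.set(turnId, bucket);
    }
    const owned = bucket;
    owned.add(close);
    return () => {
      owned.delete(close);
      // Drop the empty bucket so finished turns leave nothing behind.
      if (owned.size === 0 && this.open.get(turnId) === owned) {
        this.open.delete(turnId);
      }
    };
  }

  // Close every stream still open for the turn and return how many were closed.
  endTurn(turnId: string): number {
    const bucket = this.open.get(turnId);
    if (!bucket) return 0;
    this.open.delete(turnId);
    let closed = 0;
    bucket.forEach((close) => {
      try {
        close();
      } catch (_err) {
        // Keep closing the remaining streams even if one close fails.
      }
      closed++;
    });
    return closed;
  }

  // Total number of streams currently tracked across all turns.
  openCount(): number {
    let n = 0;
    this.open.forEach((bucket) => {
      n += bucket.size;
    });
    return n;
  }
}

// Usage: stream "a" ends on its own, stream "b" is still open when the turn ends.
const tracker = new TurnStreamTracker();
const closedLog: string[] = [];
const releaseA = tracker.track("turn-1", () => closedLog.push("a"));
tracker.track("turn-1", () => closedLog.push("b"));
releaseA();
const closedCount = tracker.endTurn("turn-1");
console.log(closedCount, closedLog, tracker.openCount());
```

Calling `endTurn` from the turn's completion path as well as from the idle-timeout path would make the cleanup independent of which one fires first. Logging `openCount()` between turns would also give a direct signal to confirm or refute the leak.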
Secondary bug — auth profile poisoning:
Each timeout marks the auth profile as failed (auth_profile_failure_state_updated, reason: timeout). This is a false positive — auth succeeded, the thread was bound, tool calls executed. A turn timeout is not an auth failure.
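One way to express the intended behavior is to classify failure reasons before touching auth state. This is a minimal sketch under stated assumptions: only the timeout reason appears in this report, while the other reason names and the function `shouldMarkAuthProfileFailed` are hypothetical placeholders, not OpenClaw identifiers.

```typescript
// Hypothetical set of failure reasons; only "timeout" is taken from the report.
type FailureReason = "timeout" | "unauthorized" | "forbidden" | "token_expired";

// Reasons that actually indicate a credential problem (assumed for illustration).
const AUTH_RELATED_REASONS: FailureReason[] = ["unauthorized", "forbidden", "token_expired"];

// A turn timeout says nothing about credentials, so it should not poison the profile.
function shouldMarkAuthProfileFailed(reason: FailureReason): boolean {
  return AUTH_RELATED_REASONS.indexOf(reason) !== -1;
}

console.log(shouldMarkAuthProfileFailed("timeout"));      // false
console.log(shouldMarkAuthProfileFailed("unauthorized")); // true
```

Keeping the allow-list explicit means a newly introduced failure reason would leave the auth profile untouched by default rather than marking it failed.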
Workaround:
Switch to any non-codex model. All non-OpenAI models use the PI runtime, bypass the codex plugin entirely, and work reliably through the same gateway pipeline:
{ "model": { "primary": "opencode-go/deepseek-v4-pro", "fallbacks": ["fireworks-ai/.../kimi-k2p6-turbo", "xai/grok-4.3"] } }
Note: Model-scoped agentRuntime.id: "pi" does NOT work for openai/* models — the codex plugin auto-claims the turn regardless of the policy. This was tested and reverted.
Data files (available on request):
- Gateway log: /tmp/openclaw/openclaw-2026-05-26.log
- Process monitor CSV (5s sample): codex-monitor-20260526-230334.log
- Response time data (32+ message pairs): response-time-data-20260526.md
- 7 session JSONL files with full message/tool/usage data