Bug: embedded_run response delivery deadlock in codex-app-server path
Problem Summary
Mikhail (agent ID: mikhail, runtime: acp with codex backend in acpx mode) consistently fails to deliver responses to Discord and Telegram. The model generates output successfully (confirmed by stopReason: "stop" and token usage counts), but final response assembly and channel delivery never occur. Sessions time out at ~616 seconds with status: "interrupted" or status: "timeout".
Symptoms
- Model runs fine - gpt-5.5 via OpenAI Responses API generates complete output with stopReason: "stop" and valid usage counts
- Delivery never happens - No message appears in Discord/Telegram despite model completion
- Session times out - status: "timeout" or status: "interrupted" after ~616 seconds
- Stalled detection fires - OpenClaw diagnostic logs show stalled_agent_run classification with terminalProgressStale=true recovery=none
Diagnostic Evidence
Discord session:
- lastProgress=codex_app_server:notification:thread/tokenUsage/updated (tokens counted, model running)
- Later: lastProgress=codex_app_server:notification:rawResponseItem/completed (model finished)
- activeWorkKind=embedded_run
- recovery=none (OpenClaw reports that it cannot self-heal)
Telegram session:
- lastProgress=codex_app_server:notification:rawResponseItem/completed (model finished, response ready)
- terminalProgressStale=true recovery=none
- Same embedded_run path, same deadlock
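To triage other sessions for the same signature, the fields above can be checked mechanically. The following is a minimal sketch, assuming the diagnostic output is a whitespace-separated list of key=value pairs; the helper names parseDiagnosticFields and isUndeliveredCompletion are hypothetical, and the sample line is assembled from the fields quoted above rather than copied verbatim from a log.

```typescript
// Hypothetical helpers; these are not part of OpenClaw or Codex.
// Assumption: diagnostic fields appear as whitespace-separated key=value pairs.
type DiagnosticFields = Record<string, string>;

function parseDiagnosticFields(line: string): DiagnosticFields {
  const fields: DiagnosticFields = {};
  for (const token of line.trim().split(/\s+/)) {
    const eq = token.indexOf("=");
    // Skip tokens without a key (no "=" or a leading "=").
    if (eq > 0) {
      fields[token.slice(0, eq)] = token.slice(eq + 1);
    }
  }
  return fields;
}

// True when the model finished but the run is stale with no recovery path.
function isUndeliveredCompletion(fields: DiagnosticFields): boolean {
  return (
    (fields["lastProgress"] ?? "").endsWith("rawResponseItem/completed") &&
    fields["terminalProgressStale"] === "true" &&
    fields["recovery"] === "none"
  );
}

// Sample assembled from the fields reported in this issue (layout assumed).
const sample =
  "lastProgress=codex_app_server:notification:rawResponseItem/completed " +
  "activeWorkKind=embedded_run terminalProgressStale=true recovery=none";

console.log(isUndeliveredCompletion(parseDiagnosticFields(sample)));
```

A line whose lastProgress is still thread/tokenUsage/updated would not match, which separates "model still running" from "model finished, response never delivered".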
Configuration
Mikhail is correctly configured:
- Discord binding: routes channel=discord, accountId=mikhail to agent mikhail ✓
- Discord token: present in secrets via SecretRef ✓
- ACP runtime: backend: "acpx", mode: "persistent" ✓
- Plugin codex: enabled: true, appServer.mode: "yolo", appServer.transport: "stdio" ✓
- Gateway: running on loopback port 18789, Read probe: ok ✓
Root Cause Hypothesis
The Codex app server (codex app-server) successfully processes the request, the model generates a complete response, and rawResponseItem/completed fires correctly. However, the embedded_run handler in OpenClaw appears never to receive or process the completed response event, so the delivery pipeline from Codex → OpenClaw → Discord/Telegram stalls before the final delivery step.
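The gap described in this hypothesis (completion observed, delivery never recorded) can be modeled with a small watchdog. This is only an illustrative sketch: DeliveryWatchdog and its methods are hypothetical names, not OpenClaw or Codex APIs, and the 30-second deadline is taken from the expected behavior in this report rather than from any real setting.

```typescript
// Hypothetical watchdog illustrating the suspected gap; not an OpenClaw API.
type DeliveryState = "idle" | "pending" | "delivered" | "stuck";

class DeliveryWatchdog {
  private readonly deadlineMs: number;
  private completedAtMs: number | null = null;
  private delivered = false;

  constructor(deadlineMs: number) {
    this.deadlineMs = deadlineMs;
  }

  // Record an app-server notification; only the first completion event is tracked.
  onNotification(method: string, atMs: number): void {
    if (method === "rawResponseItem/completed" && this.completedAtMs === null) {
      this.completedAtMs = atMs;
    }
  }

  // Record that the channel (Discord/Telegram) accepted the message.
  onDelivered(): void {
    this.delivered = true;
  }

  // "stuck" means the model finished but nothing was delivered within the deadline.
  check(nowMs: number): DeliveryState {
    if (this.delivered) return "delivered";
    if (this.completedAtMs === null) return "idle";
    return nowMs - this.completedAtMs > this.deadlineMs ? "stuck" : "pending";
  }
}

const watchdog = new DeliveryWatchdog(30_000);
watchdog.onNotification("thread/tokenUsage/updated", 0);
watchdog.onNotification("rawResponseItem/completed", 5_000);
console.log(watchdog.check(10_000));  // "pending"
console.log(watchdog.check(616_000)); // "stuck"
```

Under this model, the sessions in this report would sit in the "stuck" state from shortly after rawResponseItem/completed until the ~616 second timeout, because onDelivered is never reached.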
Environment
- Platform: macOS Darwin 25.5.0 (x64)
- Node: v22.22.0 (OpenClaw managed)
- OpenClaw: 2026.5.12
- Gateway: running via LaunchAgent
- Codex app server: running as separate process
Expected Behavior
A simple question such as "@mikhail tell me your model and model version" should complete and be delivered in under 30 seconds, not time out after roughly 10 minutes with a completed model response sitting undelivered.