Bug type
Behavior bug (incorrect output/state without crash)
Beta release blocker
No
Summary
Atlas subagent runs that hit context_overflow / livenessState: "blocked" were reported to the parent as completed successfully with no usable output.
Steps to reproduce
- Start OpenClaw 2026.5.6 with an Atlas subagent using openai-codex/gpt-5.5.
- Spawn an Atlas subagent task large enough to trigger embedded runtime context overflow.
- Observe gateway logs containing [context-overflow-diag].
- Observe that the parent/session completion reports "completed successfully", with no output artifact or useful error surfaced.
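To confirm which subagent session overflowed while reproducing, the diag entry can be pulled apart into fields. The following is a minimal sketch, not OpenClaw code: `parseOverflowDiag` and `OverflowDiag` are hypothetical names, and it assumes the entry is a single line of space-separated `key=value` pairs with `error=` as the last field, as the entry in the evidence section suggests.

```typescript
// Hypothetical triage helper; not part of OpenClaw.
interface OverflowDiag {
  timestamp: string;
  fields: Record<string, string>;
}

const DIAG_TAG = "[context-overflow-diag]";

function parseOverflowDiag(line: string): OverflowDiag | null {
  const tagIndex = line.indexOf(DIAG_TAG);
  if (tagIndex === -1) return null; // not a context-overflow diag line

  // Assumption: the timestamp is the first whitespace-delimited token.
  const timestamp = line.slice(0, tagIndex).trim().split(/\s+/)[0] ?? "";
  let rest = line.slice(tagIndex + DIAG_TAG.length).trim();

  const fields: Record<string, string> = {};

  // Assumption: error= is the final field and its value may contain spaces.
  const errorMatch = /(?:^|\s)error=/.exec(rest);
  if (errorMatch !== null) {
    fields.error = rest.slice(errorMatch.index + errorMatch[0].length).trim();
    rest = rest.slice(0, errorMatch.index);
  }

  // Remaining tokens are plain key=value pairs; split on the first "=".
  for (const token of rest.split(/\s+/)) {
    const eq = token.indexOf("=");
    if (eq > 0) fields[token.slice(0, eq)] = token.slice(eq + 1);
  }
  return { timestamp, fields };
}
```

Comparing `fields.sessionKey` from the parsed entry against the session key in the parent's completion report shows whether a run reported as successful is the same run that overflowed.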
Expected behavior
When the embedded runtime reports livenessState: "blocked" and/or an error payload such as context_overflow, the subagent completion bridge should report the run as blocked/error and surface the error message to the parent.
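The expected mapping could look roughly like the sketch below. This is an illustration under assumptions, not the actual bridge code: `TerminalRunInfo`, `CompletionOutcome`, and `resolveCompletion` are hypothetical names, and the real session entry and lifecycle event shapes in OpenClaw may differ. The point is only that blocked/error metadata is checked before a terminal state is mapped to success.

```typescript
// Hypothetical shapes for illustration; the real OpenClaw types may differ.
interface TerminalRunInfo {
  status: "running" | "done" | "ended";
  livenessState?: string; // e.g. "blocked", as reported by the embedded runtime
  error?: { code?: string; message: string };
}

type CompletionOutcome =
  | { kind: "pending" }
  | { kind: "ok" }
  | { kind: "blocked"; message: string }
  | { kind: "error"; message: string };

function resolveCompletion(run: TerminalRunInfo): CompletionOutcome {
  // Only terminal states produce a completion report.
  if (run.status !== "done" && run.status !== "ended") {
    return { kind: "pending" };
  }

  // Check blocked/error metadata BEFORE treating a terminal state as success.
  if (run.livenessState === "blocked") {
    const message =
      run.error?.message ?? 'Run ended with livenessState "blocked".';
    return { kind: "blocked", message };
  }
  if (run.error !== undefined) {
    return { kind: "error", message: run.error.message };
  }
  return { kind: "ok" };
}
```

With this ordering, a run that ends with `livenessState: "blocked"` and a `context_overflow` error payload resolves to `blocked` with the runtime's error message attached, rather than to `ok`.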
Actual behavior
The embedded runtime logged context overflow, but the subagent completion path reported the run as completed successfully; the parent saw no usable output artifact/error.
OpenClaw version
OpenClaw 2026.5.6 (c97b9f7)
Operating system
Linux 6.17.0-23-generic x64
Install method
npm global
Model
openai-codex/gpt-5.5
Provider / routing chain
OpenClaw → openai-codex OAuth → GPT-5.5 Codex
Additional provider/model setup details
Atlas is configured as a fixed worker agent with agentId: atlas, primary model openai-codex/gpt-5.5, and no fallbacks. Atlas config showed contextTokens: 400000.
Logs, screenshots, and evidence
2026-05-11T22:00:06.776-05:00 [agent/embedded] [context-overflow-diag]
sessionKey=agent:atlas:subagent:d1491dac-bc19-4a6b-a47c-e8ec45d9ab30
provider=openai-codex/gpt-5.5
source=assistantError
messages=22
sessionFile=/home/user/.openclaw/agents/atlas/sessions/7f7e70da-05ac-40f6-b6e2-9394ffb5afdd.jsonl
diagId=ovf-mp21lzw6-ZL-7BQ
compactionAttempts=0
observedTokens=unknown
error=Context overflow: estimated context size exceeds safe threshold during tool loop.
agent:atlas:subagent:d1491dac-bc19-4a6b-a47c-e8ec45d9ab30
agent:atlas:subagent:4b7b1cea...
subagent-registry-CSyDa4Jl.js:
- lifecycle phase "end" maps to success without checking livenessState/error metadata
- resolveCompletionFromSessionEntry() maps done/ended states to ok without blocked/error metadata checks
agent-runner.runtime-Ew7ojxcl.js:
- embedded lifecycle terminal backstop propagates livenessState into lifecycle event data
pi-embedded-CM_pfO4f.js:
- context overflow returns error metadata / blocked terminal state
Impact and severity
Affected: Atlas subagent runs using openai-codex/gpt-5.5 when the embedded runtime reaches context_overflow / livenessState: "blocked".
Severity: High (blocks workflow). The parent session receives "completed successfully" even though the subagent produced no usable output artifact or error.
Frequency: 2/2 observed Atlas incidents with this failure pattern.
Consequence: Failed subagent work appears successful, causing missed artifacts, delayed diagnosis, and risk of proceeding based on a false-success signal.
Additional information
Observed workaround: use lightContext: true on Atlas subagent spawns to reduce bootstrap/context pressure. This is a mitigation, not a fix for the completion bridge behavior.