[Bug]: Subagent blocked terminals reported as “completed successfully” with no output #80879

@saphoroth

Description

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

Atlas subagent runs that hit context_overflow / livenessState: "blocked" were reported to the parent as completed successfully with no usable output.

Steps to reproduce

  1. Start OpenClaw 2026.5.6 with an Atlas subagent using openai-codex/gpt-5.5.
  2. Spawn an Atlas subagent task large enough to trigger embedded runtime context overflow.
  3. Observe gateway logs containing [context-overflow-diag].
  4. Observe that the parent session's completion report says completed successfully, with no output artifact and no useful error surfaced.

Expected behavior

When the embedded runtime reports livenessState: "blocked" and/or an error payload such as context_overflow, the subagent completion bridge should report the run as blocked/error and surface the error message to the parent.
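
The expected mapping can be sketched as follows. This is a minimal illustration, not OpenClaw's actual code: the `RunTerminalInfo` and `CompletionOutcome` types, the `resolveOutcome` function, and every field name other than `livenessState` are hypothetical and chosen only to show the intended priority (blocked/error metadata is checked before the lifecycle phase is mapped to success).

```typescript
// Hypothetical shapes for illustration only; not OpenClaw's real types.
interface RunTerminalInfo {
  phase: "end" | "error";
  livenessState?: string; // e.g. "blocked", as reported by the embedded runtime
  error?: { kind?: string; message: string };
}

type CompletionOutcome =
  | { status: "ok" }
  | { status: "blocked" | "error"; message: string };

// Hypothetical resolver: blocked/error metadata wins over the lifecycle phase,
// so a run whose phase is "end" is not automatically treated as a success.
function resolveOutcome(info: RunTerminalInfo): CompletionOutcome {
  if (info.livenessState === "blocked") {
    return {
      status: "blocked",
      message: info.error?.message ?? "Run ended in a blocked state without an error message.",
    };
  }
  if (info.error || info.phase === "error") {
    return {
      status: "error",
      message: info.error?.message ?? "Run ended with an unspecified error.",
    };
  }
  return { status: "ok" };
}

// A run that ended with phase "end" but carries blocked/overflow metadata.
const outcome = resolveOutcome({
  phase: "end",
  livenessState: "blocked",
  error: { kind: "context_overflow", message: "Context overflow: estimated context size exceeds safe threshold during tool loop." },
});
console.log(outcome.status); // "blocked"
```

With a mapping of this shape, the parent would receive the runtime's error message instead of a bare success signal.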

Actual behavior

The embedded runtime logged context overflow, but the subagent completion path reported the run as completed successfully; the parent saw no usable output artifact/error.

OpenClaw version

OpenClaw 2026.5.6 (c97b9f7)

Operating system

Linux 6.17.0-23-generic x64

Install method

npm global

Model

openai-codex/gpt-5.5

Provider / routing chain

OpenClaw → openai-codex OAuth → GPT-5.5 Codex

Additional provider/model setup details

Atlas is configured as a fixed worker agent with agentId: atlas, primary model openai-codex/gpt-5.5, and no fallbacks. Atlas config showed contextTokens: 400000.

Logs, screenshots, and evidence

2026-05-11T22:00:06.776-05:00 [agent/embedded] [context-overflow-diag]
sessionKey=agent:atlas:subagent:d1491dac-bc19-4a6b-a47c-e8ec45d9ab30
provider=openai-codex/gpt-5.5
source=assistantError
messages=22
sessionFile=/home/user/.openclaw/agents/atlas/sessions/7f7e70da-05ac-40f6-b6e2-9394ffb5afdd.jsonl
diagId=ovf-mp21lzw6-ZL-7BQ
compactionAttempts=0
observedTokens=unknown
error=Context overflow: estimated context size exceeds safe threshold during tool loop.
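
For triage, the diagnostic fields above can be pulled into a lookup table. This is a rough sketch that assumes one `key=value` field per line, as rendered here; the raw gateway log may lay the fields out differently. `parseDiagFields` is a hypothetical helper, not part of OpenClaw.

```typescript
// Hypothetical helper: collects key=value lines into a record.
// The value is everything after the first "=", so values containing
// spaces or further "=" characters (such as error=...) are kept whole.
function parseDiagFields(lines: string[]): Record<string, string> {
  const fields: Record<string, string> = {};
  for (const line of lines) {
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // skips the timestamp/tag header line and blank lines
    fields[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return fields;
}

const diag = parseDiagFields([
  "2026-05-11T22:00:06.776-05:00 [agent/embedded] [context-overflow-diag]",
  "provider=openai-codex/gpt-5.5",
  "source=assistantError",
  "compactionAttempts=0",
  "observedTokens=unknown",
  "error=Context overflow: estimated context size exceeds safe threshold during tool loop.",
]);
console.log(diag.compactionAttempts, diag.error);
```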

agent:atlas:subagent:d1491dac-bc19-4a6b-a47c-e8ec45d9ab30
agent:atlas:subagent:4b7b1cea...

subagent-registry-CSyDa4Jl.js:
- lifecycle phase "end" maps to success without checking livenessState/error metadata
- resolveCompletionFromSessionEntry() maps done/ended states to ok without blocked/error metadata checks

agent-runner.runtime-Ew7ojxcl.js:
- embedded lifecycle terminal backstop propagates livenessState into lifecycle event data

pi-embedded-CM_pfO4f.js:
- context overflow returns error metadata / blocked terminal state

Impact and severity

Affected: Atlas subagent runs using openai-codex/gpt-5.5 when the embedded runtime reaches context_overflow / livenessState: "blocked".

Severity: High. This blocks the workflow because the parent session receives completed successfully even though the subagent produced no usable output artifact or error.

Frequency: 2/2 observed Atlas incidents with this failure pattern.

Consequence: Failed subagent work appears successful, causing missed artifacts, delayed diagnosis, and risk of proceeding based on a false-success signal.

Additional information

Observed workaround: use lightContext: true on Atlas subagent spawns to reduce bootstrap/context pressure. This is a mitigation, not a fix for the completion bridge behavior.

Labels

bug (Something isn't working), bug:behavior (Incorrect behavior without a crash)
