Bug type
Behavior bug (incorrect output/state without crash)
Beta release blocker
No
Summary
The session status field uses values like "failed", "timeout", and "done" that only reflect the last communication turn's state, but these names strongly imply the session or task has permanently ended. This causes orchestrator agents to spawn duplicate sessions instead of resuming existing ones via sessions_send.
Steps to reproduce
- Configure an orchestrator agent (main) with sub-agents.
- Spawn a sub-agent session via sessions_spawn for a long-running task.
- The sub-agent session encounters an error or timeout on its last API call.
- Observe that the session's status field becomes "failed" or "timeout".
- The orchestrator agent sees this status and incorrectly concludes the session is dead.
- The orchestrator spawns a NEW session for the same task instead of using sessions_send to resume the existing one.
- In observed incidents, the same task had 2-4 concurrent sessions, all burning tokens on paid models.
Expected behavior
Session status values should clearly communicate they are communication/turn states, not session lifecycle states. A session with status: "failed" or status: "timeout" should still be resumable via sessions_send, and the field naming or documentation should make this obvious to both human operators and AI agents.
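To illustrate the expected behavior, here is a minimal sketch of a resume-first decision in an orchestrator. The `SessionInfo` shape, the `task` field, the `Action` type, and `chooseAction` are all hypothetical and are not OpenClaw's actual API. Only the status strings and the idea of preferring sessions_send over sessions_spawn come from this report.

```typescript
// Hypothetical session shape; the real OpenClaw session object may differ.
interface SessionInfo {
  id: string;
  task: string;
  status: string; // per this report, reflects the last turn only
}

// Hypothetical description of what the orchestrator should do next.
type Action =
  | { kind: "send"; sessionId: string }
  | { kind: "spawn"; task: string };

// Statuses this report says do NOT mean the session is unusable.
const RESUMABLE_AFTER = new Set<string>(["failed", "timeout", "done"]);

// Prefer resuming an existing session for the task; spawn only if none qualifies.
function chooseAction(task: string, sessions: SessionInfo[]): Action {
  const candidate = sessions.find(
    (s) => s.task === task && RESUMABLE_AFTER.has(s.status)
  );
  return candidate
    ? { kind: "send", sessionId: candidate.id }
    : { kind: "spawn", task };
}
```

With this logic, a session whose last turn ended in "failed" produces a send action rather than a second spawn. How other statuses should be treated is left open here.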
Actual behavior
Orchestrator agents see "failed" or "timeout" and spawn duplicate sessions for the same task. In one observed incident, a single task had 4 concurrent sessions (3 unnecessary), wasting paid-model tokens. The original session's accumulated context (file reads, analysis, partial work) is lost when a new session starts from scratch. The cascade can compound: each new session may also "fail", triggering yet another spawn.
OpenClaw version
current (2026.4)
Operating system
Ubuntu 22.04
Install method
npm global
Model
various (openrouter/z-ai/glm-5.1, dashscope/qwen3.6-plus, dashscope/deepseek-v3.2)
Provider / routing chain
openclaw -> openrouter -> various models
Additional provider/model setup details
No response
Logs, screenshots, and evidence
Impact and severity
Affected: All users running orchestrator/agentic agents that manage sub-agent sessions
Severity: High (causes token waste on paid models, session sprawl, context loss)
Frequency: Always when a sub-agent session's last turn results in an error or timeout
Consequence: Duplicate sessions burning 2-4x the necessary tokens; lost context from abandoned sessions; session management complexity that compounds over time
Additional information
Proposed Solutions
### Option A: Rename status values (breaking change)
Rename to clearly communicate they are communication states, not task states:
- "failed" → "interrupted"
- "timeout" → "waiting_for_response"
- "done" → "agent_responded"
- "killed" → "halted"
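Because Option A is a breaking change, consumers would likely need a translation layer during migration. The sketch below encodes the mapping listed above. The constant and helper names (`RENAMED_STATUS`, `isLegacyStatus`, `toRenamedStatus`) are hypothetical and only illustrate one way a client could accept both vocabularies.

```typescript
// Mapping copied from the Option A list; the names here are illustrative only.
const RENAMED_STATUS = {
  failed: "interrupted",
  timeout: "waiting_for_response",
  done: "agent_responded",
  killed: "halted",
} as const;

type LegacyStatus = keyof typeof RENAMED_STATUS;

// Type guard: true only for the four legacy values listed above.
function isLegacyStatus(value: string): value is LegacyStatus {
  return Object.prototype.hasOwnProperty.call(RENAMED_STATUS, value);
}

// Translate a legacy value; pass through anything already renamed or unknown.
function toRenamedStatus(value: string): string {
  return isLegacyStatus(value) ? RENAMED_STATUS[value] : value;
}
```

Passing unknown values through unchanged keeps the helper safe to apply to responses from both old and new versions.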
### Option B: Add a separate field (non-breaking)
Keep status for backward compatibility but add:
- resumable: true/false (whether the session can still receive sessions_send)
- taskStatus: "in_progress" | "completed" | "unknown" (the actual task state, only set by explicit declaration)
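As a rough sketch of how Option B might look to a client, the types below add the two proposed fields next to the existing status. The `SessionRecord` interface and the `shouldResume` helper are hypothetical; only the field names and the taskStatus values come from the proposal above.

```typescript
// Values proposed above for the task-level state.
type TaskStatus = "in_progress" | "completed" | "unknown";

// Hypothetical record shape combining the existing and proposed fields.
interface SessionRecord {
  status: string;         // existing field, unchanged: last-turn state
  resumable: boolean;     // proposed: can the session still receive sessions_send?
  taskStatus: TaskStatus; // proposed: set only by explicit declaration
}

// Resume when the session can accept messages and the task is not declared complete.
// The last-turn status is deliberately ignored.
function shouldResume(session: SessionRecord): boolean {
  return session.resumable && session.taskStatus !== "completed";
}
```

An orchestrator using a check like this would base its decision on resumable and taskStatus, so a "failed" last turn alone would no longer trigger a new spawn.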
### Option C: Add explicit documentation in API response
Include a resumable: true field and a note field on every session:
{
  "status": "failed",
  "resumable": true,
  "note": "status reflects last turn state only; session is alive and can receive sessions_send"
}
### Option D: Change the default agent system prompt
Add explicit guidance in the default system prompt:
> "A session's `status` field only indicates the last turn's communication state. `failed`, `timeout`, and `done` do NOT mean the session is unusable. Consider using `sessions_send` to resume an existing session before considering a new `sessions_spawn`."
### Recommendation
Option A is the cleanest long-term fix