Bug type
Regression (worked before, now fails)
Beta release blocker
No
Summary
Deferred subagent completion delivery is unreliable across retry/reconnect boundaries, and parent-facing final delivery
can be delayed or missed even when the child run completes successfully.
Steps to reproduce
- Run OpenClaw 2026.4.9.
- Start a parent session and launch a longer-running subagent / ACP child run.
- While the child is still running, force a reconnect-sensitive path, such as a gateway restart or a delivery
interruption/disconnect, before final parent delivery.
- Wait for the child run to finish successfully.
- Observe that completion is recorded, but parent-facing final delivery is not reliably emitted automatically or in a
timely manner in the affected reconnect-gap path.
Expected behavior
When a child/subagent run finishes successfully, deferred final delivery state should remain durable across
retry/restart/reconnect paths and the parent session should receive the final completion delivery once the system is
able to retry it.
Actual behavior
In reconnect-gap / deferred-delivery paths, the child run is observed to complete successfully, but parent-facing final
delivery remains unreliable. In local debugging, a failed retry path could also overwrite the durable pending-delivery
context with live run fields, weakening later retry/cleanup behavior.
OpenClaw version
2026.4.9
Operating system
Linux 6.6.114.1-microsoft-standard-WSL2 (x64)
Install method
npm global (live gateway), with source checkout used for patch/test verification
Model
Multiple ACP subagent runs observed; not isolated to a single model
Provider / routing chain
OpenClaw gateway -> ACP subagent session -> parent completion delivery
Additional provider/model setup details
The observed failure appears in subagent completion delivery / follow-up lifecycle behavior rather than a
provider-specific model output path. Reproductions involved ACP child runs and parent completion delivery timing across
reconnect-sensitive conditions.
Logs, screenshots, and evidence
Observed on latest local install after cutover to 2026.4.9.
Grounded observations:
- short smoke passed
- long no-restart smoke passed
- reconnect-gap path remained unreliable
- child/subagent work completed successfully
- parent-facing final delivery did not arrive cleanly or in a timely manner in the affected path
Local debugging also found a concrete durability bug:
- retry-state writes could overwrite existing durable pending-delivery payload with transient live fields
Local hardening patch added:
- payload preservation during retry-state writes
- persistence/restart coverage
- targeted tests for cleanup/persistence behavior
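The durability bug and the payload-preservation fix listed above can be illustrated with a minimal sketch. Every name here (PendingDelivery, LiveRunSnapshot, recordRetryOverwriting, recordRetryPreserving) is hypothetical and chosen for illustration; none is taken from the OpenClaw source, and the real record shape may differ.

```typescript
// All names below are hypothetical; they are not taken from the OpenClaw source.
interface PendingDelivery {
  childRunId: string;
  payload: string; // durable final result owed to the parent
  attempts: number; // retry bookkeeping
  lastError: string | null;
}

interface LiveRunSnapshot {
  childRunId: string;
  payload: string; // transient; may be empty or partial while a retry is failing
}

// Overwrite-prone write: spreading the live snapshot replaces the durable payload.
function recordRetryOverwriting(
  pending: PendingDelivery,
  live: LiveRunSnapshot,
  error: string,
): PendingDelivery {
  return { ...pending, ...live, attempts: pending.attempts + 1, lastError: error };
}

// Preserving write: only retry bookkeeping changes; the payload is left untouched.
function recordRetryPreserving(
  pending: PendingDelivery,
  error: string,
): PendingDelivery {
  return { ...pending, attempts: pending.attempts + 1, lastError: error };
}
```

In this sketch the preserving variant never accepts live run fields at all, so a failed retry cannot weaken the record that later retries and cleanup depend on.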
Impact and severity
Affected: users relying on parent-visible subagent/ACP completion delivery across reconnect/retry boundaries
Severity: High for affected flows, because final results can be delayed or effectively missed from the parent/user
perspective
Frequency: Intermittent, specifically observed in reconnect-gap / deferred-delivery paths, not in every completion path
Consequence: parent sessions may not receive reliable final completion delivery even though child work completed
successfully
Additional information
This report is about observed reliability behavior, not a claim of a complete root-cause fix.
A local hardening direction appears promising:
- preserve deferred completion payload durably across retry-state writes
- keep final delivery as a parent-owned retry obligation rather than relying on a transient handoff moment
However, current evidence does not yet justify claiming a full end-to-end fix for all reconnect-gap scenarios.
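The "parent-owned retry obligation" direction can be sketched as a small ledger that records the completion first and treats delivery as something to re-attempt after every reconnect. This is an illustrative sketch under assumptions, not the local patch: ParentDeliveryLedger, recordCompletion, flush, and the Send callback are hypothetical names, and a plain in-memory object stands in for whatever durable store the gateway actually uses.

```typescript
// Hypothetical sketch; names and shapes are illustrative, not OpenClaw APIs.
type Send = (childRunId: string, payload: string) => void; // throws on failure

interface Obligation {
  payload: string;
  attempts: number;
}

class ParentDeliveryLedger {
  // A plain object stands in for a durable store (file, database) in this sketch.
  private pending: Record<string, Obligation> = {};

  // Called when the child finishes: the obligation exists before any send.
  // An existing entry is never overwritten, so its payload stays intact.
  recordCompletion(childRunId: string, payload: string): void {
    if (!(childRunId in this.pending)) {
      this.pending[childRunId] = { payload, attempts: 0 };
    }
  }

  // Called on completion and again after every reconnect or restart.
  // Returns the child run ids delivered in this pass.
  flush(send: Send): string[] {
    const delivered: string[] = [];
    for (const childRunId of Object.keys(this.pending)) {
      const entry = this.pending[childRunId];
      try {
        send(childRunId, entry.payload);
        delete this.pending[childRunId]; // obligation met only after a successful send
        delivered.push(childRunId);
      } catch (_err) {
        entry.attempts += 1; // keep the entry and its payload for the next pass
      }
    }
    return delivered;
  }

  pendingCount(): number {
    return Object.keys(this.pending).length;
  }
}
```

Because an entry is removed only after a successful send, this shape gives at-least-once delivery: a crash between send and removal would cause a repeat, so the parent side would need to tolerate duplicates. Whether that trade-off suits OpenClaw is not established by the evidence in this report.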