fix: Subagent completion direct announce often fails with no visible reply #82804
galiniliev wants to merge 3 commits into
Codex review: needs real behavior proof before merge.
Summary
Reproducibility: unclear. The review failed before ClawSweeper could establish a reproduction path.
Real behavior proof
Next step before merge
Review details
Best possible solution: Retry the Codex review after fixing the execution failure.
Do we have a high-confidence way to reproduce the issue? Unclear. The review failed before ClawSweeper could establish a reproduction path.
Is this the best way to solve the issue? Unclear. Retry the review first so ClawSweeper can evaluate the actual issue and fix direction.
What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against 9e67f53b913a.
72c68a2 to 97ee119
Summary
- Completed subagent direct announces could fail with `completion agent did not produce a visible reply` after the requester wake path hit a stale session id (`queue_message_failed reason=no_active_run`).
- When the requester wake fails with `no_active_run` and the automatic completion-agent handoff returns no visible payload, OpenClaw retries the requester-agent handoff once with `sourceReplyDeliveryMode: "message_tool_only"` and `deliver: false`.
Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
Real behavior proof (required for external PRs)
- Behavior addressed: Completed subagent direct announces no longer dead-end at the reported `no_active_run` plus no-visible-payload path. After the stale requester wake fails and the automatic direct handoff has no visible output, the runtime now retries a mediated requester-agent handoff that requires message-tool delivery instead of raw-sending child output.
- Real environment tested: Windows local Codex worktree based on `origin/main` `c2e9091`, Node v24.15.0. Dependencies were only partially installed because `pnpm install` hit an `esbuild` postinstall spawn EPERM; the direct Vitest entry was available and was used for the focused delivery seam.
- Exact steps or command run after this patch:
  `$env:OPENCLAW_VITEST_MAX_WORKERS='1'; node node_modules\vitest\vitest.mjs run src/agents/subagent-announce-delivery.test.ts --reporter=dot`
- Evidence after fix: Copied terminal capture from the post-review focused delivery run:
  The updated assertions simulate `queueEmbeddedPiMessageWithOutcome` returning `reason: "no_active_run"`, then an automatic direct completion handoff with empty `payloads`. The runtime performs a second requester-agent handoff with `deliver: false`, `sourceReplyDeliveryMode: "message_tool_only"`, and a `:message-tool` idempotency key; the test only marks delivery successful when the second handoff reports committed message-tool evidence. The fallback `sendMessage` mock is not called.
- Observed result after fix: Single completed subagent thread/channel cases with stale requester runs now complete through a mediated message-tool-only retry when the first direct handoff is empty. If the retry still lacks message-tool evidence, delivery remains failed and queued for the existing retry/give-up machinery instead of raw-sending child output.
- What was not tested: No live gateway/provider/channel rerun was performed. The after-fix proof is local delivery-seam execution, not a private live session replay.
- Before evidence: Raw runtime log excerpt from the affected gateway trace that this patch addresses:
Root Cause (if applicable)
- `sendSubagentAnnounceDirectly` detected the empty automatic completion-agent handoff but did not distinguish the reported stale requester wake (`no_active_run`) from ordinary no-visible output, so the path could fail without trying the stricter message-tool-only mediated handoff.
- `no_active_run` followed by an empty automatic direct handoff.
Regression Test Plan (if applicable)
- `src/agents/subagent-announce-delivery.test.ts`
- The requester wake returns `no_active_run`, the automatic requester-agent completion handoff returns empty payloads, and the runtime retries the same mediated handoff with `sourceReplyDeliveryMode: "message_tool_only"` without raw-sending child completion text.
User-visible / Behavior Changes
Completed subagent announcements that previously dead-ended after a stale requester wake and empty automatic handoff can now be retried through a message-tool-only requester-agent handoff, producing a visible update when the requester agent sends through the message tool.
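The retry behavior described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `announceCompletion`, `WakeOutcome`, `HandoffParams`, `HandoffResult`, and the `messageToolCommitted` flag are hypothetical names, and the sketch is synchronous for brevity. Only `no_active_run`, `deliver: false`, `sourceReplyDeliveryMode: "message_tool_only"`, the empty `payloads`, and the `:message-tool` idempotency key suffix come from the description.

```typescript
// Hypothetical types for illustration only; not the actual OpenClaw API.
type WakeOutcome = { queued: true } | { queued: false; reason: string };

interface HandoffParams {
  idempotencyKey: string;
  deliver?: boolean;
  sourceReplyDeliveryMode?: "message_tool_only";
}

interface HandoffResult {
  payloads: string[];
  // Assumed flag: true only when the requester agent committed a message-tool send.
  messageToolCommitted?: boolean;
}

type Handoff = (params: HandoffParams) => HandoffResult;

type AnnounceOutcome =
  | { delivered: true; path: "wake" | "direct" | "message_tool_retry" }
  | { delivered: false; reason: string };

function announceCompletion(
  wake: WakeOutcome,
  handoff: Handoff,
  baseKey: string,
): AnnounceOutcome {
  if (wake.queued) return { delivered: true, path: "wake" };

  // First attempt: the automatic direct completion handoff.
  const first = handoff({ idempotencyKey: baseKey });
  if (first.payloads.length > 0) return { delivered: true, path: "direct" };

  // Only the stale requester wake gets the stricter retry; other empty
  // handoffs stay failed, as before.
  if (wake.reason !== "no_active_run") {
    return { delivered: false, reason: "no_visible_reply" };
  }

  // Single retry: mediated handoff that must deliver through the message tool.
  const retry = handoff({
    idempotencyKey: `${baseKey}:message-tool`,
    deliver: false,
    sourceReplyDeliveryMode: "message_tool_only",
  });

  // Success requires committed message-tool evidence; child output is never raw-sent.
  return retry.messageToolCommitted === true
    ? { delivered: true, path: "message_tool_retry" }
    : { delivered: false, reason: "no_message_tool_evidence" };
}

// Usage: a stub handoff that is empty on the first call and reports
// message-tool evidence on the second.
const calls: HandoffParams[] = [];
const stubHandoff: Handoff = (params) => {
  calls.push(params);
  return calls.length === 1
    ? { payloads: [] }
    : { payloads: [], messageToolCommitted: true };
};

const outcome = announceCompletion(
  { queued: false, reason: "no_active_run" },
  stubHandoff,
  "announce-1",
);
```

In this sketch a failed retry returns a failure result rather than falling back to a raw send, which mirrors the stated intent that delivery stays failed and is left to the existing retry/give-up machinery.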
Diagram (if applicable)
Security Impact (required)
Yes, explain risk + mitigation: N/A
Repro + Verification
Environment
- `main` `c2e9091` before fix.
Steps
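1. Simulate the requester wake (`queueEmbeddedPiMessageWithOutcome`) returning `reason: "no_active_run"`.
2. Have the automatic direct completion handoff return `{ result: { payloads: [] } }`.
3. Check that the runtime performs a second requester-agent handoff with `deliver: false` and `sourceReplyDeliveryMode: "message_tool_only"`.
4. Check that the `sendMessage` fallback mock is not called.
Expected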
Actual
Evidence
Human Verification (required)
What you personally verified (not just CI), and how:
Review Conversations
Compatibility / Migration
Risks and Mitigations