Fix subagent DM completion delivery after yield#88182
Conversation
|
Codex review: needs maintainer review before merge. Reviewed May 29, 2026, 9:22 PM ET / 01:22 UTC. Summary PR surface: Source +27, Tests +56. Total +83 across 2 files. Reproducibility: yes. from source and linked issue evidence, but I did not run a live reproduction in this read-only review. Current main still treats direct-message subagent completions as automatic final delivery, while the linked report shows that path can lose the visible completion after yield. Review metrics: 1 noteworthy metric.
Merge readiness Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch. Rank-up moves:
Mantis proof suggestion Risk before merge
Maintainer options:
Next step before merge
Security Review detailsBest possible solution: Land the narrow fix if maintainers agree that direct-message subagent completions after yield should use message-tool delivery and child-result fallback rather than automatic parent final delivery. Do we have a high-confidence way to reproduce the issue? Yes from source and linked issue evidence, but I did not run a live reproduction in this read-only review. Current main still treats direct-message subagent completions as automatic final delivery, while the linked report shows that path can lose the visible completion after yield. Is this the best way to solve the issue? Unclear until maintainer review accepts the contract change. The patch is narrow and reuses the existing message-tool-only delivery machinery, but it deliberately changes user-visible DM completion semantics. AGENTS.md: found and applied where relevant. Codex review notes: model gpt-5.5, reasoning high; reviewed against 6f3f4f74207a. Label changesLabel changes:
Label justifications:
Evidence reviewedPR surface: Source +27, Tests +56. Total +83 across 2 files. View PR surface stats
Acceptance criteria:
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
|
(cherry picked from commit 00d87c7)
(cherry picked from commit 00d87c7)
(cherry picked from commit 00d87c7)
(cherry picked from commit 00d87c7)
Summary
NO_REOUTPUTshape and the new message-tool-only direct-DM contract.Fixes #88042.
Verification
.agents/skills/autoreview/scripts/autoreview --mode localautoreview clean: no accepted/actionable findings reported.node scripts/run-vitest.mjs src/agents/subagent-announce-delivery.test.ts2 passed,178 passed.node scripts/run-vitest.mjs src/agents/subagent-announce-delivery.test.ts -t "directly delivers direct-message subagent text when the announce agent omits the result|requires message-tool delivery for direct subagent completions"2 passed,4 passed,174 skipped.git diff --checkpnpm buildReal behavior proof
Behavior addressed: direct-message subagent completion after
sessions_yieldremains externally visible when the parent does not re-output the child result.Real environment tested: local live Telegram DM using
openai/gpt-5.5and the patched checkout. AWS/Azure Crabbox live proof was attempted but blocked by coordinatorcost_limit_exceeded; Blacksmith Testbox refused secret env forwarding.Exact steps or command run after this patch: started a temporary live Telegram/OpenAI harness from the patched checkout, sent a DM prompt that spawned a child running
sleep 90; echo TG88042_CHILD_9855743C, then forced the parent to reply onlyTG88042_NO_REOUTPUT_9855743Cif it had to produce a final answer.Evidence after fix: harness summary reported
fixed: true, child taskcompleted, task progress summaryTG88042_CHILD_9855743C, parentTG88042_NO_REOUTPUT_9855743Cgenerated internally, and Telegram outbound send succeeded after child completion.Observed result after fix: human Telegram DM view showed only
TG88042_CHILD_9855743C; it did not showTG88042_NO_REOUTPUT_9855743C.What was not tested: completed remote Crabbox live proof, due coordinator budget limit; non-DM channel surfaces beyond the focused regression suite.