fix(agents): suppress raw subagent tool output #80110
Based on #80049. Co-authored-by: Blasius Patrick <blasius.patrick@gmail.com>
Codex review: needs maintainer review before merge.
Summary
Reproducibility: yes. Current main source and tests show the selector can return raw tool/toolResult text when no assistant text exists, and the PR body includes a pre-fix Testbox probe demonstrating the leak.
Real behavior proof
Next step before merge
Security
Review details
Best possible solution: Land this focused selector/docs/test fix after normal maintainer review and exact-head checks, while tracking any remaining duplicate-progress behavior separately if it still reproduces.
Do we have a high-confidence way to reproduce the issue? Yes. Current main source and tests show the selector can return raw tool/toolResult text when no assistant text exists, and the PR body includes a pre-fix Testbox probe demonstrating the leak.
Is this the best way to solve the issue? Yes. Returning undefined only after assistant, silent, and timeout partial-progress paths are exhausted is the narrow maintainable fix, and it preserves post-compaction assistant fallback plus bounded
What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against 6d89bf65e08a.
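The fallback order the review describes (assistant, then silent, then timeout partial progress, then undefined) can be sketched roughly as follows. This is a minimal illustration, not the code in src/agents/subagent-announce-output.ts: the function name, the timedOut parameter, and every snapshot field except latestRawText are hypothetical names chosen for this example.

```typescript
// Hypothetical snapshot shape; only `latestRawText` is named in this PR.
interface OutputSnapshotSketch {
  assistantText?: string; // latest assistant-authored text (assumed field)
  silentText?: string; // text from the silent path (assumed field)
  timeoutProgressText?: string; // bounded partial progress on timeout (assumed field)
  latestRawText?: string; // raw tool/toolResult text; never user-facing
}

// Treat blank or whitespace-only text as absent.
function nonEmpty(text: string | undefined): string | undefined {
  const trimmed = text?.trim();
  return trimmed ? trimmed : undefined;
}

// Sketch of the selection order. Raw tool text is deliberately never consulted.
function selectOutputTextSketch(
  snapshot: OutputSnapshotSketch,
  timedOut: boolean,
): string | undefined {
  const assistant = nonEmpty(snapshot.assistantText);
  if (assistant) return assistant;

  const silent = nonEmpty(snapshot.silentText);
  if (silent) return silent;

  if (timedOut) {
    const progress = nonEmpty(snapshot.timeoutProgressText);
    if (progress) return progress;
  }

  // Tool-only history: select nothing so the caller can apply its own fallback.
  return undefined;
}
```

Returning undefined rather than an empty string keeps the "nothing selected" case distinguishable for the caller, which is what lets the announce path substitute its own fallback text.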
Summary
This keeps the correct core fix from #80049: selectSubagentOutputText() must not fall through to snapshot.latestRawText when a child subagent produced only tool/toolResult output. Tool output is not user-facing completion text, so the requester session should receive (no output) or a post-compaction assistant reply instead of raw command/search output.
The replacement also fixes the gaps in #80049:
readSubagentOutput()
Verification
pnpm test src/agents/subagent-announce-output.test.ts src/agents/subagent-announce.format.e2e.test.ts src/agents/subagent-announce-delivery.test.ts src/gateway/server-methods/agent.test.ts
pnpm exec oxfmt --check --threads=1 CHANGELOG.md docs/automation/tasks.md docs/tools/subagents.md src/agents/subagent-announce-output.ts src/agents/subagent-announce-output.test.ts src/agents/subagent-announce.format.e2e.test.ts
tbx_01kr83we61rvr3xjxxbtdme3x2, workflow run https://github.com/openclaw/openclaw/actions/runs/25620248580
Real behavior proof (required for external PRs)
tool/toolResult content as completion text when no assistant-produced text existed. In Telegram this can surface code/search/exec dumps from the child transcript. This PR makes tool-only histories produce no selected completion output, so the existing announce fallback reports (no output) instead of leaking raw tool text.
maint/subagent-no-raw-output synced from this PR checkout.
/Users/steipete/Projects/crabbox/bin/crabbox run --provider blacksmith-testbox --blacksmith-org openclaw --blacksmith-workflow .github/workflows/ci-check-testbox.yml --blacksmith-job check --blacksmith-ref main --idle-timeout 90m --ttl 240m --timing-json --shell -- '<probe tool-only child history>; oxfmt --check; targeted pnpm test ...'
The probe imports src/agents/subagent-announce-output.ts, stubs Gateway chat.history with assistant-empty + toolResult: raw grep output, calls readSubagentOutput("agent:main:subagent:child"), and fails if the result contains raw tool output or returns any non-undefined text.
tbx_01kr83we61rvr3xjxxbtdme3x2, workflow run https://github.com/openclaw/openclaw/actions/runs/25620248580:
Pre-fix comparison from Crabbox/Testbox
tbx_01kr83fbqn923dkp3zja0stp6d:
readSubagentOutput(). The Crabbox probe result is null/undefined with leaked:false, and the announce-format tests verify requester completion handoff contains (no output) and does not contain the raw tool text.
leaked:true, shown above.
Root Cause
readSubagentOutput() rejects tool-only histories or that post-compaction assistant text still wins over raw tool text.
Regression Test Plan
src/agents/subagent-announce-output.test.ts and src/agents/subagent-announce.format.e2e.test.ts.
User-visible / Behavior Changes
Subagent completion announces no longer surface raw tool output when the child produced no assistant text. They now use the existing (no output) fallback for that case.
Diagram
N/A
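In place of a diagram, the behavior change and the probe check described in this PR can be illustrated with a short sketch. It assumes the announce path substitutes (no output) whenever nothing was selected. formatCompletionSketch, probeSketch, and ProbeResultSketch are hypothetical names for this example and do not correspond to the repository's actual announce or probe code.

```typescript
const NO_OUTPUT = "(no output)";

// Hypothetical formatter: fall back to "(no output)" when nothing was selected.
function formatCompletionSketch(selected: string | undefined): string {
  const text = selected?.trim();
  return text ? text : NO_OUTPUT;
}

interface ProbeResultSketch {
  leaked: boolean; // true when the raw tool text appears in the selected output
  ok: boolean; // true only when nothing at all was selected
}

// Probe-style check for a tool-only child history: the selected output must be
// undefined, and it must never contain the raw tool text.
function probeSketch(
  selected: string | undefined,
  rawToolText: string,
): ProbeResultSketch {
  const leaked = selected !== undefined && selected.includes(rawToolText);
  return { leaked, ok: selected === undefined };
}

// After the fix, a tool-only history selects nothing.
const afterFix = probeSketch(undefined, "raw grep output");
// Before the fix, the raw tool text could be selected as completion text.
const beforeFix = probeSketch("src/a.ts:1: raw grep output", "raw grep output");
```

With these inputs, afterFix has leaked set to false and beforeFix has leaked set to true, mirroring the leaked:false and leaked:true results reported for the two Testbox runs.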