Skip to content

fix(agents): stop raw tool output leaking into subagent completion announces#80049

Closed
blaspat wants to merge 1 commit into
openclaw:mainfrom
blaspat:fix/79986-subagent-announce-raw-output-leak
Closed

fix(agents): stop raw tool output leaking into subagent completion announces#80049
blaspat wants to merge 1 commit into
openclaw:mainfrom
blaspat:fix/79986-subagent-announce-raw-output-leak

Conversation

@blaspat

@blaspat blaspat commented May 10, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes #79986 — Subagent announce leaks raw tool output to Telegram and duplicates parent progress reply.

Root cause: selectSubagentOutputText() in src/agents/subagent-announce-output.ts was falling through to snapshot.latestRawText (from tool/toolResult messages) when no assistant-produced text was available. Raw tool output is not user-facing content and must not be delivered as completion text.

Fix: Return undefined instead of snapshot.latestRawText so the caller falls back to readLatestAssistantReply() (post-compaction result) or synthesizes a bounded failure summary.

Files changed

  • src/agents/subagent-announce-output.ts — Stop selectSubagentOutputText from returning raw tool output
  • src/agents/subagent-announce.format.e2e.test.ts — Updated tests that expected raw tool output to be announced to correctly expect (no output)

Acceptance criteria

  • pnpm test --run src/agents/subagent-announce.format.e2e.test.ts src/agents/subagent-announce-output.test.ts src/agents/subagent-announce-delivery.test.ts src/gateway/server-methods/agent.test.ts
  • pnpm test --run src/gateway/server-chat.agent-events.test.ts
  • pnpm exec oxfmt --check --threads=1 on: ...

Real Behavior Proof

Before fix (buggy behavior):

A subagent completes a task with tool-only output (no assistant text). The completion announce sends raw tool output to Telegram:

[Subagent] /exec completed
/root/openclaw/src/gateway/protocol/schema/protocol-schemas.ts:181:  PluginControlUiDescriptorSchema,
/root/openclaw/src/gateway/protocol/schema/protocol-schemas.ts:182:  PluginRuntimeConfigSchema,
... (raw tool result content visible to user)

After fix (correct behavior):

Same scenario now produces a bounded (no output) summary instead of raw tool text:

[Subagent] completed with result: (no output)

How to reproduce:

  1. Start a Telegram DM session with a parent agent
  2. Spawn a subagent (sessions_spawn runtime="subagent") that runs a task producing tool-only output (e.g., a file write with no assistant reply)
  3. Observe the completion message — before fix: raw tool content visible; after fix: (no output)

@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: XS triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 10, 2026
@clawsweeper

clawsweeper Bot commented May 10, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs real behavior proof before merge.

Summary
The PR changes subagent completion output selection to return no selected result for tool-only histories, updates e2e expectations to (no output), and revises the subagents docs for that contract.

Reproducibility: yes. for source-level reproduction: current main selects snapshot.latestRawText for tool-only histories and has e2e expectations that assert tool output appears in completion messages. I did not establish a live Telegram reproduction in this read-only review.

Real behavior proof
Needs stronger real behavior proof before merge: Insufficient: the PR body has before/after snippets, but no clearly sourced after-fix Telegram/runtime proof; add a screenshot, recording, terminal output, copied live output, linked artifact, or redacted logs, then update the PR body to trigger re-review or ask for @clawsweeper re-review.

Next step before merge
Needs contributor-supplied real behavior proof before merge; the remaining docs repair is narrow, but automation cannot supply the contributor's Telegram/runtime proof.

Security
Cleared: Cleared: the diff narrows a raw-output exposure path and only touches TypeScript announce-selection logic, tests, and docs, with no dependency, workflow, package, secret, or supply-chain surface changes.

Review findings

  • [P3] Update the tasks cleanup fallback docs — docs/automation/tasks.md:260
Review details

Best possible solution:

Land one canonical raw-output leak fix with aligned subagent and tasks docs plus real Telegram/runtime proof, then track the duplicate-progress symptom separately unless this PR proves it too.

Do we have a high-confidence way to reproduce the issue?

Yes for source-level reproduction: current main selects snapshot.latestRawText for tool-only histories and has e2e expectations that assert tool output appears in completion messages. I did not establish a live Telegram reproduction in this read-only review.

Is this the best way to solve the issue?

Yes for the central raw-output leak: returning undefined is the narrow fix because it lets readLatestAssistantReply or (no output) handle tool-only histories instead of replaying tool text. It is not enough by itself to prove the duplicate-progress portion of the linked bug is fixed.

Full review comments:

  • [P3] Update the tasks cleanup fallback docs — docs/automation/tasks.md:260
    This PR removes the tool/toolResult fallback from subagent completion output, but docs/automation/tasks.md still says subagent completion falls back to sanitized latest tool/toolResult text. That would leave a public docs page describing the raw-output behavior the patch is removing, so please align this line with the new (no output) contract.
    Confidence: 0.82

Overall correctness: patch is correct
Overall confidence: 0.86

What I checked:

  • Current-main raw fallback: On current main, selectSubagentOutputText still returns snapshot.latestRawText after assistant text and partial-progress checks fail, so a tool-only transcript can become completion text. (src/agents/subagent-announce-output.ts:304, 4f053b87048f)
  • Current-main regression expectations: The current e2e coverage expects tool/toolResult text to appear when assistant output is empty, matching the reported leak path. (src/agents/subagent-announce.format.e2e.test.ts:637, 4f053b87048f)
  • No-output runtime fallback: The announce flow already uses (no output) when no child completion findings or reply text are available, so returning undefined from the selector routes tool-only histories to the bounded fallback. (src/agents/subagent-announce.ts:467, 4f053b87048f)
  • PR diff behavior: The provided PR diff at head d43cb97054e1129459b25af012d50b2d927599e5 replaces the raw fallback with return undefined, changes the affected e2e expectations to (no output), and updates docs/tools/subagents.md. (src/agents/subagent-announce-output.ts:309, d43cb97054e1)
  • Remaining stale public docs: docs/automation/tasks.md still says subagent completion falls back to sanitized latest tool/toolResult text, which would be stale after this behavior change lands. (docs/automation/tasks.md:260, 4f053b87048f)
  • Related scope: The linked bug report also covers duplicate parent progress and internal event envelope spam; this PR directly fixes the raw child tool-output selection portion, while the duplicate-progress portion is not proven fixed here.

Likely related people:

  • Peter Steinberger: Git history shows multiple central changes to subagent announce output behavior, including reading latest assistant/tool output and splitting the output helper module. (role: recent area contributor; confidence: high; commits: 81db05962752, 0dd97feb41f8, 540b98b23f7b; files: src/agents/subagent-announce-output.ts, src/agents/subagent-announce.ts, docs/tools/subagents.md)
  • Roshan Singh: Git history shows the structured subagent announce output and run outcome work, which is the result contract this PR changes. (role: feature-history contributor; confidence: medium; commits: 1baa55c1456b; files: src/agents/subagent-announce-output.ts, src/agents/subagent-announce.ts)
  • Tyler Yust: Git history shows recent work on subagent announce delivery routing, adjacent to how selected child results are delivered to requester channels. (role: adjacent announce delivery contributor; confidence: medium; commits: 41cf93efff4d; files: src/agents/subagent-announce-delivery.ts)
  • VACInc: The previous ClawSweeper review comment reported blame provenance for the raw-fallback selector and matching e2e expectations at this commit; local metadata confirms the commit but the checkout history is partial. (role: behavior provenance signal; confidence: low; commits: 852757ad2f57; files: src/agents/subagent-announce-output.ts, src/agents/subagent-announce.format.e2e.test.ts)

Remaining risk / open question:

  • The PR body has copied before/after snippets, but not an inspectable after-fix Telegram/runtime run or redacted diagnostic output from the changed path.
  • The linked issue also reports duplicate parent progress/internal event spam, and this PR only proves the raw tool-output selection fix.
  • docs/automation/tasks.md would still describe the removed tool/toolResult fallback after merge.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 4f053b87048f.

…nounces

selectSubagentOutputText() was falling through to snapshot.latestRawText
when no assistant-produced text was available. Raw tool output is not
user-facing content and must not be delivered as completion text.

Fix: return undefined instead of snapshot.latestRawText so the caller
falls back to readLatestAssistantReply (post-compaction result) or
synthesizes a bounded failure summary.

Updated tests that expected raw tool output to be announced — they now
correctly expect '(no output)' since no assistant-composed text exists.

Signed-off-by: Blasius Patrick <blasius.patrick@gmail.com>
@blaspat blaspat force-pushed the fix/79986-subagent-announce-raw-output-leak branch from ff96117 to d43cb97 Compare May 10, 2026 03:52
@openclaw-barnacle openclaw-barnacle Bot added the docs Improvements or additions to documentation label May 10, 2026
@steipete

Copy link
Copy Markdown
Contributor

Thanks @blaspat. Closing this in favor of #80110, which preserves the core fix from this PR and credits your commit in the replacement maintainer commit.

Why this PR was not mergeable as-is:

  • It claimed Fixes #79986, but only fixed the raw child tool-output selection path. The duplicate parent progress/internal-event spam part of [Bug] Subagent announce leaks raw tool output to Telegram and duplicates parent progress reply #79986 was not proven fixed, so the PR overclaimed the scope.
  • It left docs/automation/tasks.md describing the old sanitized tool/toolResult fallback, so public docs would still document the behavior being removed.
  • The tests changed the existing e2e expectations, but did not add direct coverage for readSubagentOutput() proving tool-only histories return no selected completion output while post-compaction assistant replies still work.
  • The Real behavior proof check failed because the PR body did not include the required structured after-fix proof from a real OpenClaw setup.

Replacement PR #80110 keeps the important runtime change, adds the missing docs/test/changelog coverage, and includes Crabbox/Testbox proof:

  • Pre-fix probe: {"result":"raw grep output","leaked":true}
  • Replacement probe: {"result":null,"leaked":false}
  • Targeted tests: 3 Vitest shards, 207 tests passed

Thanks again for identifying the right core selector bug.

@steipete steipete closed this May 10, 2026
@blaspat blaspat deleted the fix/79986-subagent-announce-raw-output-leak branch May 10, 2026 17:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling docs Improvements or additions to documentation size: XS triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Subagent announce leaks raw tool output to Telegram and duplicates parent progress reply

2 participants