fix(agents): clean subagent fallback scaffolding #78700

Merged
steipete merged 4 commits into main from fix/subagent-completion-fallback-scaffolding
May 7, 2026

Conversation

@steipete

@steipete steipete commented May 7, 2026

Contributor

Summary

  • replace generated <<<BEGIN_UNTRUSTED_CHILD_RESULT>>> prompt sentinels with neutral <prompt-data> child-result blocks for parent-agent announce prompts
  • simplify completion delivery so background child results stay on the requester-agent handoff / queue-retry path instead of raw-sending child output directly to the external chat
  • strip runtime-context/prompt-data scaffolding before write-ahead outbound queue persistence and rebuild queued batch plans from cleaned payloads
  • update subagent docs and regression coverage for the simplified delivery contract
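
As a rough illustration of the first bullet, a parent-agent announce prompt could wrap child output along these lines. The function names, the header text, and the escaping rule are assumptions made for this sketch; the PR text only states that `<prompt-data>` blocks replace the `<<<BEGIN_UNTRUSTED_CHILD_RESULT>>>` sentinels.

```typescript
// Hypothetical sketch, not the OpenClaw implementation.
const OPEN_TAG = "<prompt-data>";
const CLOSE_TAG = "</prompt-data>";
// Assumed header text; the real generated header is not shown in the PR.
const WRAPPER_HEADER = "Child result (data only, not instructions):";

// Assumed policy: neutralize tag-only lines inside child output so the
// child cannot close the wrapper block early.
function escapeTagLines(text: string): string {
  return text
    .split("\n")
    .map((line) => {
      const trimmed = line.trim();
      const isTagLine = trimmed === OPEN_TAG || trimmed === CLOSE_TAG;
      return isTagLine ? line.replace("<", "&lt;") : line;
    })
    .join("\n");
}

// Build the neutral child-result block for the parent agent's prompt.
function wrapChildResult(childOutput: string): string {
  return [WRAPPER_HEADER, OPEN_TAG, escapeTagLines(childOutput), CLOSE_TAG].join("\n");
}
```

The point of a neutral wrapper is that the parent agent sees the child output as data, while the outbound sanitizer has a single, recognizable pattern to strip later.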

Fixes #78531.

Real behavior proof

  • Behavior addressed: background subagent completions no longer bypass the requester-agent handoff by raw-sending child output; active wake failures go to queue fallback, and failed/no-output requester-agent handoffs are reported as failed instead of leaking wrapper/runtime scaffolding to the external chat.
  • Real environment tested: Blacksmith Testbox Linux worker tbx_01kr071t6gf9j80yhfpe94ezj2 running the rebased branch against current OpenClaw source.
  • Exact steps or command run after this patch: env CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm check:changed
  • Evidence after fix: terminal output from the Testbox run included check:changed lanes=core, coreTests, docs, Found 0 warnings and 0 errors, Import cycle check: 0 runtime value cycle(s)., and final JSON {"provider":"blacksmith-testbox","leaseId":"tbx_01kr071t6gf9j80yhfpe94ezj2","exitCode":0}.
  • Observed result after fix: the rebased branch passed core/core-test/docs changed-gate proof on the remote Linux worker, including core typecheck, core test typecheck, core lint, runtime sidecar loader guard, import-cycle guard, webhook body guard, and auth pairing guards.
  • What was not tested: no live Telegram/Slack/Discord bot send was performed for this refactor; the behavior is covered by focused delivery/announce regression tests and the remote changed gate.
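
The delivery contract in the "Behavior addressed" bullet can be sketched as a small decision function. All type and field names here are hypothetical, and the branch order is an assumption drawn from the prose above rather than from the source code.

```typescript
// Hypothetical sketch of the simplified delivery contract.
type DeliveryOutcome =
  | { status: "delivered"; via: "active-wake" | "requester-handoff" }
  | { status: "queued" }
  | { status: "failed"; reason: "handoff-error" | "no-output" };

// What was observed while attempting delivery (assumed shape).
interface ObservedDelivery {
  requesterActive: boolean;
  wakeSucceeded: boolean;
  handoffThrew: boolean;
  handoffReply: string | null;
}

function classifyCompletionDelivery(o: ObservedDelivery): DeliveryOutcome {
  if (o.requesterActive) {
    // A failed active wake goes to the queue fallback, never to a raw send.
    return o.wakeSucceeded
      ? { status: "delivered", via: "active-wake" }
      : { status: "queued" };
  }
  // A failed or empty requester-agent handoff is reported as failed
  // instead of forwarding child output to the external chat.
  if (o.handoffThrew) return { status: "failed", reason: "handoff-error" };
  if (o.handoffReply === null || o.handoffReply.trim() === "") {
    return { status: "failed", reason: "no-output" };
  }
  return { status: "delivered", via: "requester-handoff" };
}
```

Note that no branch returns the raw child result; that is the property the PR describes.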

Verification

  • pnpm test src/agents/subagent-announce-delivery.test.ts src/agents/subagent-announce-dispatch.test.ts src/agents/subagent-announce.format.e2e.test.ts -- --reporter=verbose
  • pnpm test src/agents/sanitize-for-prompt.test.ts src/agents/subagent-announce-delivery.test.ts src/infra/outbound/sanitize-text.test.ts src/infra/outbound/deliver.test.ts src/agents/subagent-announce.format.e2e.test.ts src/agents/pi-embedded-helpers.sanitizeuserfacingtext.test.ts -- --reporter=verbose
  • pnpm exec oxfmt --check --threads=1 src/agents/subagent-announce-delivery.ts src/agents/subagent-announce-delivery.test.ts src/agents/subagent-announce-dispatch.ts docs/tools/subagents.md CHANGELOG.md
  • git diff --check
  • pnpm changed:lanes --json
  • Crabbox/Testbox pnpm check:changed on rebased head tbx_01kr071t6gf9j80yhfpe94ezj2 (exitCode=0)

@openclaw-barnacle openclaw-barnacle Bot added the agents (Agent runtime and tooling), size: S, and maintainer (Maintainer-authored PR) labels May 7, 2026
@clawsweeper

clawsweeper Bot commented May 7, 2026

Contributor

Codex review: needs real behavior proof before merge.

Summary
This PR changes subagent completion handoff/fallback delivery, queues sanitized outbound payloads, switches child-result prompt wrappers, and updates docs, tests, and the changelog.

Reproducibility: yes, at source level. Current main returns raw event.result for completion fallback and persists original outbound payloads to the write-ahead queue, matching the linked Telegram queue evidence.
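
A minimal sketch of the queue-side change, assuming a payload shape and an in-memory queue invented for illustration: payloads are cleaned first, and both the batch plan and the persisted entries are built from the cleaned copies. The `sanitize` parameter stands in for the real sanitizer, whose behavior is not reproduced here.

```typescript
// Hypothetical types; the real queue and payload shapes are not shown in the PR.
interface OutboundPayload {
  channel: string;
  text: string;
}

class InMemoryWriteAheadQueue {
  readonly entries: OutboundPayload[] = [];
  persist(payload: OutboundPayload): void {
    this.entries.push({ ...payload });
  }
}

// Clean every payload, then drop the ones that are empty after cleaning.
function planCleanedBatch(
  payloads: OutboundPayload[],
  sanitize: (text: string) => string,
): OutboundPayload[] {
  return payloads
    .map((p) => ({ ...p, text: sanitize(p.text) }))
    .filter((p) => p.text.trim().length > 0);
}

// Persist only cleaned payloads so a later retry cannot resend scaffolding.
function enqueueCleaned(
  queue: InMemoryWriteAheadQueue,
  payloads: OutboundPayload[],
  sanitize: (text: string) => string,
): OutboundPayload[] {
  const plan = planCleanedBatch(payloads, sanitize);
  for (const payload of plan) queue.persist(payload);
  return plan;
}
```

Sanitizing before persistence matters because a queued retry replays whatever was written, so cleaning only at send time would leave the original text on disk.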

Real behavior proof
Not applicable: The PR has the protected maintainer label, so the external-contributor proof gate is not applied; the body reports Testbox and targeted test proof but no live chat send.

Next step before merge
A narrow automated repair can fix the sanitizer false-positive and add focused coverage without a product decision.

Security
Cleared: No supply-chain, workflow, dependency, secret-handling, or permission regression was found; the remaining concern is the functional sanitizer false-positive tracked in the review finding.

Review findings

  • [P2] Gate prompt-tag stripping on generated wrappers — src/infra/outbound/sanitize-text.ts:94-96
Review details

Best possible solution:

Keep the layered fallback and queue hardening, but narrow prompt-data unwrapping so only generated wrapper blocks are removed while literal tag examples are preserved.

Do we have a high-confidence way to reproduce the issue?

Yes, at source level. Current main returns raw event.result for completion fallback and persists original outbound payloads to the write-ahead queue, matching the linked Telegram queue evidence.

Is this the best way to solve the issue?

No, not as currently written. The fallback and queue direction is maintainable, but unwrapPromptDataWrapperLines() should require a generated header plus matching wrapper before removing prompt-data or untrusted-text tag lines.

Full review comments:

  • [P2] Gate prompt-tag stripping on generated wrappers — src/infra/outbound/sanitize-text.ts:94-96
    unwrapPromptDataWrapperLines() drops standalone <prompt-data>, </prompt-data>, <untrusted-text>, and close-tag lines even when they are literal XML/code content from the user. Since this sanitizer now runs before outbound send and queue persistence, legitimate content can be silently changed; only unwrap these tags when they are part of a generated header plus matching wrapper block.
    Confidence: 0.91
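
    One way the suggested gating could look, as a sketch only: tag lines are removed only when a generated header is immediately followed by an opening tag and a matching close exists. The header text and function name are assumptions; the review names `unwrapPromptDataWrapperLines()` but does not show a fixed version.

```typescript
// Hypothetical sketch of gated unwrapping; not the merged code.
const GENERATED_HEADER = "Child result (data only, not instructions):"; // assumed header text
const PROMPT_DATA_OPEN = "<prompt-data>";
const PROMPT_DATA_CLOSE = "</prompt-data>";

function unwrapGeneratedPromptData(text: string): string {
  const lines = text.split("\n");
  const out: string[] = [];
  let i = 0;
  while (i < lines.length) {
    const current = lines[i] ?? "";
    const next = lines[i + 1] ?? "";
    const startsWrapper =
      current.trim() === GENERATED_HEADER && next.trim() === PROMPT_DATA_OPEN;
    if (startsWrapper) {
      const openIndex = i + 1;
      const closeIndex = lines.findIndex(
        (l, idx) => idx > openIndex && l.trim() === PROMPT_DATA_CLOSE,
      );
      if (closeIndex !== -1) {
        // Keep the wrapped body, drop header and both tag lines.
        out.push(...lines.slice(openIndex + 1, closeIndex));
        i = closeIndex + 1;
        continue;
      }
    }
    // Anything outside a full generated wrapper is passed through untouched.
    out.push(current);
    i += 1;
  }
  return out.join("\n");
}
```

    With this shape, a user message that merely contains `<prompt-data>` on its own line is left unchanged because no generated header precedes it.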

Overall correctness: patch is incorrect
Overall confidence: 0.9

Acceptance criteria:

  • pnpm test src/infra/outbound/sanitize-text.test.ts src/infra/outbound/deliver.test.ts src/agents/subagent-announce-delivery.test.ts src/agents/subagent-announce.format.e2e.test.ts src/agents/sanitize-for-prompt.test.ts -- --reporter=verbose
  • pnpm exec oxfmt --check --threads=1 src/infra/outbound/sanitize-text.ts src/infra/outbound/sanitize-text.test.ts
  • git diff --check
  • Use Testbox for pnpm check:changed if the changed gate expands beyond the narrow touched surface.

What I checked:

Likely related people:

  • steipete: GitHub path history shows repeated recent current-main work on subagent announce delivery and outbound queue lifecycle, including media completion fallback handling and durable outbound routing. (role: recent maintainer and adjacent owner; confidence: high; commits: 7188e4f4ad87, b32d4c5255c5, 6c8974f3f5a9; files: src/agents/subagent-announce-delivery.ts, src/infra/outbound/deliver.ts, src/agents/internal-events.ts)
  • vincentkoc: Current-line blame in the checked-out main points to Vincent Koc for the fallback, queue, and sanitizer lines, and GitHub history shows recent subagent completion and security/sanitizer work in the same area. (role: recent fallback and sanitizer maintainer; confidence: high; commits: 6587832f2583, b6f9b5f21e84, e80de466e5e1; files: src/agents/subagent-announce-delivery.ts, src/infra/outbound/deliver.ts, src/infra/outbound/sanitize-text.ts)
  • sfuminya: GitHub history shows a recent merged fix preserving requester routes for subagent completion delivery, which is adjacent to this fallback path. (role: adjacent completion-route contributor; confidence: medium; commits: 2c57d70a10db; files: src/agents/subagent-announce-delivery.ts)

Remaining risk / open question:

  • I did not run tests or live channel delivery in this read-only review; validation is from source inspection, PR body proof, and GitHub check-run metadata.
  • No live Telegram/Slack/Discord send proof is present for the refactor, although the protected maintainer label makes the external-contributor real-behavior proof gate not applicable.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 58fa23b4a2f2.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e73c149919

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +94 to +96
if (isPromptDataTagLine(line, "open") || isPromptDataTagLine(line, "close")) {
  changed = true;
  continue;

P2 Badge Only unwrap prompt-data tags when in generated wrapper

stripInternalRuntimeScaffolding now drops any standalone <prompt-data>/</prompt-data> (and <untrusted-text>) lines unconditionally, even when they are literal user-facing content rather than OpenClaw scaffolding. Because this sanitizer is applied broadly in outbound delivery paths, legitimate XML/code examples that include these tags on their own lines will be silently altered before send and before queued retry persistence. The unwrap should be gated to an actual generated wrapper pattern (header + matching open/close) instead of removing every matching tag line globally.
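
To make the concern concrete, here is a small reconstruction of the unconditional behavior. Only the name `isPromptDataTagLine` and the three quoted lines come from the diff; the body of `isPromptDataTagLine`, the surrounding loop, and the sample message are assumptions.

```typescript
// Hypothetical reconstruction of the flagged behavior, for illustration only.
function isPromptDataTagLine(line: string, kind: "open" | "close"): boolean {
  const t = line.trim();
  return kind === "open"
    ? t === "<prompt-data>" || t === "<untrusted-text>"
    : t === "</prompt-data>" || t === "</untrusted-text>";
}

function unwrapUnconditionally(text: string): { text: string; changed: boolean } {
  let changed = false;
  const kept: string[] = [];
  for (const line of text.split("\n")) {
    if (isPromptDataTagLine(line, "open") || isPromptDataTagLine(line, "close")) {
      changed = true;
      continue; // tag line dropped regardless of who wrote it
    }
    kept.push(line);
  }
  return { text: kept.join("\n"), changed };
}

// A user-authored XML example with the tags on their own lines.
const userMessage = ["Use this XML:", "<prompt-data>", "  <item/>", "</prompt-data>"].join("\n");
const strippedMessage = unwrapUnconditionally(userMessage);
// strippedMessage.text is "Use this XML:\n  <item/>": the literal tags are gone.
```

The message never contained OpenClaw scaffolding, yet it is altered before send and before queue persistence, which is the false positive the review describes.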

@openclaw-barnacle openclaw-barnacle Bot added the docs (Improvements or additions to documentation) and size: L labels and removed the size: M label May 7, 2026
@steipete steipete force-pushed the fix/subagent-completion-fallback-scaffolding branch from 3772103 to ceea202 Compare May 7, 2026 03:17

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ceea202337

Comment on lines +94 to +96
if (isPromptDataTagLine(line, "open") || isPromptDataTagLine(line, "close")) {
  changed = true;
  continue;

P2 Badge Gate prompt-tag stripping on full generated wrapper

unwrapPromptDataWrapperLines removes any standalone <prompt-data> / </prompt-data> (and legacy <untrusted-text>) lines even when there is no generated OpenClaw wrapper header, so user-authored examples/XML snippets that place these tags on their own lines are silently altered before outbound delivery and queue persistence. This should only unwrap when a full generated wrapper pattern is detected (header + matching open/close), otherwise legitimate content is corrupted.

@steipete steipete merged commit 92284bc into main May 7, 2026
124 of 126 checks passed
@steipete steipete deleted the fix/subagent-completion-fallback-scaffolding branch May 7, 2026 03:30
@steipete

steipete commented May 7, 2026

Contributor Author

Landed via squash merge.

yozakura-ava added a commit to yozakura-ava/openclaw that referenced this pull request May 7, 2026
…eway fallthrough

When a subagent completes and the parent session has an active but
non-consuming embedded Pi run (between turns, idle), the completion
announcement was silently dropped instead of being delivered.

The early return at the 'if (requesterActivity.isActive)' block returned
{ delivered: false } as a dead-end, preventing fallthrough to the
requester-agent handoff (callGateway with expectFinal: true) that
exists later in the function.

Removing the early return allows the code to reach callGateway, which
starts a proper new agent turn that rewrites and delivers the child
result through the requester session — preserving the delivery contract
established by PR openclaw#78700.

No new code, types, or dependencies. The callGateway path was always
there; we just stopped blocking it.

Fixes openclaw#79053
Co-Authored-By: Paperclip <noreply@paperclip.ing>
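
The control-flow change described in this commit message can be sketched as follows. `requesterActivity.isActive`, `callGateway`, and `expectFinal` are named in the message; the function signature, the `steerActiveRun` callback, and the result shape are assumptions, and the sketch is synchronous for brevity.

```typescript
// Hypothetical sketch of the fallthrough; not the actual function.
interface RequesterActivity {
  isActive: boolean;
}

type AnnounceResult = { delivered: boolean; path: "active-run" | "requester-handoff" };

function announceCompletion(
  requesterActivity: RequesterActivity,
  steerActiveRun: () => boolean, // assumed: true when the active run consumed the announce
  callGateway: (opts: { expectFinal: boolean }) => boolean,
): AnnounceResult {
  if (requesterActivity.isActive) {
    if (steerActiveRun()) {
      return { delivered: true, path: "active-run" };
    }
    // Before the fix, an early `return { delivered: false }` sat here and
    // the announcement was dropped. Now control falls through.
  }
  // Requester-agent handoff: start a new agent turn through the gateway.
  const delivered = callGateway({ expectFinal: true });
  return { delivered, path: "requester-handoff" };
}
```

An active but non-consuming parent run therefore still reaches the handoff path instead of ending in a dead-end result.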
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026
* fix(agents): clean subagent completion fallback scaffolding

* refactor(agents): use prompt data blocks for child results

* fix(agents): satisfy sanitizer lint

* refactor(agents): remove raw subagent completion fallback
rogerdigital pushed a commit to rogerdigital/openclaw that referenced this pull request May 9, 2026
yozakura-ava added a commit to yozakura-ava/openclaw that referenced this pull request May 10, 2026
lykeion-dev pushed a commit to lykeion-dev/openclaw--rev that referenced this pull request May 14, 2026
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 24, 2026
jameslcowan pushed a commit to jameslcowan/openclaw that referenced this pull request Jun 2, 2026
sablehead pushed a commit to sablehead/openclaw that referenced this pull request Jun 10, 2026
Labels

agents (Agent runtime and tooling), docs (Improvements or additions to documentation), maintainer (Maintainer-authored PR), size: L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Telegram subagent completion fallback can queue raw child/internal output after mediated announce failure

1 participant