fix(channels): suppress late raw tool output #84178

Closed
VACInc wants to merge 8 commits into openclaw:main from VACInc:fix-channel-late-tool-output

Conversation

@VACInc

VACInc commented May 19, 2026

Contributor

Summary

  • Removes the follow-up regex patch in src/agents/pi-embedded-runner/run/payloads.ts; that code path is shared by Codex, but the regex was the wrong fix because the live repro was a terminal fallback warning, not a missing acknowledgement phrase.
  • Suppresses terminal lastToolError warning payloads for regular verbose turns only when compact progress is actually visible to the channel.
  • Keeps verbose full behavior intact, so full verbose can still show terminal tool-error detail.
  • Keeps the existing Telegram/Discord late-progress guards from this PR: text-only tool output is dropped after final delivery, while media and exec approval payloads remain deliverable.
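
The late-progress guard in the last bullet can be sketched roughly as follows. This is only an illustration of the rule as described here, not the PR's code: the `ToolProgressPayload` shape, its field names, and `shouldDeliverToolProgress` are hypothetical.

```typescript
// Hypothetical payload shape for illustration; the real OpenClaw types may differ.
interface ToolProgressPayload {
  text?: string;
  mediaUrls?: string[];
  isExecApproval?: boolean;
}

// Decide whether a tool-progress payload may still be sent to the channel.
// Before the final reply is delivered, everything passes. Afterwards only
// media and exec approval payloads pass; text-only output is dropped.
function shouldDeliverToolProgress(
  payload: ToolProgressPayload,
  finalDelivered: boolean,
): boolean {
  if (!finalDelivered) {
    return true;
  }
  const hasMedia = (payload.mediaUrls?.length ?? 0) > 0;
  return hasMedia || payload.isExecApproval === true;
}
```

Under this sketch, a failed `grep` result that arrives after the final answer is text-only and is dropped, while a late screenshot or an exec approval prompt would still reach the chat.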

Root Cause

Before/RCA proof:

  • Live repro after the previous regex patch: the reported private Telegram topic ran through the Codex app-server path on May 19, 2026 at about 15:42 EDT. The trajectory showed a native bash/grep tool call, a failed tool.result (grep: /tmp/openclaw-intentional-failure-demo-round-five: No such file or directory), and then a successful assistant final: "Done — intentional `grep` failure, missing `/tmp` file."
  • Live send proof from the same run: gateway journal showed the final answer followed by another outbound message to the same topic. That extra post-final message was the bad failed-tool dump the user saw.
  • Source path proof: extensions/codex/src/app-server/event-projector.ts records failed native Codex tool items as lastToolError; src/auto-reply/reply/agent-runner-execution.ts passes params.opts?.suppressToolErrorWarnings into the embedded runner; src/agents/pi-embedded-runner/run.ts passes it into buildEmbeddedRunPayloads; and src/agents/pi-embedded-runner/run/payloads.ts appends a terminal warning when lastToolError remains after a user-facing assistant reply.
  • Why the prior fix missed: commit dbe415b907 only broadened acknowledgement text matching for exec-like failures such as exit 1. The later live repro acknowledged the failure in natural language but still produced the terminal fallback, proving the right boundary is not another text regex.
  • Correct boundary proof: src/auto-reply/reply/dispatch-from-config.ts already knows whether the channel will show regular compact verbose progress and whether the user selected full. That is the place to tell the runner that terminal tool-error fallback payloads are redundant for regular verbose, while preserving full.
  • Scope proof: this is not Telegram-specific and not Pi-specific. The affected path is the shared Codex/app-server embedded runner plus channel dispatch options; Telegram and Discord are the visible channel surfaces where late text-only progress can appear as extra chat messages.
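
The terminal fallback described in the source path proof can be pictured with a minimal sketch. It assumes simplified, hypothetical types and a hypothetical `buildPayloadsSketch` function; the real `buildEmbeddedRunPayloads` takes more inputs and formats its warning differently.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface ToolError {
  toolName: string;
  message: string;
}

interface BuildParams {
  assistantText: string;
  lastToolError?: ToolError;
  suppressToolErrorWarnings?: boolean;
}

interface ReplyPayload {
  text: string;
  isError?: boolean;
}

// The final assistant text always becomes a payload. A terminal warning is
// appended only when a tool error remains and the caller did not ask for
// tool-error warnings to be suppressed.
function buildPayloadsSketch(params: BuildParams): ReplyPayload[] {
  const payloads: ReplyPayload[] = [{ text: params.assistantText }];
  if (params.lastToolError && params.suppressToolErrorWarnings !== true) {
    const { toolName, message } = params.lastToolError;
    payloads.push({ text: `Warning: ${toolName} failed: ${message}`, isError: true });
  }
  return payloads;
}
```

In this picture the extra post-final message is the second payload, which is why the fix works by setting the suppression flag upstream rather than by matching the wording of the assistant's final text.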

Real behavior proof

Behavior addressed: Failed tool calls still appear as compact regular verbose progress before or while the reply is finalized, but the terminal failed-tool warning payload is not appended after the final chat response. Full verbose still keeps terminal error warnings available.

Real environment tested: Tmp worktree branch fix-channel-late-tool-output at head 42fccfb3eb550e80cc46951dfc41b36912163ce3; live gateway checkout updated with PR #84178 overlay on /home/vac/openclaw; private Telegram topic/session identifiers are intentionally not copied into the public PR body.

Exact steps or command run after this patch:

  • git revert --no-edit dbe415b907821e97b280018b259068ec6919cd0e
  • pnpm exec oxfmt --check --threads=1 src/auto-reply/reply/dispatch-from-config.ts src/auto-reply/reply/dispatch-from-config.test.ts
  • git diff --check
  • node scripts/run-vitest.mjs src/auto-reply/reply/dispatch-from-config.test.ts
  • node scripts/run-vitest.mjs src/auto-reply/reply/dispatch-from-config.test.ts src/auto-reply/reply/agent-runner-execution.test.ts extensions/codex/src/app-server/event-projector.test.ts
  • git push vacopenclaw fix-channel-late-tool-output
  • oc-update --skip-codex
  • systemctl --user status openclaw-gateway --no-pager -l | sed -n '1,80p'
  • rg -n "suppressToolErrorWarnings" dist | head -20
  • openclaw health

Evidence after fix:

  • Revert proof: head contains acc513bdcb Revert "fix(replies): avoid duplicate exec failure warnings", so the Pi-named regex patch was removed.
  • Formatting: All matched files use the correct format. Finished in 19ms on 2 files using 1 threads.
  • Whitespace: git diff --check produced no output.
  • Focused dispatch test: Test Files 1 passed (1); Tests 132 passed (132); Duration 5.50s.
  • Scoped regression test: Test Files 3 passed (3); Tests 295 passed (295); Duration 7.40s.
  • Production update: oc-update --skip-codex applied #84178 fix(channels): suppress late raw tool output, built core/UI, reinstalled the daemon, and health check completed.
  • Loaded service proof: openclaw-gateway.service is active since Tue 2026-05-19 15:55:51 EDT, running /home/vac/openclaw/dist/index.js gateway --port 18789.
  • Built artifact proof: dist/dispatch-8nUbnUg_.js contains const suppressToolErrorWarnings = params.replyOptions?.suppressToolErrorWarnings ?? (hasVisibleRegularVerboseToolProgress ? true : void 0); and passes suppressToolErrorWarnings into the resolver options.
  • Health proof after reload: Gateway event loop: ok max=648ms p99=22ms util=0.064 cpu=0.121; Telegram and Discord both configured.
  • Existing visual proof: Telegram DM after-fix proof
  • Latest private-topic session file timestamp observed after reload check: the latest topic transcript was last modified at 2026-05-19 15:53:21 EDT, before the corrected live reload at 15:55:51 EDT; no new post-reload topic turn had occurred at the time of this proof update.
  • No codex-review rerun was performed after the maintainer explicitly instructed not to run it again.

Observed result after fix: Regular verbose now tells the shared Codex embedded runner to suppress terminal tool-error warning fallback payloads whenever compact progress is visible, so failed tools stay in the compact progress lane instead of being dumped after the final answer. Verbose full is excluded by test and code condition.
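
A minimal sketch of that dispatch decision, built around the expression quoted in the built artifact proof. The `DispatchInputs` shape and the way `hasVisibleRegularVerboseToolProgress` is derived here are assumptions for illustration; only the final `??` expression mirrors the quoted code.

```typescript
type VerboseLevel = "off" | "on" | "full";

// Hypothetical inputs; the real dispatch code derives these from config and channel state.
interface DispatchInputs {
  verboseLevel: VerboseLevel;
  channelShowsCompactProgress: boolean;
  replyOptions?: { suppressToolErrorWarnings?: boolean };
}

// An explicit caller-supplied value wins. Otherwise suppression is switched on
// only for regular verbose with visible compact progress, and left undefined
// (runner default) for everything else, including verbose full.
function resolveSuppressToolErrorWarnings(inputs: DispatchInputs): boolean | undefined {
  const hasVisibleRegularVerboseToolProgress =
    inputs.verboseLevel === "on" && inputs.channelShowsCompactProgress;
  return (
    inputs.replyOptions?.suppressToolErrorWarnings ??
    (hasVisibleRegularVerboseToolProgress ? true : undefined)
  );
}
```

Returning `undefined` rather than `false` in the non-suppressed case leaves the runner's own default untouched, which matches the `void 0` in the built artifact.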

What was not tested: A fresh post-reload live Telegram or Discord repro turn was not run by the agent after oc-update; the gateway is loaded for maintainer retest. The security audit warnings from oc-update are pre-existing local configuration/skill warnings and unrelated to this PR.

Verification

  • pnpm exec oxfmt --check --threads=1 src/auto-reply/reply/dispatch-from-config.ts src/auto-reply/reply/dispatch-from-config.test.ts
  • git diff --check
  • node scripts/run-vitest.mjs src/auto-reply/reply/dispatch-from-config.test.ts
  • node scripts/run-vitest.mjs src/auto-reply/reply/dispatch-from-config.test.ts src/auto-reply/reply/agent-runner-execution.test.ts extensions/codex/src/app-server/event-projector.test.ts
  • oc-update --skip-codex
  • openclaw health

What was not tested

A fresh post-reload live Telegram or Discord repro turn was not run by the agent. The corrected build is running live for maintainer retest.

@openclaw-barnacle openclaw-barnacle Bot added channel: discord Channel integration: discord channel: telegram Channel integration: telegram agents Agent runtime and tooling triage: mock-only-proof Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI. size: M labels May 19, 2026
@clawsweeper

clawsweeper Bot commented May 19, 2026

Contributor

Codex review: needs real behavior proof before merge.

Workflow note: Future ClawSweeper reviews update this same comment in place.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Summary
This PR suppresses late text-only tool/progress output after final reply delivery across shared dispatch, embedded runner, Codex projection, Telegram, and Discord while reserving raw tool-error detail for verbose full and updating docs/tests.

Reproducibility: yes, at source level: current main can still forward failed text-only tool output and terminal verbose error detail after final delivery, and the PR adds regression tests for those paths. I did not run a fresh live Telegram or Discord current-main repro in this read-only review.

PR rating
Overall: 🦐 gold shrimp
Proof: 🦐 gold shrimp
Patch quality: 🐚 platinum hermit
Summary: Patch quality looks solid from source review, but current-head real behavior proof is still incomplete for a visible Telegram/Discord delivery change.

Rank-up moves:

  • Add a redacted current-head Telegram screenshot, recording, or live-output transcript showing the failed-tool repro no longer posts raw output after the final answer.
  • Add Discord live proof if maintainers want parity beyond the focused Discord regression tests.
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

PR egg
🎁 Pass real behavior proof to wake the egg and unlock a hatchable treat.

Where did the egg go?
  • The egg game starts only after the PR passes the real-behavior proof check.
  • Before that, no creature, rarity, or ASCII portrait is rolled. The treat waits for real proof.
  • This is still just collectible flavor: proof affects review readiness, not creature quality.

Real behavior proof
Needs stronger real behavior proof before merge: The PR has logs, tests, deployment output, and an older Telegram screenshot, but the body says no fresh post-reload live Telegram or Discord repro was run on the final corrected head; add redacted current-head proof and update the PR body for re-review.

Mantis proof suggestion
A native Telegram recording would directly prove the current-head post-final transcript behavior that remains untested after the last correction. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

telegram desktop proof: verify on current PR head that a failed exec in regular verbose shows compact progress/final acknowledgement and no raw or duplicate failure warning appears after the final Telegram answer.

Risk before merge
Why this matters:

  • The latest head changed after the earlier screenshot proof, and the PR body explicitly says no fresh post-reload Telegram or Discord repro turn was run after the corrected reload.
  • Users on regular /verbose on lose raw failed-tool detail unless they switch to /verbose full, which is an intentional but compatibility-visible diagnostics change.
  • The suppression is shared across dispatch plus Telegram and Discord timing, so maintainers should verify it does not hide legitimate late text-only progress in real transports.

Maintainer options:

  1. Require current-head Telegram proof (recommended)
    Ask for a redacted live Telegram run on 42fccfb3eb550e80cc46951dfc41b36912163ce3 showing compact failed-tool progress, the final acknowledgement, and no post-final raw warning.
  2. Accept the diagnostics compatibility change
    Maintainers can land with tests plus the documented /verbose full escape hatch if they intentionally accept that regular verbose no longer exposes raw failed-tool detail.
  3. Pause for compatibility design
    If regular verbose must keep raw failed-tool diagnostics for existing operators, pause this PR and require a compatibility-preserving option or narrower suppression rule.

Next step before merge
A maintainer explicitly stopped automerge and left this for human review; the remaining blockers are proof and compatibility acceptance rather than a narrow repair.

Security
Cleared: The diff does not touch dependencies, CI, credentials, auth, permissions, release scripts, or other supply-chain surfaces.

Review details

Best possible solution:

Land a dispatch-owned fix only after maintainers accept the regular-verbose compatibility change and see current-head live transport proof that failed tools remain compact without post-final raw dumps.

Do we have a high-confidence way to reproduce the issue?

Yes, at source level: current main can still forward failed text-only tool output and terminal verbose error detail after final delivery, and the PR adds regression tests for those paths. I did not run a fresh live Telegram or Discord current-main repro in this read-only review.

Is this the best way to solve the issue?

Mostly yes: dispatch is the right boundary because it knows whether compact progress is visible and whether verbose is full. The remaining decision is whether the regular-verbose compatibility change is acceptable, plus current-head live transport proof.

Label justifications:

  • P2: The PR addresses a real channel/agent delivery bug with limited blast radius and no evidence of an emergency runtime outage.
  • merge-risk: 🚨 compatibility: The diff intentionally changes /verbose on from raw failed-tool detail to compact summaries unless users select /verbose full.
  • merge-risk: 🚨 message-delivery: The diff suppresses late text-only tool/progress payloads after final delivery across shared dispatch, Telegram, and Discord paths.

What I checked:

  • Current main still sends text-only error tool results under preview suppression: On current main, onToolResult only drops default tool progress when there is no media, no exec approval, and deliveryPayload.isError !== true, so failed text output can still be sent after final delivery. (src/auto-reply/reply/dispatch-from-config.ts:1625, 3d96111a5afe)
  • Current main treats normal verbose as raw tool-error detail mode: isVerboseToolDetailEnabled returns true for both on and full, which matches the compatibility-sensitive behavior the PR changes. (src/agents/pi-embedded-runner/run/payloads.ts:103, 3d96111a5afe)
  • Current Codex projection does not mark failed output callbacks as errors: Current main emits formatted tool output through onToolResult without carrying an isError flag, while the PR diff adds that flag for non-success tool statuses. (extensions/codex/src/app-server/event-projector.ts:1152, 3d96111a5afe)
  • PR head moves suppression to dispatch and preserves full verbose fallback: The PR head adds suppressToolErrorWarnings only when regular verbose progress is visible, suppresses late text-only tool progress after final delivery starts, and leaves full verbose without that terminal-warning suppression. (src/auto-reply/reply/dispatch-from-config.ts:1569, 42fccfb3eb55)
  • Regression coverage is broad but still not transport proof: The PR adds focused unit coverage for late post-final suppression, message-tool-only progress suppression, full verbose behavior, Telegram, Discord, and Codex failed-output marking. (src/auto-reply/reply/dispatch-from-config.test.ts:1901, 42fccfb3eb55)
  • Telegram maintainer note raises the proof bar: The local Telegram maintainer note says Telegram transport, streaming, topics, callbacks, authorization, or reply-context PRs need real Telegram proof, preferably a live probe. (.agents/maintainer-notes/telegram.md:35, 3d96111a5afe)
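
The compatibility-sensitive change to verbose gating noted above can be illustrated with a small before/after sketch. Both function names below are hypothetical stand-ins, and the "PR" variant is an assumption drawn from this review's description (raw detail reserved for verbose full), not a copy of the diff.

```typescript
type VerboseLevel = "off" | "on" | "full";

// Behavior described for current main: both "on" and "full" enable raw tool detail.
function isVerboseToolDetailEnabledOnMain(level: VerboseLevel): boolean {
  return level === "on" || level === "full";
}

// Behavior described for the PR head (assumed shape): only "full" keeps raw detail,
// while "on" falls back to compact summaries.
function isVerboseToolDetailEnabledInPr(level: VerboseLevel): boolean {
  return level === "full";
}
```

The only input on which the two differ is `"on"`, which is the case the compatibility label is about.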

Likely related people:

  • Josh Avant: Blame and file history show "Guard final delivery session refresh" owning the current dispatch, payload, Codex projection, Discord, and Telegram code that this PR changes. (role: recent area contributor; confidence: high; commits: e99615973810; files: src/auto-reply/reply/dispatch-from-config.ts, src/agents/pi-embedded-runner/run/payloads.ts, extensions/codex/src/app-server/event-projector.ts)
  • Patrick Erichsen: Recent history and blame show "Add Telegram progress preview flows" contributing to the Telegram and Discord preview-progress surfaces involved in the late-output behavior. (role: recent progress-preview contributor; confidence: medium; commits: d60ab485114a; files: extensions/telegram/src/bot-message-dispatch.ts, extensions/discord/src/monitor/message-handler.draft-preview.ts)

Codex review notes: model gpt-5.5, reasoning high; reviewed against 3d96111a5afe.

@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. mantis: telegram-visible-proof Mantis should capture Telegram visible proof. P2 Normal backlog priority with limited blast radius. merge-risk: 🚨 message-delivery 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages. labels May 19, 2026
@VACInc VACInc marked this pull request as ready for review May 19, 2026 15:23

VACInc commented May 19, 2026

Contributor Author

Production/live proof update:

  • Head SHA: 7b76ed87e0eb4d72a13b60921b2cda30e684dda7
  • The PR is now ready for review.
  • We are running the fix live in production, and the late raw failed-tool output has not reoccurred since.
  • Telegram DM after-fix proof screenshot:

Telegram DM after-fix proof

Focused validation rerun after the review fixes:

pnpm exec oxfmt --check --threads=1 <13 touched files>
All matched files use the correct format. Finished in 41ms on 13 files using 1 threads.

git diff --check HEAD~2..HEAD
(no output)

node scripts/run-vitest.mjs src/auto-reply/reply/dispatch-from-config.test.ts src/auto-reply/reply/agent-runner-execution.test.ts src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.suppresses-message-end-block-replies-message-tool.test.ts extensions/discord/src/monitor/message-handler.process.test.ts extensions/telegram/src/bot-message-dispatch.test.ts
Test Files 6 passed (6); Tests 400 passed (400); Duration 10.95s

Codex review was rerun with codex -s danger-full-access -a never --disable plugins review --commit HEAD; its three actionable P2 findings were accepted and fixed in the current head.

@openclaw-barnacle openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: mock-only-proof Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI. labels May 19, 2026
@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels May 19, 2026
@openclaw-barnacle openclaw-barnacle Bot added docs Improvements or additions to documentation and removed proof: sufficient ClawSweeper judged the real behavior proof convincing. labels May 19, 2026
@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. labels May 19, 2026
@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. proof: 📸 screenshot Contributor real behavior proof includes screenshot evidence. labels May 19, 2026
@Takhoffman

Contributor

@clawsweeper automerge

@clawsweeper clawsweeper Bot added the clawsweeper:automerge Maintainer opted this PR into bounded ClawSweeper-reviewed automerge label May 19, 2026
@clawsweeper

clawsweeper Bot commented May 19, 2026

Contributor

🦞🔧
ClawSweeper automerge is enabled.

Draft PRs stay fix-only until GitHub marks them ready for review. Pause with /clawsweeper stop.

Automerge progress:

  • 2026-05-19 18:55:24 UTC review queued 0f97dc43635c (queued)

@Takhoffman

Contributor

@clawsweeper stop

@clawsweeper clawsweeper Bot added clawsweeper:human-review Needs maintainer review before ClawSweeper can continue and removed clawsweeper:automerge Maintainer opted this PR into bounded ClawSweeper-reviewed automerge labels May 19, 2026
@clawsweeper

clawsweeper Bot commented May 19, 2026

Contributor

🦞✅
Got it. ClawSweeper will leave this item for human review.

I added clawsweeper:human-review, removed clawsweeper:automerge, and paused the automation trail until a maintainer asks again.

@VACInc

VACInc commented May 19, 2026

Contributor Author

Follow-up proof for head dbe415b907821e97b280018b259068ec6919cd0e:

Root Cause

Before/RCA proof:

  • The prior patch was loaded in the live gateway: the running gateway process was using built dist/index.js, and both source/dist contained the previous suppression markers for isError tool results and regular-vs-full verbose gating.
  • The repro was therefore not stale code. The reported Telegram topic was on regular verbose (verboseDefault: "on", no per-session verboseLevel: "full").
  • The latest live repro produced a failed bash/cat tool result, then the final assistant text acknowledged it as exit 1, but the session still emitted another post-final failed-tool warning.
  • The missed path was src/agents/pi-embedded-runner/run/payloads.ts: final payload assembly still appends a synthetic mutating-tool warning unless it detects that the assistant already acknowledged the failed action. That detector handled phrases like “couldn't run the command,” but did not handle concise terminal-status acknowledgements like exit 1.

Real behavior proof

Behavior addressed: In regular verbose mode, when the final assistant answer already acknowledges an exec/bash failed exit status, OpenClaw should not append a second post-final failed-tool warning. Full verbose remains the raw/detail mode.

Real environment tested: Local tmp PR worktree on branch fix-channel-late-tool-output, head dbe415b907821e97b280018b259068ec6919cd0e, plus inspection of the live gateway/session that reproduced after the prior patch was loaded.

Exact steps or command run after this patch:

  • pnpm exec oxfmt --check --threads=1 src/agents/pi-embedded-runner/run/payloads.ts src/agents/pi-embedded-runner/run/payloads.errors.test.ts
  • git diff --check
  • node scripts/run-vitest.mjs src/agents/pi-embedded-runner/run/payloads.errors.test.ts src/auto-reply/reply/dispatch-from-config.test.ts extensions/codex/src/app-server/event-projector.test.ts

Evidence after fix:

  • oxfmt: All matched files use the correct format. Finished in 7ms on 2 files using 1 threads.
  • git diff --check: no output.
  • Vitest: Test Files 3 passed (3); Tests 235 passed (235); Duration 8.15s.
  • New regression coverage proves the live-style final text "Done — intentional missing-file check, exit `1`." suppresses the duplicate `bash` warning in regular verbose mode.
  • New negative coverage proves unrelated failure text such as The tests failed, so I stopped there. still surfaces write and bash warnings instead of hiding real tool failures.

Observed result after fix: The payload builder now returns only the final assistant payload for the live-style exit 1 acknowledgement, while preserving compact warnings for unrelated/vague failure text.
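
For readers following the thread, the acknowledgement-matching idea in this comment can be sketched as below. The regex and function are hypothetical illustrations of a terminal-status detector, not the actual `EXEC_FAILURE_TERMINAL_STATUS_PATTERN`. This approach was later reverted in the same PR (dbe415b907 was removed via acc513bdcb).

```typescript
// Hypothetical pattern: matches "exit 1", "exited with" is NOT covered, but
// "exit code 2" and "exit status `127`" are. Exit status 0 is excluded.
const EXIT_STATUS_ACK_SKETCH = /\bexit(?:ed)?(?:\s+(?:code|status))?\s+`?[1-9]\d*`?/i;

// Tools treated as exec-like for this sketch (an assumption).
const EXEC_LIKE_TOOLS = new Set(["bash", "exec"]);

// True when the final assistant text already names a non-zero exit status
// for an exec-like tool, so a second failure warning would be redundant.
function acknowledgesExecFailureSketch(finalText: string, toolName?: string): boolean {
  if (!toolName || !EXEC_LIKE_TOOLS.has(toolName)) {
    return false;
  }
  return EXIT_STATUS_ACK_SKETCH.test(finalText);
}
```

The later live repro acknowledged the failure without naming an exit status at all, which is the kind of final text a detector like this cannot catch.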

What was not tested: A fresh live Telegram topic send after dbe415b907821e97b280018b259068ec6919cd0e has not been run yet; I am running oc-update --skip-codex next so the live gateway can be tested against this exact head.

@VACInc

VACInc commented May 19, 2026

Contributor Author

Deployment/update proof for head dbe415b907821e97b280018b259068ec6919cd0e:

  • PR is open/ready for review, not draft.
  • oc-update --skip-codex applied PR fix(channels): suppress late raw tool output #84178 on top of current origin/main, installed dependencies, rebuilt core/UI, reinstalled the daemon, and restarted the gateway.
  • Live gateway is now active since 2026-05-19 15:38:27 EDT with /home/vac/openclaw/dist/index.js gateway --port 18789.
  • Loaded-code proof: both source and built dist contain EXEC_FAILURE_TERMINAL_STATUS_PATTERN and the hasExplicitMutatingToolFailureAcknowledgement(cleanedText, params.lastToolError?.toolName) call.
  • We are running the fix live in production. Since this restart/update, I have not observed the post-final failed-tool dump recur.
  • Known unrelated update output: openclaw security audit still reports pre-existing local security/trust-model warnings, including reddit-modmail; those were not introduced by this PR.

@clawsweeper clawsweeper Bot added rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. and removed proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels May 19, 2026
@VACInc

VACInc commented May 19, 2026

Contributor Author

Correction pushed and live.

What changed in this update:

  • Removed dbe415b907 via acc513bdcb because the Pi-named payload regex was the wrong fix shape for this repro.
  • Added 42fccfb3eb: dispatch-from-config now passes suppressToolErrorWarnings only when regular verbose compact progress is visible, while preserving verbose full.

RCA proof:

  • The post-regex live repro was a Codex app-server run with a failed native bash/grep tool result, followed by a successful final answer, then an extra outbound message. That points to the shared terminal lastToolError warning fallback, not a Telegram adapter-only dump and not a missing exit 1 acknowledgement string.
  • Source proof: Codex projector records lastToolError; agent-runner-execution passes suppressToolErrorWarnings; run.ts passes it to buildEmbeddedRunPayloads; payloads append the terminal warning after final if not suppressed.

Fix proof:

  • pnpm exec oxfmt --check --threads=1 src/auto-reply/reply/dispatch-from-config.ts src/auto-reply/reply/dispatch-from-config.test.ts passed.
  • git diff --check passed with no output.
  • node scripts/run-vitest.mjs src/auto-reply/reply/dispatch-from-config.test.ts passed: 1 file, 132 tests.
  • node scripts/run-vitest.mjs src/auto-reply/reply/dispatch-from-config.test.ts src/auto-reply/reply/agent-runner-execution.test.ts extensions/codex/src/app-server/event-projector.test.ts passed: 3 files, 295 tests.
  • No codex-review rerun was performed per maintainer instruction.

Live production proof:

  • oc-update --skip-codex applied PR fix(channels): suppress late raw tool output #84178, built core/UI, reinstalled/restarted the gateway, and health check completed.
  • openclaw-gateway.service is active since Tue 2026-05-19 15:55:51 EDT, running /home/vac/openclaw/dist/index.js gateway --port 18789.
  • Built artifact contains the corrected dispatch logic: dist/dispatch-8nUbnUg_.js has the regular-verbose suppressToolErrorWarnings handoff.
  • openclaw health after reload: gateway event loop ok; Telegram and Discord configured.
  • No post-reload recurrence observed at comment time; the latest reported-topic transcript timestamp I found was 2026-05-19 15:53:21 EDT, before this corrected reload.

Security audit warnings from oc-update are pre-existing local config/skill warnings, unrelated to this PR.

@clawsweeper clawsweeper Bot removed the proof: 📸 screenshot Contributor real behavior proof includes screenshot evidence. label May 19, 2026

VACInc commented May 19, 2026

Contributor Author

Closing this overloaded proof thread as superseded. I am opening a replacement PR from the same head/branch with the same code, full updated RCA, and the converted before/after PNG proof; I will link it back here once GitHub creates it.

VACInc commented May 19, 2026

Contributor Author

Replacement PR: #84303

It uses the same branch/head with no code changes, and carries the cleaned full RCA plus the before/after PNG proof.

@VACInc

VACInc commented May 28, 2026

Contributor Author

@clawsweeper hatch

@clawsweeper

clawsweeper Bot commented May 28, 2026

Contributor

🦞👀
ClawSweeper could not hatch this PR egg yet.

Reason: hatch requires an open pull request.

Labels

agents Agent runtime and tooling channel: discord Channel integration: discord channel: telegram Channel integration: telegram clawsweeper:human-review Needs maintainer review before ClawSweeper can continue docs Improvements or additions to documentation extensions: codex mantis: telegram-visible-proof Mantis should capture Telegram visible proof. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. merge-risk: 🚨 message-delivery 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages. P2 Normal backlog priority with limited blast radius. proof: supplied External PR includes structured after-fix real behavior proof. rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. size: L status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask.

Projects

None yet

2 participants