fix(channels): suppress late raw tool output #84178
Conversation
|
Codex review: needs real behavior proof before merge.
Workflow note: Future ClawSweeper reviews update this same comment in place.
Summary
Reproducibility: yes, at source level. Current main can still forward failed text-only tool output and terminal verbose error detail after final delivery, and the PR adds regression tests for those paths. I did not run a fresh live Telegram or Discord current-main repro in this read-only review.
PR rating
Rank-up moves:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. PR egg Where did the egg go?
Real behavior proof Mantis proof suggestion Risk before merge
Maintainer options:
Next step before merge Security Review details
Best possible solution: Land a dispatch-owned fix only after maintainers accept the regular-verbose compatibility change and see current-head live transport proof that failed tools remain compact without post-final raw dumps.
Do we have a high-confidence way to reproduce the issue? Yes, at source level: current main can still forward failed text-only tool output and terminal verbose error detail after final delivery, and the PR adds regression tests for those paths. I did not run a fresh live Telegram or Discord current-main repro in this read-only review.
Is this the best way to solve the issue? Mostly yes: dispatch is the right boundary because it knows whether compact progress is visible and whether verbose is full. The remaining decision is whether the regular-verbose compatibility change is acceptable, plus current-head live transport proof.
Label justifications:
What I checked:
Likely related people:
Codex review notes: model gpt-5.5, reasoning high; reviewed against 3d96111a5afe. |
|
Production/live proof update:
Focused validation rerun after the review fixes: Codex review was rerun with |
|
@clawsweeper automerge |
|
🦞🔧
Draft PRs stay fix-only until GitHub marks them ready for review. Pause with Automerge progress:
|
|
@clawsweeper stop |
|
🦞✅ I added |
|
Follow-up proof for head
Root Cause
Before/RCA proof:
Real behavior proof
Behavior addressed: In regular verbose mode, when the final assistant answer already acknowledges an exec/bash failed exit status, OpenClaw should not append a second post-final failed-tool warning. Full verbose remains the raw/detail mode.
Real environment tested: Local tmp PR worktree on branch
Exact steps or command run after this patch:
Evidence after fix:
Observed result after fix: The payload builder now returns only the final assistant payload for the live-style
What was not tested: A fresh live Telegram topic send after |
|
Deployment/update proof for head
|
|
Correction pushed and live. What changed in this update:
RCA proof:
Fix proof:
Live production proof:
Security audit warnings from |
|
Closing this overloaded proof thread as superseded. I am opening a replacement PR from the same head/branch with the same code, full updated RCA, and the converted before/after PNG proof; I will link it back here once GitHub creates it. |
|
Replacement PR: #84303. It uses the same branch/head with no code changes, and carries the cleaned full RCA plus the before/after PNG proof. |
|
@clawsweeper hatch |
|
🦞👀 Reason: hatch requires an open pull request. |

Summary
- Reverts the Pi-named regex patch in `src/agents/pi-embedded-runner/run/payloads.ts`; that code path is shared by Codex, but the regex was the wrong fix because the live repro was a terminal fallback warning, not a missing acknowledgement phrase.
- Suppresses terminal `lastToolError` warning payloads for regular verbose turns only when compact progress is actually visible to the channel.
- Keeps verbose `full` behavior intact, so full verbose can still show terminal tool-error detail.

Root Cause
Before/RCA proof:
- The live repro showed a `bash`/`grep` tool call, a failed `tool.result` (`grep: /tmp/openclaw-intentional-failure-demo-round-five: No such file or directory`), and then a successful assistant final: "Done — intentional `grep` failure, missing `/tmp` file."
- `extensions/codex/src/app-server/event-projector.ts` records failed native Codex tool items as `lastToolError`; `src/auto-reply/reply/agent-runner-execution.ts` passes `params.opts?.suppressToolErrorWarnings` into the embedded runner; `src/agents/pi-embedded-runner/run.ts` passes it into `buildEmbeddedRunPayloads`; and `src/agents/pi-embedded-runner/run/payloads.ts` appends a terminal warning when `lastToolError` remains after a user-facing assistant reply.
- `dbe415b907` only broadened acknowledgement text matching for exec-like failures such as `exit 1`. The later live repro acknowledged the failure in natural language but still produced the terminal fallback, proving the right boundary is not another text regex.
- `src/auto-reply/reply/dispatch-from-config.ts` already knows whether the channel will show regular compact verbose progress and whether the user selected `full`. That is the place to tell the runner that terminal tool-error fallback payloads are redundant for regular verbose, while preserving `full`.

Real behavior proof
Behavior addressed: Failed tool calls still appear as compact regular verbose progress before/finalizing the reply, but the terminal failed-tool warning payload is not appended after the final chat response. Full verbose still keeps terminal error warnings available.
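To make the intended payload behavior concrete, here is a minimal sketch of how a payload builder could honor the flag. It is not the actual `buildEmbeddedRunPayloads` implementation: the `ReplyPayload`, `ToolError`, and `BuildParams` shapes, the function name `buildPayloadsSketch`, and the warning text are all hypothetical. Only `lastToolError` and `suppressToolErrorWarnings` are names taken from this PR.

```typescript
// Hypothetical shapes for illustration; the real types live in the OpenClaw source.
interface ReplyPayload {
  text: string;
  isError?: boolean;
}

interface ToolError {
  toolName: string;
  message: string;
}

interface BuildParams {
  assistantText?: string;
  lastToolError?: ToolError;
  suppressToolErrorWarnings?: boolean;
}

// Sketch: emit the final assistant reply, then append a terminal tool-error
// warning only when a tool error remains and the caller has not suppressed it.
function buildPayloadsSketch(params: BuildParams): ReplyPayload[] {
  const { assistantText, lastToolError, suppressToolErrorWarnings } = params;
  const payloads: ReplyPayload[] = [];

  if (assistantText) {
    payloads.push({ text: assistantText });
  }

  // Assumed rule: only an explicit `true` suppresses the fallback warning.
  if (lastToolError && suppressToolErrorWarnings !== true) {
    payloads.push({
      text: `Tool ${lastToolError.toolName} failed: ${lastToolError.message}`,
      isError: true,
    });
  }

  return payloads;
}
```

With `suppressToolErrorWarnings: true` the sketch returns only the final assistant payload; leaving the flag unset keeps the trailing warning, which corresponds to the full verbose case described above.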
Real environment tested: Tmp worktree branch `fix-channel-late-tool-output` at head `42fccfb3eb550e80cc46951dfc41b36912163ce3`; live gateway checkout updated with PR #84178 overlay on `/home/vac/openclaw`; private Telegram topic/session identifiers are intentionally not copied into the public PR body.

Exact steps or command run after this patch:
- `git revert --no-edit dbe415b907821e97b280018b259068ec6919cd0e`
- `pnpm exec oxfmt --check --threads=1 src/auto-reply/reply/dispatch-from-config.ts src/auto-reply/reply/dispatch-from-config.test.ts`
- `git diff --check`
- `node scripts/run-vitest.mjs src/auto-reply/reply/dispatch-from-config.test.ts`
- `node scripts/run-vitest.mjs src/auto-reply/reply/dispatch-from-config.test.ts src/auto-reply/reply/agent-runner-execution.test.ts extensions/codex/src/app-server/event-projector.test.ts`
- `git push vacopenclaw fix-channel-late-tool-output`
- `oc-update --skip-codex`
- `systemctl --user status openclaw-gateway --no-pager -l | sed -n '1,80p'`
- `rg -n "suppressToolErrorWarnings" dist | head -20`
- `openclaw health`

Evidence after fix:
- `acc513bdcb Revert "fix(replies): avoid duplicate exec failure warnings"`, so the Pi-named regex patch was removed.
- `All matched files use the correct format. Finished in 19ms on 2 files using 1 threads.`
- `git diff --check` produced no output.
- `Test Files 1 passed (1); Tests 132 passed (132); Duration 5.50s.`
- `Test Files 3 passed (3); Tests 295 passed (295); Duration 7.40s.`
- `oc-update --skip-codex` applied `#84178 fix(channels): suppress late raw tool output`, built core/UI, reinstalled the daemon, and health check completed.
- `openclaw-gateway.service` is active since `Tue 2026-05-19 15:55:51 EDT`, running `/home/vac/openclaw/dist/index.js gateway --port 18789`.
- `dist/dispatch-8nUbnUg_.js` contains `const suppressToolErrorWarnings = params.replyOptions?.suppressToolErrorWarnings ?? (hasVisibleRegularVerboseToolProgress ? true : void 0);` and passes `suppressToolErrorWarnings` into the resolver options.
- `Gateway event loop: ok max=648ms p99=22ms util=0.064 cpu=0.121`; Telegram and Discord both configured.
- The most recent topic turn was at `2026-05-19 15:53:21 EDT`, before the corrected live reload at `15:55:51 EDT`; no new post-reload topic turn had occurred at the time of this proof update.

Observed result after fix: Regular verbose now tells the shared Codex embedded runner to suppress terminal tool-error warning fallback payloads whenever compact progress is visible, so failed tools stay in the compact progress lane instead of being dumped after the final answer. Verbose `full` is excluded by test and code condition.

What was not tested: A fresh post-reload live Telegram or Discord repro turn was not run by the agent after `oc-update`; the gateway is loaded for maintainer retest. The security audit warnings from `oc-update` are pre-existing local configuration/skill warnings and unrelated to this PR.

Verification
- `pnpm exec oxfmt --check --threads=1 src/auto-reply/reply/dispatch-from-config.ts src/auto-reply/reply/dispatch-from-config.test.ts`
- `git diff --check`
- `node scripts/run-vitest.mjs src/auto-reply/reply/dispatch-from-config.test.ts`
- `node scripts/run-vitest.mjs src/auto-reply/reply/dispatch-from-config.test.ts src/auto-reply/reply/agent-runner-execution.test.ts extensions/codex/src/app-server/event-projector.test.ts`
- `oc-update --skip-codex`
- `openclaw health`

What was not tested
A fresh post-reload live Telegram or Discord repro turn was not run by the agent. The corrected build is running live for maintainer retest.
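
For reviewers retesting, the dispatch-side default quoted in the evidence (`params.replyOptions?.suppressToolErrorWarnings ?? (hasVisibleRegularVerboseToolProgress ? true : void 0)`) can be read as the following sketch. The `VerboseLevel` values, the `DispatchInputs` shape, the `channelShowsCompactProgress` field, and the function name are assumptions made for illustration. How `hasVisibleRegularVerboseToolProgress` is really derived in `dispatch-from-config.ts` is not shown in this PR body.

```typescript
// Hypothetical verbose levels; the PR only names "regular" verbose and `full`.
type VerboseLevel = "off" | "regular" | "full";

// Hypothetical inputs; only `replyOptions.suppressToolErrorWarnings` mirrors the quoted dist code.
interface DispatchInputs {
  verboseLevel: VerboseLevel;
  channelShowsCompactProgress: boolean;
  replyOptions?: { suppressToolErrorWarnings?: boolean };
}

// Sketch: an explicit caller value always wins; otherwise default to `true`
// only when regular (not full) verbose progress is visible to the channel.
function resolveSuppressToolErrorWarnings(
  inputs: DispatchInputs,
): boolean | undefined {
  // Assumed derivation of the flag named in the compiled output.
  const hasVisibleRegularVerboseToolProgress =
    inputs.verboseLevel === "regular" && inputs.channelShowsCompactProgress;

  return (
    inputs.replyOptions?.suppressToolErrorWarnings ??
    (hasVisibleRegularVerboseToolProgress ? true : undefined)
  );
}
```

Because `??` only falls through on `null` or `undefined`, an explicit `false` from the caller is preserved, and full verbose resolves to `undefined`, leaving the runner's existing warning behavior in place.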