fix: suppress raw provider errors in channel delivery by jason-allen-oneal · Pull Request #88610 · openclaw/openclaw

jason-allen-oneal · 2026-05-31T12:16:20Z

Summary

This change blocks raw unclassified assistant errorMessage text at the user-facing delivery boundary.

It adds a shared formatUserFacingAssistantErrorText(...) wrapper around the existing assistant error formatter. Classified friendly errors still use the existing sanitizer/classification behavior, but formatter output that is only raw upstream pass-through now falls back to:

LLM request failed.

Terminal lifecycle events and reply payload construction now use the safe wrapper for actual assistant error turns. Raw diagnostic details are not globally deleted, provider-side error objects are not changed, and internal observation fields such as rawErrorPreview remain available for maintainers.

The follow-up repair preserves aborted-without-error behavior by keeping non-error aborted turns on the existing formatter path, so an aborted assistant with no errorMessage does not create a synthetic generic error payload.
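
A minimal sketch of the wrapper idea described above, not the actual OpenClaw code. The type shapes, the stand-in formatter `formatClassifiedOrRaw`, its single rate-limit rule, and the function name `toUserFacingErrorText` are all assumptions made for illustration; only the fallback string and the `stopReason`/`errorMessage` fields come from this PR.

```typescript
// Hypothetical shapes; the real OpenClaw types are more involved.
type StopReason = "stop" | "error" | "aborted";

interface AssistantTurn {
  stopReason: StopReason;
  errorMessage?: string;
}

const GENERIC_ASSISTANT_ERROR_TEXT = "LLM request failed.";

// Stand-in for the existing formatter (assumption): classified errors get
// friendly copy, anything unclassified passes through unchanged.
function formatClassifiedOrRaw(turn: AssistantTurn): string | undefined {
  const raw = turn.errorMessage?.trim();
  if (!raw) return undefined;
  if (/rate limit/i.test(raw)) return "Rate limit reached. Please try again shortly.";
  return raw; // raw pass-through: the case the boundary wrapper guards against
}

// Boundary wrapper sketch: keep classified output, replace exact raw pass-through.
function toUserFacingErrorText(turn: AssistantTurn): string | undefined {
  const formatted = formatClassifiedOrRaw(turn);
  // Suppression is scoped to actual error turns; an aborted turn with no
  // errorMessage stays on the existing formatter path and yields no text.
  if (turn.stopReason !== "error") return formatted;
  const raw = turn.errorMessage?.trim();
  if (!formatted || formatted === raw) return GENERIC_ASSISTANT_ERROR_TEXT;
  return formatted;
}
```

In this sketch an error turn carrying only `SECRET_CANARY_69737` maps to the generic fallback, while a message the stand-in formatter classifies keeps its friendly copy.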

Evidence

Synthetic canary: SECRET_CANARY_69737
Current HEAD tested: 5bb04e6997717541b9c0ede4f0abf7784f49f06f
Raw provider-error suppression remains scoped to stopReason="error".
Aborted-without-error payload behavior is preserved.

Real behavior proof

Behavior addressed: Terminal assistant errors with unclassified raw upstream errorMessage must not send that raw text through user-facing lifecycle or reply payload delivery text. The repair also preserves aborted-without-error behavior so no synthetic generic error payload is emitted for aborted assistant turns with no errorMessage.
Real environment tested: Local OpenClaw checkout at HEAD 5bb04e6997717541b9c0ede4f0abf7784f49f06f, running the actual handleAgentEnd lifecycle function and actual buildEmbeddedRunPayloads reply payload builder through node --import tsx, plus a live Telegram DM through my existing OpenClaw Telegram bot.
Exact steps or command run after this patch: Direct Node/tsx proof imported src/agents/embedded-agent-subscribe.handlers.lifecycle.ts and src/agents/embedded-agent-runner/run/payloads.ts, passed a terminal assistant payload with stopReason: "error", content: [], and errorMessage: "SECRET_CANARY_69737", then captured user-facing lifecycle event text and reply payload text. For Telegram, the PR checkout was started locally, openai/gpt-5.5 was temporarily routed to a local OpenAI Responses-compatible endpoint at 127.0.0.1:8787/v1, and a live Telegram DM triggered a provider 500 containing SECRET_CANARY_69737.
Evidence after fix: Terminal output from the direct proof command and live Telegram gateway log excerpt:

{
  "command": "node --import tsx direct handleAgentEnd plus buildEmbeddedRunPayloads canary proof",
  "head": "5bb04e6997717541b9c0ede4f0abf7784f49f06f",
  "containsCanaryInUserFacingLifecycleText": false,
  "containsGenericFallbackInUserFacingLifecycleText": true,
  "containsCanaryInUserFacingReplyPayloadText": false,
  "containsGenericFallbackInUserFacingReplyPayloadText": true,
  "replyPayloads": [
    {
      "text": "LLM request failed.",
      "isError": true
    }
  ],
  "internalRawPreviewRetainedForMaintainers": true
}

[openai-transport] [responses] error provider=openai api=openai-responses model=gpt-5.5 status=500 code=telegram_live_proof_canary type=server_error message=500 SECRET_CANARY_69737
[telegram] embedded run agent end: isError=true model=gpt-5.5 provider=openai error=LLM request failed. rawError=500 SECRET_CANARY_69737

Observed result after fix: SECRET_CANARY_69737 is absent from both user-facing lifecycle text and user-facing reply payload text. The live Telegram transcript displays a generic failure response and does not show SECRET_CANARY_69737. Internal diagnostic preview retention still reports true for maintainers. The aborted-without-error regression is covered by the added payload regression test and preserves the no-payload behavior.
What was not tested: Slack delivery was not exercised. The direct lifecycle/payload behavior and live Telegram DM delivery path were exercised for this PR scope.
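
The shape of the direct canary proof can be sketched roughly as follows. This is an illustration under stated assumptions: `buildPayloadsSketch` is a hypothetical stand-in that models only error payloads, not the real `buildEmbeddedRunPayloads`, and the `Observation` type is invented here to represent the internal `rawErrorPreview` field.

```typescript
interface TerminalAssistant {
  stopReason: "stop" | "error" | "aborted";
  content: string[];
  errorMessage?: string;
}

interface ReplyPayload {
  text: string;
  isError: boolean;
}

// Hypothetical container for maintainer-only diagnostics.
interface Observation {
  rawErrorPreview: string | undefined;
}

const CANARY = "SECRET_CANARY_69737";
const GENERIC = "LLM request failed.";

// Stand-in payload builder (assumption): error turns produce one generic
// isError payload; every other turn produces no error payload at all.
function buildPayloadsSketch(last: TerminalAssistant): {
  payloads: ReplyPayload[];
  observation: Observation;
} {
  const observation: Observation = { rawErrorPreview: last.errorMessage };
  if (last.stopReason !== "error") return { payloads: [], observation };
  return { payloads: [{ text: GENERIC, isError: true }], observation };
}

// Feed the canary in as errorMessage and report where it ends up.
function runCanaryProof() {
  const { payloads, observation } = buildPayloadsSketch({
    stopReason: "error",
    content: [],
    errorMessage: CANARY,
  });
  const text = payloads.map((p) => p.text).join("\n");
  return {
    containsCanaryInUserFacingReplyPayloadText: text.includes(CANARY),
    containsGenericFallbackInUserFacingReplyPayloadText: text.includes(GENERIC),
    replyPayloads: payloads,
    internalRawPreviewRetainedForMaintainers:
      observation.rawErrorPreview?.includes(CANARY) ?? false,
  };
}
```

The point of the canary is that the check is a plain substring test: if the marker appears anywhere in user-facing text, the leak is unambiguous.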

Testing

OPENCLAW_VITEST_MAX_WORKERS=1 node scripts/run-vitest.mjs run src/agents/embedded-agent-helpers.formatassistanterrortext.test.ts src/agents/embedded-agent-runner/run/payloads.errors.test.ts src/agents/embedded-agent-subscribe.handlers.lifecycle.test.ts src/agents/embedded-agent-helpers/errors.test.ts

Result:

Test Files  4 passed (4)
Tests  156 passed (156)

git diff --check upstream/main...HEAD
node_modules/.bin/oxfmt --check --threads=1 --config .oxfmtrc.jsonc src/agents/embedded-agent-helpers/errors.ts src/agents/embedded-agent-helpers.ts src/agents/embedded-agent-subscribe.handlers.lifecycle.ts src/agents/embedded-agent-subscribe.handlers.lifecycle.test.ts src/agents/embedded-agent-runner/run/payloads.ts src/agents/embedded-agent-runner/run/payloads.errors.test.ts

Result:

All matched files use the correct format.

clawsweeper · 2026-05-31T12:18:25Z

Codex review: needs maintainer review before merge. Reviewed May 31, 2026, 2:46 PM ET / 18:46 UTC.

Summary
The PR adds a shared user-facing assistant-error wrapper and routes terminal assistant lifecycle errors and reply payloads through it, with regression tests for raw canary suppression and aborted-run behavior.

PR surface: Source +41, Tests +59. Total +100 across 6 files.

Reproducibility: yes, source-level. Current main can still let unclassified formatter pass-through become lifecycle error text and an isError reply payload, and the PR adds focused canary tests for that path.

Review metrics: 1 noteworthy metric.

User-facing error surfaces: 2 changed. Lifecycle error events and reply payloads now share the safe wrapper, so maintainers should explicitly accept the cross-transport wording change.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

[P2] Maintainers should explicitly accept the generic fallback policy across supported channels before merge.

Mantis proof suggestion
A maintainer-controlled Telegram Desktop proof would independently show the changed visible chat error text, even though the contributor proof is already sufficient. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

telegram desktop proof: verify a forced raw provider error canary is not shown in Telegram and the user sees only generic failure copy.

Risk before merge

[P1] Existing users and operators lose previously visible unclassified provider diagnostic text in channel replies and lifecycle errors; raw details remain in internal observation fields, but the compatibility tradeoff needs maintainer acceptance.
[P1] The change applies at shared lifecycle and reply-payload boundaries, so it changes visible terminal provider-error text across transports; Telegram proof is strong, but Slack and other channels were not exercised live.

Maintainer options:

Accept Generic Fallback (recommended)
Land this if maintainers prefer never exposing unclassified upstream diagnostics to users and are comfortable relying on internal raw previews for debugging.
Require Explicit Diagnostic Policy
Ask for a narrower allowlist or channel policy if some unclassified provider messages should remain visible to end users.
Defer To Broader Customization Work
Pause this PR if maintainers want the remaining answer to be configurable per channel or audience instead of one generic fallback.

Next step before merge

No automated repair is needed; the remaining action is maintainer judgment on the diagnostics compatibility tradeoff and final merge review.

Security
Cleared: No supply-chain or new security-boundary concern was found; the patch reduces user-facing leakage of raw provider errors while retaining internal diagnostics.

Review details

Best possible solution:

Land the boundary-level suppression if maintainers accept generic fallback copy for unclassified raw errors, while preserving classified friendly errors, current-main formatter classifications, and internal raw diagnostics.

Do we have a high-confidence way to reproduce the issue?

Yes, source-level. Current main can still let unclassified formatter pass-through become lifecycle error text and an isError reply payload, and the PR adds focused canary tests for that path.

Is this the best way to solve the issue?

Yes, if maintainers accept the generic fallback policy. A delivery-boundary wrapper is narrower than changing formatAssistantErrorText globally because it keeps internal formatter behavior and raw observation fields available.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against ecbd97e9682d.

Label changes

Label changes:

add proof: sufficient: Contributor real behavior proof is sufficient. The PR body provides after-fix direct Node/tsx output plus live Telegram screenshot/log evidence showing the raw canary is absent from user-facing text and retained internally.

Label justifications:

P2: This is a bounded user-facing security/privacy hardening bug in shared agent channel delivery with normal maintainer priority.
merge-risk: 🚨 compatibility: The PR intentionally replaces previously visible unclassified provider diagnostics with generic copy, which can affect existing support workflows.
merge-risk: 🚨 message-delivery: The diff changes the actual terminal error text delivered through lifecycle events and channel reply payloads.
rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (logs): The PR body provides after-fix direct Node/tsx output plus live Telegram screenshot/log evidence showing the raw canary is absent from user-facing text and retained internally.
proof: sufficient: Contributor real behavior proof is sufficient. The PR body provides after-fix direct Node/tsx output plus live Telegram screenshot/log evidence showing the raw canary is absent from user-facing text and retained internally.
mantis: telegram-visible-proof: Mantis should capture Telegram visible proof. The PR changes Telegram-visible chat error text, so a short Telegram Desktop proof can directly demonstrate that the raw canary is hidden.

Evidence reviewed

PR surface:

Source +41, Tests +59. Total +100 across 6 files.

Area	Files	Added	Removed	Net
Source	4	57	16	+41
Tests	2	61	2	+59
Docs	0	0	0	0
Config	0	0	0	0
Generated	0	0	0	0
Other	0	0	0	0
Total	6	118	18	+100

What I checked:

Root policy read: Read the full root policy from disk and applied the review-depth, compatibility, fallback, real-proof, and Telegram-visible proof guidance. (AGENTS.md:1, ecbd97e9682d)
Scoped agent policies read: The scoped agent guidance is mainly test-performance guidance; it supports focused helper/payload/lifecycle tests rather than broad runner bootstraps for this surface. (src/agents/AGENTS.md:1, ecbd97e9682d)
Current main still has raw fallback risk: On current main, unclassified formatter output can still return the raw error string or a 600-character raw slice after classified branches. (src/agents/embedded-agent-helpers/errors.ts:1357, ecbd97e9682d)
Current main lifecycle delivery path is user-facing: Lifecycle error text currently uses formatAssistantErrorText and falls through to lastAssistant.errorMessage before generic copy. (src/agents/embedded-agent-subscribe.handlers.lifecycle.ts:85, ecbd97e9682d)
Current main reply payload path is channel-visible: The payload builder formats terminal assistant errors and pushes the resulting text as an isError reply payload. (src/agents/embedded-agent-runner/run/payloads.ts:302, ecbd97e9682d)
PR adds a boundary wrapper: The proposed merge result adds formatUserFacingAssistantErrorText, which keeps classified formatter output but replaces exact raw pass-through with the generic assistant error text. (src/agents/embedded-agent-helpers/errors.ts:1346, 8c79ead00902)

Likely related people:

steipete: Current shallow/grafted blame for the central lifecycle, payload, and formatter lines points to the repository snapshot commit, so this is weak but useful routing context. (role: current checkout snapshot contributor; confidence: low; commits: fbc611ab4c65; files: src/agents/embedded-agent-subscribe.handlers.lifecycle.ts, src/agents/embedded-agent-runner/run/payloads.ts, src/agents/embedded-agent-helpers/errors.ts)
vignesh07: History shows the older pi lifecycle error logging path was introduced in commit 68b92e8 before the current embedded-agent path was carried forward. (role: introduced adjacent lifecycle error behavior; confidence: medium; commits: 68b92e80f72d; files: src/agents/pi-embedded-subscribe.handlers.lifecycle.ts)
vincentkoc: History ties Vincent Koc to user-facing assistant error formatting and the shared formatter seam that this PR wraps at the delivery boundary. (role: formatter seam contributor; confidence: medium; commits: 478af8170689, 397b0d85f571; files: src/agents/pi-embedded-helpers/errors.ts, src/shared/assistant-error-format.ts)
jeffrey701: Commit 01ef169 added current-main HTTP 401 provider-error sanitization in the same formatter and tests that this PR must preserve on merge. (role: recent adjacent formatter contributor; confidence: high; commits: 01ef169004bc; files: src/agents/embedded-agent-helpers/errors.ts, src/agents/embedded-agent-helpers.formatassistanterrortext.test.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

jason-allen-oneal · 2026-05-31T18:21:02Z

@clawsweeper re-review

clawsweeper · 2026-05-31T18:21:05Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

steipete · 2026-05-31T22:10:25Z

Maintainer pass complete. I pushed follow-up commits on top of this PR to keep user-facing channel delivery from exposing raw or raw-derived provider error text, including structured, escaped, and aborted-turn cases, while preserving safe schema/rate-limit guidance.

Proof on exact head b46e197f624cac968622163d9818187195c03a0d:

OPENCLAW_VITEST_MAX_WORKERS=1 node scripts/run-vitest.mjs run src/agents/embedded-agent-helpers.formatassistanterrortext.test.ts src/agents/embedded-agent-runner/run/payloads.errors.test.ts src/agents/embedded-agent-subscribe.handlers.lifecycle.test.ts src/agents/embedded-agent-helpers/errors.test.ts passed: 2 Vitest shards, 169 tests.
oxfmt --check passed on the 7 touched files.
git diff --check origin/main...HEAD passed.
autoreview --mode branch --base origin/main finished clean: no accepted/actionable findings.
GitHub exact-head checks are green: 134 success, 26 skipped, 1 neutral, 0 blockers.

Known proof gap: I did not run a new live Telegram/Slack send; the PR now has direct lifecycle/payload regression coverage plus the existing real-behavior proof path.
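
The follow-up commits mention raw-derived text in structured and escaped forms. One way such a check might look is sketched below; this is a guess at the general technique, not the merged implementation. The helper names `collectRawVariants` and `isRawDerived`, and the assumption that structured errors carry a `message` or `error.message` field, are hypothetical.

```typescript
// Collect forms in which a raw provider error might resurface (assumption):
// the trimmed raw string, a message extracted from a JSON body, and the
// JSON-string-escaped form of the raw string.
function collectRawVariants(raw: string): string[] {
  const variants = new Set<string>([raw.trim()]);
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null) {
      const obj = parsed as Record<string, unknown>;
      const nested = obj["error"];
      const candidates: unknown[] = [
        obj["message"],
        typeof nested === "object" && nested !== null
          ? (nested as Record<string, unknown>)["message"]
          : undefined,
      ];
      for (const c of candidates) {
        if (typeof c === "string") variants.add(c.trim());
      }
    }
  } catch {
    // Not JSON: only the plain and escaped variants apply.
  }
  // Escaped form, e.g. quotes rendered as \" inside another string.
  variants.add(JSON.stringify(raw).slice(1, -1));
  // Drop empty variants, since every string "includes" the empty string.
  return Array.from(variants).filter((v) => v.length > 0);
}

// True when the candidate user-facing text contains any raw variant.
function isRawDerived(candidate: string, raw: string): boolean {
  return collectRawVariants(raw).some((v) => candidate.includes(v));
}
```

A delivery boundary could use a check like this to decide when to substitute generic copy, while leaving known-safe guidance (such as rate-limit or schema-rejection text) on an explicit allow path.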

Copilot AI review requested due to automatic review settings May 31, 2026 12:16

openclaw-barnacle Bot added the agents (Agent runtime and tooling), size: S, and triage: needs-real-behavior-proof (Candidate: external PR needs after-fix proof from a real setup.) labels May 31, 2026

openclaw-barnacle Bot added the proof: supplied (External PR includes structured after-fix real behavior proof.) label and removed the triage: needs-real-behavior-proof (Candidate: external PR needs after-fix proof from a real setup.) label May 31, 2026

clawsweeper Bot had a problem deploying to qa-live-shared May 31, 2026 12:29 Failure

jason-allen-oneal force-pushed the fix/69737-suppress-raw-error-message branch from dfb619e to 3b293ab Compare May 31, 2026 12:33

openclaw-barnacle Bot removed the proof: sufficient (ClawSweeper judged the real behavior proof convincing.) label May 31, 2026

clawsweeper Bot added the proof: sufficient (ClawSweeper judged the real behavior proof convincing.) label May 31, 2026

jason-allen-oneal force-pushed the fix/69737-suppress-raw-error-message branch from 3b293ab to f4ec3b0 Compare May 31, 2026 12:41

openclaw-barnacle Bot removed the proof: sufficient (ClawSweeper judged the real behavior proof convincing.) label May 31, 2026

clawsweeper Bot added the proof: sufficient (ClawSweeper judged the real behavior proof convincing.) label May 31, 2026

jason-allen-oneal force-pushed the fix/69737-suppress-raw-error-message branch from f4ec3b0 to 3733fe9 Compare May 31, 2026 12:56

openclaw-barnacle Bot removed the proof: sufficient (ClawSweeper judged the real behavior proof convincing.) label May 31, 2026

jason-allen-oneal force-pushed the fix/69737-suppress-raw-error-message branch from 3733fe9 to ceef03f Compare May 31, 2026 13:19

openclaw-barnacle Bot added the triage: needs-real-behavior-proof (Candidate: external PR needs after-fix proof from a real setup.) label and removed the proof: sufficient (ClawSweeper judged the real behavior proof convincing.) label May 31, 2026

clawsweeper Bot added the proof: sufficient (ClawSweeper judged the real behavior proof convincing.) label May 31, 2026

steipete self-assigned this May 31, 2026

openclaw-barnacle Bot added the size: M label and removed the size: S label May 31, 2026

jason-allen-oneal and others added 8 commits May 31, 2026 23:05

fix: suppress raw provider errors in channel delivery

3ea868b

fix: preserve aborted payload behavior

c6f6470

fix: suppress structured provider error text

9d6eb22

fix: handle escaped provider error text

e38210c

fix: preserve structured rate limit guidance

c994525

fix: keep http rate limit guidance

515c6d5

fix: suppress aborted provider errors

d6e0f46

fix: keep schema rejection guidance

b46e197

steipete force-pushed the fix/69737-suppress-raw-error-message branch from d68fd4d to b46e197 Compare May 31, 2026 22:06

openclaw-barnacle Bot removed the proof: sufficient (ClawSweeper judged the real behavior proof convincing.) label May 31, 2026

steipete merged commit 01f6ad6 into openclaw:main May 31, 2026
161 checks passed

steipete mentioned this pull request May 31, 2026

Delivery layer: posts raw errorMessage verbatim when assistant message has stopReason=error #69737

Closed

clawsweeper Bot mentioned this pull request Jun 10, 2026

Feature: Graceful error handling for LLM failures — never expose raw errors to users #39612

Closed
