Delivery layer: posts raw errorMessage verbatim when assistant message has stopReason=error #69737

@alexisperumal

Summary

OpenClaw's chat delivery layer posts the raw upstream API errorMessage verbatim to the user-facing channel when an assistant message has stopReason=error and the provider has populated errorMessage with raw upstream error text (e.g., an OpenAI 500 with a request ID and a help.openai.com URL). This is verified on Slack; Telegram is suspected to share the same code path but is untested.

Update (2026-04-29): This issue has been scoped down to Bug 1 only. The originally bundled multi-text concatenation behavior has been split into #74674 per @martingarramon's recommendation (different file, different blast radius, different reviewer pool). The historical text below is preserved for trace continuity; the current scope is Bug 1.

Environment

  • OpenClaw: 2026.4.14 (originally observed); upstream/main confirmed by community trace at cfda375bb6 (2026-04-22) and 7ddd815e469e (2026-04-28)
  • Provider: openai-codex (using api: openai-codex-responses)
  • Model: gpt-5.4
  • Delivery channel verified: Slack (same code path likely for Telegram; untested)

Evidence

A scheduled cron fire hit an OpenAI 500. The assistant message written to the session store had:

stopReason: error
content: []
errorMessage: "The server had an error processing your request. Sorry about that!
               You can retry your request, or contact us through our help center at
               help.openai.com if you keep seeing this error. (Please include the
               request ID <uuid> in your email.)"

The delivery layer posted the errorMessage text verbatim to the end user's Slack DM.

Source trace (per community trace, 2026-04-22)

  • Delivery-side fallback chain at src/agents/pi-embedded-subscribe.handlers.lifecycle.ts:76 (line :65 per Codex re-check on 7ddd815e469e):

    const errorText = (friendlyError || lastAssistant.errorMessage || "LLM request failed.").trim();

    Three levels: (a) formatAssistantErrorText(lastAssistant, …) — the sanitization hook at src/agents/pi-embedded-helpers/errors.ts:925 (line :1142 per re-check), (b) raw lastAssistant.errorMessage, (c) "LLM request failed.".

  • formatAssistantErrorText only sanitizes a specific list of error patterns — unknown tool, disk space, auth_refresh, refresh_contention, refresh_timeout. A generic OpenAI 5xx with a request ID and help.openai.com URL matches none of those, so it falls through to position (b) and the raw upstream text reaches the user-facing channel.

  • Payload builder at src/agents/pi-embedded-runner/run/payloads.ts:210 pushes errorText with isError: true for terminal assistant errors, so this is on the user-facing delivery path.
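The fall-through described above can be reproduced in isolation. This is a minimal sketch, not the OpenClaw source: AssistantMessage, formatKnownError, and resolveErrorText are hypothetical stand-ins, and the pattern list is an illustrative subset. Only the final fallback expression mirrors the line quoted from lifecycle.ts.

```typescript
// Hypothetical, simplified shape of the stored assistant message.
interface AssistantMessage {
  stopReason: "error" | "stop";
  content: string[];
  errorMessage?: string;
}

// Stand-in for formatAssistantErrorText: returns friendly text only for
// patterns it recognizes (illustrative subset, not the real list).
function formatKnownError(msg: AssistantMessage): string | undefined {
  const raw = msg.errorMessage ?? "";
  if (/unknown tool/i.test(raw)) {
    return "The assistant tried to use a tool that is not available.";
  }
  if (/no space left on device/i.test(raw)) {
    return "The server is out of disk space.";
  }
  return undefined; // unclassified: the caller falls through
}

// Mirrors the three-level chain: (a) friendly, (b) raw, (c) generic.
function resolveErrorText(lastAssistant: AssistantMessage): string {
  const friendlyError = formatKnownError(lastAssistant);
  return (friendlyError || lastAssistant.errorMessage || "LLM request failed.").trim();
}

const openAi500: AssistantMessage = {
  stopReason: "error",
  content: [],
  errorMessage:
    "The server had an error processing your request. Sorry about that! " +
    "You can retry your request, or contact us through our help center at " +
    "help.openai.com if you keep seeing this error. " +
    "(Please include the request ID <uuid> in your email.)",
};

// No pattern matches, so position (b) wins and the raw text is returned.
const delivered = resolveErrorText(openAi500);
```

In this sketch, delivered still contains the help.openai.com reference and the request ID placeholder, which is the behavior reported in Evidence.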

Expected behavior

When stopReason=error and the provider's error doesn't match a known friendly classification, the delivery layer should do one of the following:

  • Suppress the message entirely (admin-log server-side only), OR
  • Substitute a generic user-facing fallback (e.g., "Temporary issue, will retry shortly.")

In either case, it should never post the raw errorMessage verbatim to end users.
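One way to picture the two acceptable outcomes is a small decision function that keeps the raw text on the server side and hands the channel either nothing or a generic string. This is a sketch under assumptions: UnclassifiedErrorPolicy, DeliveryDecision, decideDelivery, and GENERIC_FALLBACK are hypothetical names, not existing OpenClaw settings or APIs.

```typescript
// Hypothetical policy knob; not an existing OpenClaw setting.
type UnclassifiedErrorPolicy = "suppress" | "generic";

interface DeliveryDecision {
  // undefined means: post nothing to the user-facing channel.
  userText: string | undefined;
  // Raw provider text, intended for the server-side admin log only.
  adminLog: string;
}

const GENERIC_FALLBACK = "Temporary issue, will retry shortly.";

function decideDelivery(
  friendlyError: string | undefined,
  rawErrorMessage: string | undefined,
  policy: UnclassifiedErrorPolicy,
): DeliveryDecision {
  const adminLog = rawErrorMessage ?? "(no errorMessage provided)";

  // Known classifications keep their existing friendly text.
  if (friendlyError) {
    return { userText: friendlyError, adminLog };
  }

  // Unclassified: the raw text never becomes userText.
  const userText = policy === "suppress" ? undefined : GENERIC_FALLBACK;
  return { userText, adminLog };
}
```

The point of the shape is that the raw errorMessage has exactly one destination (adminLog), so no branch can route it to the channel by accident.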

Proposed fix — minimal-blast-radius

Per @martingarramon's recommendation in the trace comment:

Option A (smallest): Remove position (b) from the fallback chain — let unmatched errors land on "LLM request failed.". Preserves the existing structured/HTTP-shaped/billing/timeout/schema/transport friendly cases handled earlier in formatAssistantErrorText.

Option B: Keep position (b) but route it through buildTextObservationFields(...).textPreview (already used for the admin log at lifecycle.ts:80-83) so at least URL/PII scrubbing applies before user delivery.
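The two options can be compared side by side. This is a hedged sketch: errorTextOptionA, errorTextOptionB, and scrubPreview are hypothetical names, and scrubPreview only stands in for buildTextObservationFields(...).textPreview, whose real signature and scrubbing rules are not shown in this issue.

```typescript
const GENERIC = "LLM request failed.";

// Option A: drop position (b); unmatched errors land on the generic string.
function errorTextOptionA(friendlyError: string | undefined): string {
  return (friendlyError || GENERIC).trim();
}

// Hypothetical scrubber (assumption): masks hostnames/URLs and UUID-shaped
// IDs, then truncates. The real helper may behave differently.
function scrubPreview(text: string, maxLength = 120): string {
  const scrubbed = text
    .replace(/\b(?:https?:\/\/)?(?:[a-z0-9-]+\.)+[a-z]{2,}\b\S*/gi, "[link]")
    .replace(/\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b/gi, "[id]");
  return scrubbed.length > maxLength ? scrubbed.slice(0, maxLength) + "..." : scrubbed;
}

// Option B: keep position (b), but only after scrubbing.
function errorTextOptionB(
  friendlyError: string | undefined,
  rawErrorMessage: string | undefined,
): string {
  const scrubbedRaw = rawErrorMessage ? scrubPreview(rawErrorMessage) : "";
  return (friendlyError || scrubbedRaw || GENERIC).trim();
}
```

Option A changes one expression and cannot leak by construction. Option B still forwards provider-authored wording, so its safety depends entirely on what the scrubber recognizes.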

Test caveat

Per Codex re-check (2026-04-28): src/agents/pi-embedded-subscribe.handlers.lifecycle.test.ts:145 currently asserts the sanitized-raw behavior on a secret-like unclassified error (expects x-api-key: *** in the emitted lifecycle error text). Any fix that replaces raw error text with a generic fallback or suppression needs that test updated alongside.

Acceptance

  • When OpenAI returns an unclassified API error (request ID + help.openai.com URL), the user-facing channel never sees the raw errorMessage text.
  • Existing friendly-classification cases (structured JSON, HTTP-shaped, billing, timeout, schema, transport) continue to be sanitized as today.
  • Lifecycle regression test updated to reflect the new fallback behavior.
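The first acceptance criterion could be checked mechanically against whatever the payload builder emits. The following is a sketch only: RAW_MARKERS, leaksUpstreamError, findLeakingPayloads, and the text field on DeliveredPayload are assumptions for illustration; only the isError flag is taken from the trace above.

```typescript
// Hypothetical markers that indicate raw upstream text reached the output.
const RAW_MARKERS: RegExp[] = [
  /help\.openai\.com/i,
  /request ID/i,
  /[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/i,
];

function leaksUpstreamError(deliveredText: string): boolean {
  return RAW_MARKERS.some((marker) => marker.test(deliveredText));
}

// Assumed payload shape: `text` is a hypothetical field name.
interface DeliveredPayload {
  text: string;
  isError: boolean;
}

// Returns the error payloads that would expose raw upstream text to users.
function findLeakingPayloads(payloads: DeliveredPayload[]): DeliveredPayload[] {
  return payloads.filter((p) => p.isError && leaksUpstreamError(p.text));
}
```

A regression test could feed the OpenAI 500 fixture from Evidence through the delivery path and assert that findLeakingPayloads returns an empty array.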

Retry-with-backoff

Retry with backoff is a separate, larger feature and is not requested here; this issue covers the delivery-layer fallback only.

Offer

Happy to test a PR against our production deployment and provide additional evidence (scrubbed of user data) if useful.

Related

  • Bug 2 (multi-text concatenation) split out to #74674.

Metadata

    Labels

    • P2: Normal backlog priority with limited blast radius.
    • clawsweeper:needs-security-review: ClawSweeper marked this issue as needing security-sensitive review.
    • clawsweeper:no-new-fix-pr: ClawSweeper does not recommend queueing a new automated fix PR for this issue.
    • clawsweeper:source-repro: ClawSweeper found a high-confidence source-level issue reproduction.
    • impact:security: Security boundary, credential, authz, sandbox, or sensitive-data risk.
    • issue-rating: 🦞 diamond lobster: Very strong issue quality with high-confidence source-level or clear reproduction.
