
Secret/credential masking corrupts image base64 with ellipsis marker (U+2026), permanently poisoning the session #90760

@devinkuhn

Summary

The secret/credential-masking layer runs its redaction pass over image base64 payloads and replaces matched substrings with the masking marker (U+2026). This corrupts the base64 irrecoverably and injects a non-ASCII character into image.source.base64. On the next turn (and every turn after), the provider mapper replays the poisoned image block from history and the Anthropic API rejects the entire request:

LLM request rejected: messages.N.content.0.image.source.base64: string argument should contain only ASCII characters

The session is then stuck for as long as the corrupted frame stays in the replay window: every subsequent user message fails with the same error because the frame keeps replaying from history. Plain-text turns recover only once the replay window scrolls past the bad frame (or the frame is removed manually).

This is distinct from #86984

#86984 / PR #88112 describe a different corruption path with the same symptom/error string:

| | #86984 | This issue |
|---|---|---|
| Cause | Replayed image data forwarded as raw latin1/binary bytes, never base64-encoded | Base64 was valid ASCII, then the secret-masker overwrote a chunk with … (U+2026) |
| Evidence | Many non-ASCII bytes throughout (unencoded binary) | Exactly 1 non-ASCII char in the whole payload: the masking ellipsis |
| Recoverable? | Yes: re-encode latin1 → base64 (ensureAsciiBase64) | No: original bytes are destroyed by the redaction; re-canonicalizing yields a still-broken image |

ensureAsciiBase64 (PR #88112) would not fix this variant: once the masker replaces base64 bytes with … (U+2026), the data is gone. Re-encoding the corrupted string produces an invalid image.

Root cause

The credential/secret-redaction layer scans message content for secret-like patterns and replaces matches with … (U+2026). It does not exclude image.source.base64 / image data fields. A long base64 blob can contain a substring that matches a secret heuristic, so the masker rewrites part of the base64 with the ellipsis marker, corrupting it.
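
A minimal reproduction sketch of the mechanism, not the actual OpenClaw masker: `SECRET_RE` (an AWS-style access key shape) and `naive_mask` are hypothetical stand-ins, since the real heuristics are not quoted in this issue. The bytes are crafted so that their base64 encoding happens to contain a match.

```python
import base64
import re

MASK = "\u2026"  # U+2026, the masking marker described in this issue

# Hypothetical secret heuristic (assumption): AWS-style access key ID shape.
SECRET_RE = re.compile(r"AKIA[0-9A-Z]{16}")


def naive_mask(text: str) -> str:
    """Replace every heuristic match with the marker, with no field awareness."""
    return SECRET_RE.sub(MASK, text)


# 6 header bytes + 15 crafted bytes + 3 trailer bytes. The 3-byte alignment
# makes the crafted bytes encode to exactly "AKIAABCDEFGHIJKLMNOP".
raw = b"\x89PNG\x00\x00" + base64.b64decode("AKIAABCDEFGHIJKLMNOP") + b"\x01\x02\x03"

encoded = base64.b64encode(raw).decode("ascii")  # valid, pure-ASCII base64
masked = naive_mask(encoded)                     # what the masker leaves behind

print(encoded.isascii(), masked.isascii(), masked.count(MASK))
```

After masking, 20 base64 characters (15 bytes of image data) have been replaced by a single non-ASCII character, which is the same one-character signature reported in the evidence below.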

Evidence (production, 2026.5.20)

Real session on our deployment:

  • One image block (IMG_6842.png, ~80,609-char base64) ended up with exactly one non-ASCII character: \u2026 (U+2026, the masking marker).
  • Located at messages.428.content.0.image.source.base64.
  • Every turn after the image was sent failed with messages.428.content.0.image.source.base64: string argument should contain only ASCII characters until we manually stripped the image block from the session JSONL.
  • The single-char-in-80K signature is the giveaway: this is not unencoded binary (which would be non-ASCII throughout) — it's valid base64 with one masking marker spliced in.
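
A rough triage sketch for telling the two corruption paths apart from the payload alone. The function names and the category labels are made up for illustration, and the classification is a heuristic based only on the signature described above.

```python
from typing import List

MASK = "\u2026"  # U+2026, the masking marker named in this issue


def non_ascii_offsets(payload: str) -> List[int]:
    """Return the index of every non-ASCII character in the payload."""
    return [i for i, ch in enumerate(payload) if ord(ch) > 127]


def classify_base64(payload: str) -> str:
    """Heuristic triage of an image base64 string.

    "clean"           -> no non-ASCII characters at all
    "masked"          -> every non-ASCII character is the masking marker
    "other-non-ascii" -> anything else (e.g. unencoded binary, as in #86984)
    """
    bad = [ch for ch in payload if ord(ch) > 127]
    if not bad:
        return "clean"
    if all(ch == MASK for ch in bad):
        return "masked"
    return "other-non-ascii"
```

On a payload like the one in this report, the expectation would be `"masked"` with a single offset; a payload from the #86984 path should land in `"other-non-ascii"` with many offsets.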

Expected behavior

The secret/credential-masking pass should skip image base64 payloads entirely (image.source.base64, image block data fields, and any data-URL base64). Redacting inside an opaque base64 blob can never reveal a real secret to a human reader anyway, and it corrupts the payload.

Suggested fix

  • Exclude image/base64/data-URL fields from the secret-redaction scan, OR
  • If a field is detected as base64/data-URL, never substitute the masking marker mid-string (skip the field).
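
One possible shape for the first option, as a sketch only. The message layout (blocks with `type == "image"` and a nested `source`), the `redact` function, and `SECRET_RE` are assumptions for illustration; the real redaction layer and its field names may differ.

```python
import re
from typing import Any

MASK = "\u2026"  # U+2026 masking marker

# Hypothetical secret heuristic (assumption), standing in for the real rules.
SECRET_RE = re.compile(r"AKIA[0-9A-Z]{16}")

# Strings that look like base64 data URLs are treated as opaque.
DATA_URL_RE = re.compile(r"^data:[\w.+-]+/[\w.+-]+;base64,", re.ASCII)


def redact(value: Any) -> Any:
    """Recursively mask secrets in strings, skipping image payloads."""
    if isinstance(value, str):
        if DATA_URL_RE.match(value):
            return value  # never substitute inside a base64 data URL
        return SECRET_RE.sub(MASK, value)
    if isinstance(value, list):
        return [redact(item) for item in value]
    if isinstance(value, dict):
        if value.get("type") == "image":
            return value  # image block: leave the whole block untouched
        return {key: redact(item) for key, item in value.items()}
    return value  # numbers, booleans, None
```

Skipping the whole image block is the bluntest choice: it also leaves any non-payload strings inside that block unredacted. A stricter variant could redact sibling fields and skip only the payload field, at the cost of having to know its exact name.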

Workaround

Strip the poisoned image block from the session JSONL (the base64 data is unrecoverable). Replacing it with a text placeholder restores the session.
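
A sketch of that cleanup, assuming each JSONL line is one JSON object and that image blocks are dicts with `type == "image"` and a `source` holding the base64 string. The session file schema is not documented in this issue, so treat those field names, the helper names, and the placeholder text as assumptions, and back up the file first.

```python
import json
from typing import Any, Iterable, List

# Hypothetical placeholder block (assumption); adjust to the session schema.
PLACEHOLDER = {"type": "text", "text": "[image removed: corrupted base64]"}


def has_non_ascii(value: Any) -> bool:
    """True if any string nested inside value contains a non-ASCII character."""
    if isinstance(value, str):
        return not value.isascii()
    if isinstance(value, dict):
        return any(has_non_ascii(v) for v in value.values())
    if isinstance(value, list):
        return any(has_non_ascii(v) for v in value)
    return False


def scrub(value: Any) -> Any:
    """Replace image blocks whose source holds non-ASCII text with a placeholder."""
    if isinstance(value, dict):
        if value.get("type") == "image" and has_non_ascii(value.get("source")):
            return dict(PLACEHOLDER)
        return {k: scrub(v) for k, v in value.items()}
    if isinstance(value, list):
        return [scrub(v) for v in value]
    return value


def scrub_jsonl(lines: Iterable[str]) -> List[str]:
    """Scrub every non-blank JSONL line and return the re-serialized lines."""
    out = []
    for line in lines:
        if line.strip():
            out.append(json.dumps(scrub(json.loads(line)), ensure_ascii=False))
    return out
```

Usage would be along the lines of reading the session file with `open(path, encoding="utf-8")`, passing the lines to `scrub_jsonl`, and writing the result to a new file. Healthy image blocks pass through unchanged; only blocks with non-ASCII text under `source` are replaced.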

Environment

  • OpenClaw 2026.5.20 (e510042)
  • Provider: Anthropic (claude family) via a custom anthropic-messages provider (Bifrost)
  • Channel: Slack
  • Also expected to reproduce on the OpenAI/Responses path (same data-URL base64 validation)

Cross-ref: #86984 (same error, different corruption path), PR #88112.

Labels

  • P1 - High-priority user-facing bug, regression, or broken workflow.
  • clawsweeper:fix-shape-clear - ClawSweeper found a clear likely implementation shape for this issue.
  • clawsweeper:needs-security-review - ClawSweeper marked this issue as needing security-sensitive review.
  • clawsweeper:no-new-fix-pr - ClawSweeper does not recommend queueing a new automated fix PR for this issue.
  • clawsweeper:queueable-fix - ClawSweeper marked this issue as an existing queue_fix_pr work candidate.
  • clawsweeper:source-repro - ClawSweeper found a high-confidence source-level issue reproduction.
  • impact:message-loss - Channel message delivery can be lost, duplicated, or misrouted.
  • impact:security - Security boundary, credential, authz, sandbox, or sensitive-data risk.
  • impact:session-state - Session, memory, transcript, context, or agent state can drift or corrupt.
  • issue-rating: 🦞 diamond lobster - Very strong issue quality with high-confidence source-level or clear reproduction.
