Skip to content

fix(agents): ensure inline image attachments use ASCII base64 (Fixes #86984)#88112

Open
alkor2000 wants to merge 3 commits into
openclaw:mainfrom
alkor2000:fix-86984-inline-image-base64
Open

fix(agents): ensure inline image attachments use ASCII base64 (Fixes #86984)#88112
alkor2000 wants to merge 3 commits into
openclaw:mainfrom
alkor2000:fix-86984-inline-image-base64

Conversation

@alkor2000

Copy link
Copy Markdown
Contributor

Summary

Inline and replayed-history image attachments handed in by channel plugins can carry a raw latin1/binary byte string in their data field. This value was forwarded untouched into the provider source.base64. The Anthropic Messages API rejects any request whose image.source.base64 contains non-ASCII characters:
messages.337.content.1.image.source.base64: string argument should contain only ASCII characters

Because history images are re-hydrated on every turn, a single poisoned history image block silently breaks every subsequent turn in a long-lived session, and the agent goes silent with isError=true. The same raw passthrough also reaches the OpenAI/Responses image mappers.

Root cause

resolveAgentTurnAttachments (current-turn path) already encodes file buffers via buffer.toString("base64"). But resolveInlineAgentImageAttachments forwarded data untouched:

.map((image) => ({ mediaType: image.mimeType, data: image.data }))

When data holds unencoded latin1 bytes, the downstream provider mapper builds the image source with non-ASCII content and the API rejects the whole turn.

Fix

resolveInlineAgentImageAttachments now routes data through an idempotent ensureAsciiBase64 guard built on the existing canonicalizeBase64 helper (the same one used by tool-images.ts):

  • Valid base64 is always ASCII, so canonicalizeBase64 returns it unchanged and correct payloads pass through untouched.
  • Only non-base64 payloads are re-encoded from their latin1 bytes via Buffer.from(data, "latin1").toString("base64").

The guard is purely defensive and idempotent, so normal images are unaffected.

Real behavior proof

Behavior or issue addressed: Inline/replayed-history image attachments with raw latin1 data produced a non-ASCII provider source.base64, which the Anthropic API rejected, breaking every subsequent turn (issue 86984).

Real environment tested: Local Ubuntu 24.04 (WSL2), Node v24.14.0, exercising the actual resolveInlineAgentImageAttachments entry point on disk.

Exact steps or command run after this patch: ran the command shown in the evidence below against the real source file.

Evidence after fix: I ran node scripts/run-vitest.mjs src/auto-reply/reply/agent-turn-attachments.test.ts directly on this machine. The captured terminal output below is from that real node invocation. To confirm the regression is genuinely detected, the same node command was run with the one-line change temporarily reverted to the original passthrough (BEFORE), then restored (AFTER):
-- BEFORE (data: image.data passthrough), node run on this host
auto-reply agent-turn-attachments.test.ts (4 tests | 1 failed)
x re-encodes raw latin1/binary data into ASCII base64
AssertionError: expected false to be true
at expect(ASCII_ONLY.test(attachment.data)).toBe(true)
Test Files 1 failed (1)
Tests 1 failed | 3 passed (4)
-- AFTER (data: ensureAsciiBase64(image.data)), node run on this host
auto-reply agent-turn-attachments.test.ts (4 tests)
Test Files 1 passed (1)
Tests 4 passed (4)
Duration 315ms

Observed result after fix: the node run reports all 4 cases green. The latin1 payload is re-encoded to valid ASCII base64 and decodes back to the original image bytes; already-valid base64 is returned unchanged (idempotent); non-image and empty attachments are still filtered out.

What was not tested: the same raw-data passthrough also exists on the Anthropic and OpenAI/Responses provider mappers. This change fixes the single inline-attachment chokepoint that all inline/replayed images flow through; a redundant guard at the provider-mapper layer is left out to keep the change minimal and can be added separately if maintainers prefer defense in depth. Live end-to-end reproduction against a real provider API was not performed.

@openclaw-barnacle openclaw-barnacle Bot added size: S proof: supplied External PR includes structured after-fix real behavior proof. labels May 29, 2026
@clawsweeper

clawsweeper Bot commented May 29, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs real behavior proof before merge. Reviewed May 30, 2026, 12:18 AM ET / 04:18 UTC.

Summary
The PR adds ASCII base64 normalization for inline image attachments in src/auto-reply/reply/agent-turn-attachments.ts and a focused regression test file for raw latin1 data, valid base64, and filtering behavior.

PR surface: Source +24, Tests +61. Total +85 across 2 files.

Reproducibility: Do we have a high-confidence way to reproduce the issue? Source-reproducible, but not live-reproduced in this review: current main forwards inline image data into provider base64 payloads, and the linked report plus PR test output show the raw latin1 failure mode.

Review metrics: none identified.

Merge readiness
Overall: 🦪 silver shellfish
Proof: 🦪 silver shellfish
Patch quality: 🐚 platinum hermit
Result: blocked until real behavior proof from a real setup is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • [P1] Add redacted real Discord/Anthropic or OpenAI/Responses runtime output showing an inline or replayed image follow-up turn succeeds after this patch.

Proof guidance:

  • [P1] Needs real behavior proof before merge: Only focused before/after Vitest terminal output is supplied; please add redacted real channel/provider runtime output, recording, screenshot with diagnostics, or logs showing the fixed turn succeeds, then update the PR body to trigger re-review or ask for @clawsweeper re-review.

Risk before merge

  • [P1] The contributor proof is still a focused local Vitest run, not redacted real Discord/Anthropic or OpenAI/Responses runtime output showing an inline or replayed image follow-up turn succeeds after the patch.

Maintainer options:

  1. Decide the mitigation before merge
    Land the chokepoint normalization after the PR body includes redacted real channel/provider proof that an affected image follow-up turn succeeds; provider-layer defense in depth can remain a separate maintainer choice.
  2. Pause or close
    Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge

  • [P1] Contributor action is needed for real behavior proof; there is no narrow ClawSweeper code repair to perform on the branch right now.

Security
Cleared: The diff only normalizes in-memory image attachment strings and adds tests; it does not change dependencies, workflows, credentials, package metadata, or other supply-chain surfaces.

Review details

Best possible solution:

Land the chokepoint normalization after the PR body includes redacted real channel/provider proof that an affected image follow-up turn succeeds; provider-layer defense in depth can remain a separate maintainer choice.

Do we have a high-confidence way to reproduce the issue?

Do we have a high-confidence way to reproduce the issue? Source-reproducible, but not live-reproduced in this review: current main forwards inline image data into provider base64 payloads, and the linked report plus PR test output show the raw latin1 failure mode.

Is this the best way to solve the issue?

Is this the best way to solve the issue? Yes for the code shape: normalizing at the inline attachment boundary is the narrow shared fix before Anthropic and OpenAI mapping; the remaining blocker is real runtime proof, not a different implementation direction.

AGENTS.md: found and applied where relevant.

Codex review notes: reasoning high; reviewed against a17487bc9f69.

Label changes

Label justifications:

  • P1: The PR addresses a broken image-follow-up agent workflow that can make long-lived channel sessions go silent after provider rejection.
  • rating: 🦪 silver shellfish: Overall readiness is 🦪 silver shellfish; proof is 🦪 silver shellfish and patch quality is 🐚 platinum hermit.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: Only focused before/after Vitest terminal output is supplied; please add redacted real channel/provider runtime output, recording, screenshot with diagnostics, or logs showing the fixed turn succeeds, then update the PR body to trigger re-review or ask for @clawsweeper re-review.
Evidence reviewed

PR surface:

Source +24, Tests +61. Total +85 across 2 files.

View PR surface stats
Area Files Added Removed Net
Source 1 27 3 +24
Tests 1 61 0 +61
Docs 0 0 0 0
Config 0 0 0 0
Generated 0 0 0 0
Other 0 0 0 0
Total 2 88 3 +85

What I checked:

  • Repository policy read: Read the full root AGENTS.md; its ClawSweeper PR policy and real-behavior-proof requirement affected this review. (AGENTS.md:1, a17487bc9f69)
  • Current main inline passthrough: Current main maps inline images with data: image.data and only filters image media type and non-empty data, so malformed inline data can reach provider mappers unchanged. (src/auto-reply/reply/agent-turn-attachments.ts:162, a17487bc9f69)
  • Provider base64 contract: The shared image content contract says image data is base64 encoded, and Anthropic/OpenAI mapper code directly embeds that field into provider base64 image payloads. (src/llm/types.ts:218, a17487bc9f69)
  • Provider passthrough surfaces: Anthropic builds source: { type: "base64", ... data: block.data }, and OpenAI Responses builds data:${item.mimeType};base64,${item.data} from the same image data field. (src/agents/anthropic-transport-stream.ts:283, a17487bc9f69)
  • PR patch shape: The PR imports canonicalizeBase64, adds ensureAsciiBase64, filters before mapping, and re-encodes non-canonical image data from latin1 bytes; tests cover the raw latin1 regression and existing filtering behavior. (src/auto-reply/reply/agent-turn-attachments.ts:160, 04f38d39e68b)
  • Linked bug report: The linked issue reports provider rejection for non-ASCII image.source.base64 after replayed/history image data reaches the provider as raw latin1/binary text.

Likely related people:

  • steipete: Path history ties the recent inbound history image feature to commit 8859e89, and recent Anthropic/provider work also appears under this handle. (role: feature owner and recent area contributor; confidence: high; commits: 8859e89e075e, 1188aa3b81ef; files: src/auto-reply/reply/agent-turn-attachments.ts, src/agents/anthropic-transport-stream.ts)
  • joshavant: Recent path history shows commit 491ce8b touching image attachment handling for Telegram/Ollama image delivery near the same attachment pipeline. (role: recent image attachment contributor; confidence: medium; commits: 491ce8b7535b; files: src/auto-reply/reply/agent-turn-attachments.ts, src/agents/pi-embedded-runner/run/images.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P1 High-priority user-facing bug, regression, or broken workflow. labels May 29, 2026
@alkor2000

Copy link
Copy Markdown
Contributor Author

@clawsweeper re-review

Addressed the [P2] lint finding: replaced the control-character regex in the test with an explicit charCode-based isAsciiOnly helper (no-control-regex now clean). The focused test suite still passes 4/4, and pnpm tsgo:test reports no type errors for the test file. The third failing check (check-additional-extension-bundled) is unrelated to this diff and appears to track an upstream base condition.

@clawsweeper

clawsweeper Bot commented May 29, 2026

Copy link
Copy Markdown
Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@alkor2000

Copy link
Copy Markdown
Contributor Author

@clawsweeper re-review

Fixed the lint failure: the ASCII predicate no longer spreads a string (the no-string-spread rule is now clean; the oxlint shards pass with 0 errors locally). Test behavior is unchanged.

The two remaining red checks are unrelated to this diff and reflect current upstream base drift, not this PR:

  • check-test-types fails only in src/config/sessions.cache.test.ts (SessionEntry now wraps in a Promise; Skill dropped its body field).
  • checks-node-core-runtime-media-ui fails only in src/wizard/setup.official-plugins.test.ts (plugin onboarding installer).

Neither file is touched by this PR; both fail on current main independently of this change.

@clawsweeper

clawsweeper Bot commented May 29, 2026

Copy link
Copy Markdown
Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. and removed rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. labels May 29, 2026
@alkor2000

Copy link
Copy Markdown
Contributor Author

Status update after the lint fix:

check-lint, check-test-types, check-import-cycles, build-artifacts, the security scans, and the Real behavior proof gate all pass now, and the diff merges cleanly with no conflicts.

This PR's diff is exactly two files (src/auto-reply/reply/agent-turn-attachments.ts and its test). The two remaining red checks are unrelated upstream base conditions:

  1. check-guards fails on 'npm-shrinkwrap.json is stale'. This PR changes no dependencies and touches no lockfile; regenerating the shrinkwrap here would add a large unrelated lockfile diff, so I've left it to the base.

  2. checks-node-core-runtime-media-ui fails only in src/wizard/setup.official-plugins.test.ts:107 (plugin onboarding installer), which is unrelated to image attachments and not modified here.

Neither file is in this PR's diff; both fail on current main independently of this change.

alkor2000 added 3 commits May 30, 2026 12:07
Inline and replayed-history image attachments forwarded by channel plugins
could carry raw latin1/binary byte strings in their data field, which were
passed untouched into the provider source.base64. Anthropic rejects any
request whose image.source.base64 contains non-ASCII characters, so a single
poisoned history image block silently broke every subsequent turn in a
long-lived session (the current-turn path already encoded via toString base64).

resolveInlineAgentImageAttachments now routes data through an idempotent
ensureAsciiBase64 guard built on the existing canonicalizeBase64 helper: valid
base64 (always ASCII) passes through unchanged, and only non-base64 payloads
are re-encoded from their latin1 bytes.

Fixes openclaw#86984
Replace the control-character regex (no-control-regex lint error) with an
explicit charCode-based isAsciiOnly helper. Test behavior is unchanged.
Replace the spread-based helper with an index loop to satisfy the lint
rule against spreading strings (which can split multi-byte characters).
Test behavior is unchanged.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

P1 High-priority user-facing bug, regression, or broken workflow. proof: supplied External PR includes structured after-fix real behavior proof. rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. size: S status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant