Skip to content

Fix transcript image redaction#91529

Merged
joshavant merged 3 commits into
mainfrom
fix/90760-transcript-image-redaction
Jun 9, 2026
Merged

Fix transcript image redaction#91529
joshavant merged 3 commits into
mainfrom
fix/90760-transcript-image-redaction

Conversation

@joshavant

@joshavant joshavant commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes #90760.

  • preserve validated opaque image payloads during transcript redaction so default secret patterns cannot rewrite base64 bytes with a Unicode ellipsis
  • keep redaction active for adjacent text and fake image-shaped secret payloads
  • repair already-poisoned session image blocks by replacing unrecoverable corrupted image payloads with replayable fallback text
  • keep transcript storage and provider transport boundaries separate: storage can preserve validated image formats such as BMP/HEIC/HEIF, while transport sanitizer remains limited to provider-safe browser image data URLs

Security Boundary

This intentionally treats validated image bytes as opaque media data, not text to run through secret-pattern substitution. The exemption is bounded by image field shape, strict base64 validation, bounded prefix decoding, and magic-byte/signature sniffing. Adjacent text still goes through normal redaction, and fake image-shaped payloads containing secret-looking strings are still redacted or repaired.

The PR avoids a MIME-only bypass: image claims are accepted only when bytes match a supported image signature. It also avoids widening provider transport behavior by using a separate storage sanitizer from the existing transport sanitizer.

Verification

  • .agents/skills/autoreview/scripts/autoreview --mode local
    • clean: no accepted/actionable findings
  • AWS Crabbox fix/regression proof after the clean autoreview: provider aws, run run_2da949704798, lease cbx_f1c73cdaa4d8, machine c7a.8xlarge, exit 0
    • node scripts/run-vitest.mjs packages/media-core/src/inline-image-data-url.test.ts src/agents/transcript-redact.test.ts src/config/sessions/transcript-append-redact.test.ts src/agents/session-file-repair.test.ts src/agents/responses-image-payload-sanitizer.test.ts
    • passed 3 Vitest shards: 85 tests
    • explicit proof script confirmed valid image payload preservation, adjacent secret redaction, storage-vs-transport data URL boundary, and corrupted replay repair
  • Earlier AWS Crabbox live behavior proof: provider aws, run run_46fd11090674, lease cbx_a2733bb606fb, exit 0
    • redaction-trigger image persisted with persistedIncludesEllipsis:false, nonAsciiCount:0
    • valid image replay through live Anthropic transport returned stopReason:"stop"
    • corrupted transcript repair removed 1 image block
    • repaired transcript replay through live Anthropic transport returned stopReason:"stop"

@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: M maintainer Maintainer-authored PR labels Jun 9, 2026
@clawsweeper

clawsweeper Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs maintainer review before merge. Reviewed June 8, 2026, 10:08 PM ET / 02:08 UTC.

Summary
The branch updates media-core image validation, transcript redaction, session-file repair, and regression tests so validated image payloads survive redaction and already-corrupted image blocks can be replaced for replay.

PR surface: Source +339, Tests +420. Total +759 across 8 files.

Reproducibility: yes. source-level reproduction is high confidence: current main recursively redacts all transcript strings and the default AKID-style pattern can match normal image base64 bytes. I did not run tests because this review was required to keep the checkout read-only.

Review metrics: none identified.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • [P2] Get explicit maintainer/security acceptance of the validated-image redaction boundary before merge.

Risk before merge

  • [P1] Merging this PR intentionally treats validated image bytes and image data URLs as opaque transcript data that bypasses text secret-pattern substitution, so maintainers need to accept media validation as the storage redaction boundary.

Maintainer options:

  1. Approve the validation boundary (recommended)
    A maintainer/security reviewer can accept validated image bytes as opaque transcript data because adjacent text, fake payloads, transport filtering, and repair behavior are covered.
  2. Tighten before merge
    If the boundary is too broad, require narrower shape constraints or additional validation tests before allowing image payloads to bypass text substitution.
  3. Pause for image-safe scanning
    If opaque media must still be inspected for embedded secrets, pause this branch and design a non-corrupting scanner instead of substituting inside base64 bytes.

Next step before merge

  • [P2] The protected maintainer label and security-boundary change require human maintainer/security approval; there is no narrow automated repair to queue.

Security
Needs attention: No supply-chain issue was found, but the redaction-boundary change needs maintainer/security acceptance before merge.

Review details

Best possible solution:

Land this focused fix after a maintainer/security reviewer explicitly accepts the validated opaque-image redaction boundary and the storage-vs-transport split.

Do we have a high-confidence way to reproduce the issue?

Yes, source-level reproduction is high confidence: current main recursively redacts all transcript strings and the default AKID-style pattern can match normal image base64 bytes. I did not run tests because this review was required to keep the checkout read-only.

Is this the best way to solve the issue?

Yes, with maintainer/security approval. Validating opaque image payloads in media-core before skipping text substitution is narrower than a MIME-only skip, and repair-only would not prevent future poisoned transcripts.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 5097749de32c.

Label changes

Label justifications:

  • P1: The linked issue describes a real session-poisoning failure where subsequent agent turns can fail until the corrupted image block is removed.
  • merge-risk: 🚨 security-boundary: The PR deliberately changes which transcript fields are exempt from text secret redaction and relies on validated image payload detection as the boundary.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
  • status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action. Sufficient (live_output): The PR body includes after-fix AWS Crabbox live output showing valid image replay and repaired corrupted-transcript replay through live Anthropic transport, plus a focused regression run.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-fix AWS Crabbox live output showing valid image replay and repaired corrupted-transcript replay through live Anthropic transport, plus a focused regression run.
Evidence reviewed

PR surface:

Source +339, Tests +420. Total +759 across 8 files.

View PR surface stats
Area Files Added Removed Net
Source 3 357 18 +339
Tests 5 424 4 +420
Docs 0 0 0 0
Config 0 0 0 0
Generated 0 0 0 0
Other 0 0 0 0
Total 8 781 22 +759

Security concerns:

  • [medium] Approve the opaque image redaction boundary — src/agents/transcript-redact.ts:253
    Validated image payloads and image data URLs now bypass text secret replacement, which appears necessary to avoid corrupting media bytes but should be an explicit security decision.
    Confidence: 0.84

Acceptance criteria:

  • [P1] node scripts/run-vitest.mjs packages/media-core/src/inline-image-data-url.test.ts src/agents/transcript-redact.test.ts src/config/sessions/transcript-append-redact.test.ts src/agents/session-file-repair.test.ts src/agents/responses-image-payload-sanitizer.test.ts.

What I checked:

  • Root policy applied: The full root policy was read and applied; it treats session state, message delivery, and security-sensitive redaction boundaries as maintainer-review surfaces. (AGENTS.md:1, 5097749de32c)
  • Scoped agents policy checked: The scoped agents guide was read; it did not add a contradictory rule for this PR, but it reinforced reading the runtime path rather than diff-only review. (src/agents/AGENTS.md:1, 5097749de32c)
  • Current main source-repro path: Current main recursively redacts every string field in transcript payloads, so image data fields can be sent through the same string redactor before transcript persistence. (src/agents/transcript-redact.ts:57, 5097749de32c)
  • Default pattern can hit base64-looking bytes: The default redaction patterns include token-like prefixes such as AKID, matching the linked report's base64 substring corruption mode. (src/logging/redact.ts:53, 5097749de32c)
  • PR redaction boundary: The PR validates image base64/data URLs through media-core before preserving opaque image payload fields, while adjacent fields still recurse through redaction. (src/agents/transcript-redact.ts:81, 8734194d23ea)
  • PR repair path: The PR detects corrupted persisted image blocks and rewrites them to a text fallback so transcript replay can continue. (src/agents/session-file-repair.ts:121, 8734194d23ea)

Likely related people:

  • hxy91819: The inline transcript redaction behavior this PR refines was introduced in commit faaa7ef with hxy91819 credited as reviewer/co-author, and hxy91819 also commented that the bug appears to trace to that merged PR. (role: prior redaction reviewer and co-author; confidence: high; commits: faaa7efef050; files: src/agents/transcript-redact.ts, src/config/sessions/transcript-append.ts)
  • joshavant: Beyond authoring this PR, prior merged history shows repeated work on secrets, SecretRef, runtime config, and session-adjacent safety paths. (role: recent secrets/session contributor; confidence: medium; commits: 1769fb2aa1d6, 788f56f30fd9, 59bc3c66300b; files: src/agents, src/config/sessions, packages/media-core)
  • Vincent Koc: History around the inline image sanitizer and Anthropic transport surfaces includes Vincent Koc commits, which makes this relevant for the storage-vs-transport validation boundary. (role: media-core and provider-runtime adjacent contributor; confidence: medium; commits: 067496b12934, ea4265a82063, 2e08f0f4221f; files: packages/media-core/src/inline-image-data-url.ts, src/agents/anthropic-transport-stream.ts)
  • Justin: Commit 0da6de6 introduced the malformed tool-call/session-file repair area that this PR extends for corrupted image blocks. (role: session repair introducer; confidence: medium; commits: 0da6de6624a9; files: src/agents/session-file-repair.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. P1 High-priority user-facing bug, regression, or broken workflow. merge-risk: 🚨 security-boundary 🚨 May affect sandboxing, authorization, credentials, or sensitive data. labels Jun 9, 2026
@clawsweeper clawsweeper Bot added status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. and removed status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels Jun 9, 2026
@joshavant

Copy link
Copy Markdown
Contributor Author

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@hxy91819

hxy91819 commented Jun 9, 2026

Copy link
Copy Markdown
Member

Looks like this bug was introduced by my merged PR. Did a quick sanity check and everything seems fine.

@joshavant joshavant merged commit aef1fad into main Jun 9, 2026
174 of 176 checks passed
@joshavant joshavant deleted the fix/90760-transcript-image-redaction branch June 9, 2026 02:23
vincentkoc pushed a commit that referenced this pull request Jun 9, 2026
* fix transcript image redaction

* fix image redaction type predicate

* tighten transcript image redaction boundary

(cherry picked from commit aef1fad)
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request Jun 9, 2026
* fix transcript image redaction

* fix image redaction type predicate

* tighten transcript image redaction boundary
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling maintainer Maintainer-authored PR merge-risk: 🚨 security-boundary 🚨 May affect sandboxing, authorization, credentials, or sensitive data. P1 High-priority user-facing bug, regression, or broken workflow. proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. size: L status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Secret/credential masking corrupts image base64 with ellipsis marker (U+2026), permanently poisoning the session

2 participants