Fix transcript image redaction#91529
Conversation
|
Codex review: needs maintainer review before merge. Reviewed June 8, 2026, 10:08 PM ET / 02:08 UTC. Summary PR surface: Source +339, Tests +420. Total +759 across 8 files. Reproducibility: yes. source-level reproduction is high confidence: current main recursively redacts all transcript strings and the default AKID-style pattern can match normal image base64 bytes. I did not run tests because this review was required to keep the checkout read-only. Review metrics: none identified. Merge readiness Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch. Rank-up moves:
Risk before merge
Maintainer options:
Next step before merge
Security Review detailsBest possible solution: Land this focused fix after a maintainer/security reviewer explicitly accepts the validated opaque-image redaction boundary and the storage-vs-transport split. Do we have a high-confidence way to reproduce the issue? Yes, source-level reproduction is high confidence: current main recursively redacts all transcript strings and the default AKID-style pattern can match normal image base64 bytes. I did not run tests because this review was required to keep the checkout read-only. Is this the best way to solve the issue? Yes, with maintainer/security approval. Validating opaque image payloads in media-core before skipping text substitution is narrower than a MIME-only skip, and repair-only would not prevent future poisoned transcripts. AGENTS.md: found and applied where relevant. Codex review notes: model gpt-5.5, reasoning high; reviewed against 5097749de32c. Label changesLabel justifications:
Evidence reviewedPR surface: Source +339, Tests +420. Total +759 across 8 files. View PR surface stats
Security concerns:
Acceptance criteria:
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
|
|
@clawsweeper re-review |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
|
Looks like this bug was introduced by my merged PR. Did a quick sanity check and everything seems fine. |
* fix transcript image redaction * fix image redaction type predicate * tighten transcript image redaction boundary (cherry picked from commit aef1fad)
* fix transcript image redaction * fix image redaction type predicate * tighten transcript image redaction boundary
Summary
Fixes #90760.
Security Boundary
This intentionally treats validated image bytes as opaque media data, not text to run through secret-pattern substitution. The exemption is bounded by image field shape, strict base64 validation, bounded prefix decoding, and magic-byte/signature sniffing. Adjacent text still goes through normal redaction, and fake image-shaped payloads containing secret-looking strings are still redacted or repaired.
The PR avoids a MIME-only bypass: image claims are accepted only when bytes match a supported image signature. It also avoids widening provider transport behavior by using a separate storage sanitizer from the existing transport sanitizer.
Verification
.agents/skills/autoreview/scripts/autoreview --mode localaws, runrun_2da949704798, leasecbx_f1c73cdaa4d8, machinec7a.8xlarge, exit0node scripts/run-vitest.mjs packages/media-core/src/inline-image-data-url.test.ts src/agents/transcript-redact.test.ts src/config/sessions/transcript-append-redact.test.ts src/agents/session-file-repair.test.ts src/agents/responses-image-payload-sanitizer.test.tsaws, runrun_46fd11090674, leasecbx_a2733bb606fb, exit0persistedIncludesEllipsis:false,nonAsciiCount:0stopReason:"stop"1image blockstopReason:"stop"