security(media): anchor sanitizeMimeType regex to reject malformed input#68456
security(media): anchor sanitizeMimeType regex to reject malformed input#68456bluesky6868 wants to merge 1 commit into
Conversation
Greptile SummaryAdds an end anchor ( Confidence Score: 5/5This PR is safe to merge; it tightens an input-validation check without altering any callsite or allowlist semantics. The regex change is minimal and correct: end-anchoring stops prefix truncation, No files require special attention. Reviews (1): Last reviewed commit: "security(media): anchor sanitizeMimeType..." | Re-trigger Greptile |
|
Related work from PRtags group Title: Open PR duplicate: sanitizeMimeType regex anchoring (#9795)
|
|
Closing this as duplicate or superseded after Codex automated review. PR #68456 should close as duplicate/superseded. Current main still has the unanchored sanitizer, so the #9795 hardening is not implemented yet, but this PR duplicates the same remaining work already tracked by open PR #68225 and also overlaps older open PR #61016. Best possible solution: Close #68456 as duplicate and consolidate the #9795 sanitizer hardening in one existing open implementation path, preferably the directly PRtags-linked #68225 because it carries focused regression tests, while keeping #9795 open until a selected fix lands. What I checked:
So I’m closing this here and keeping the remaining discussion on the canonical linked item. Codex Review notes: model gpt-5.5, reasoning high; reviewed against d54d2d6b9b8a. |
Summary
sanitizeMimeTypeinsrc/media-understanding/apply.tsmatches MIME types with an unanchored regex, so malformed input silently truncates to a valid-looking prefix. Example:"image/png junk"→"image/png";"image/png..etc/passwd"→"image/png..etc"(the char class includes.,-,_).allowedMimesatapply.ts:427. Malformed input can coerce to a known-good MIME and pass the allowlist, defeating the intent of strict validation at the media-ingest boundary. Maintainers labeled thissecurity.(?:;.*)?$). Export the helper for focused unit tests.Change Type
Scope
Linked Issue/PR
Root Cause
/^([a-z0-9!#$&^_.+-]+\/[a-z0-9!#$&^_.+-]+)/had no end anchor. On malformed input, the first-match-wins behavior captured the leading valid-looking substring and discarded the rest, producing a MIME string that the downstream allowlist could accept.normalizeMimeType(src/media/mime.tsstrips;and lowercases), so valid inputs are unaffected. Only malformed inputs change behavior.Regression Test Plan
src/media-understanding/apply.test.ts(newdescribe("sanitizeMimeType")block)"image/png junk","image/png\nextra","image/png/../etc/passwd","image/png/evil"; still accepts plain MIMEs and standard; paramparameters; still lowercases mixed case; still returnsundefinedon empty/whitespace/missing input.User-visible / Behavior Changes
Malformed MIME hints that previously silently truncated now produce
undefined, which the existing code path atapply.ts:412-416already handles by skipping the attachment with a"file attachment skipped (unknown mime)"verbose-log line. No config change required; no user-facing surface touched.Security Impact
Verification
Gates run locally on macOS Darwin 24.6.0:
pnpm test src/media-understanding/apply.test.ts -t sanitizeMimeType— 7 passedpnpm test src/media-understanding/apply.test.ts— 48 passedpnpm test 'src/media-understanding/**/*.test.ts'— 138 passed across 20 filespnpm check— formatter, oxlint (0 warnings / 0 errors across 11710 files), tsgo, import-cycle + madge checks all greenpnpm build— clean (touches an exported surface, so build is run as a hard gate per CLAUDE.md)Human Verification
apply.ts:405already handlesundefinedreturn..,-,_), vendor-suffix MIMEs (application/vnd.api+json),; charset=parameters.src/media/mime.ts::normalizeMimeTypeorsrc/media/input-files.ts::normalizeMimeType; those are out of scope for [Security] sanitizeMimeType regex should be anchored and case-insensitive #9795.Review Conversations
Compatibility / Migration
Risks and Mitigations
undefinedand the attachment is skipped.; param) and internal hint constants all still match. Existing tests covering real MIMEs still pass (138 tests green).