fix(security): anchor MIME sanitization regex and block fullwidth bypass (#9791, #9795) #10257

Closed
nu-gui wants to merge 1 commit into openclaw:main from nu-gui:fix/9791-mime-sanitization-bypass

@nu-gui commented Feb 6, 2026

Summary

  • Anchor the MIME type sanitization regex with $ to reject trailing content after a valid type/subtype pair
  • Add NFKC Unicode normalization before validation to prevent fullwidth character bypasses (e.g., audio/mpeg written with fullwidth characters: ａｕｄｉｏ／ｍｐｅｇ)
  • Apply the same normalization to normalizeMimeType() in src/media/input-files.ts

Fixes #9791, #9795
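
For readers without the diff at hand, the two changes in the summary can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function body, the `MIME_PATTERN` constant, and the exact set of allowed token characters are assumptions.

```typescript
// Hypothetical sketch; the real sanitizeMimeType() in
// src/media-understanding/apply.ts may differ in details.

// Assumed token alphabet for type and subtype. The pattern is anchored
// with ^ and $ so nothing may follow the type/subtype pair.
const MIME_PATTERN = /^[a-z0-9][a-z0-9!#$&^_.+-]*\/[a-z0-9][a-z0-9!#$&^_.+-]*$/;

function sanitizeMimeType(value?: string): string | undefined {
  if (!value) return undefined;
  // NFKC folds compatibility characters (fullwidth letters, the
  // fullwidth solidus U+FF0F) to their ASCII equivalents before validation.
  const normalized = value.normalize("NFKC").trim().toLowerCase();
  // Reject anything that is not exactly one type/subtype pair.
  return MIME_PATTERN.test(normalized) ? normalized : undefined;
}

// "audio/mpeg" spelled with fullwidth code points.
const fullwidth =
  "\uFF41\uFF55\uFF44\uFF49\uFF4F\uFF0F\uFF4D\uFF50\uFF45\uFF47";

console.log(sanitizeMimeType(fullwidth));
console.log(sanitizeMimeType('audio/mpeg" extra="1'));
```

In this sketch the fullwidth spelling is folded to plain ASCII and then validated like any other value, while input with trailing content fails the anchored pattern and is rejected outright.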

Test plan

  • New tests for sanitizeMimeType() covering: standard types, fullwidth Unicode, trailing content, invalid values
  • New tests for normalizeMimeType() covering: charset stripping, fullwidth Unicode normalization
  • All 11 new tests pass
  • pnpm check passes (0 warnings, 0 errors)
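
The categories in the test plan map naturally onto a table of input/expected pairs. The sketch below is illustrative only: it uses a compact stand-in named `sanitizeMimeTypeSketch` rather than the project's function, runs without Vitest, and the concrete inputs are assumptions rather than the 11 tests added by this PR.

```typescript
// Compact stand-in for the function under test (an assumption,
// not the project's implementation).
function sanitizeMimeTypeSketch(value?: string): string | undefined {
  const v = (value ?? "").normalize("NFKC").trim().toLowerCase();
  const pattern = /^[a-z0-9][a-z0-9!#$&^_.+-]*\/[a-z0-9][a-z0-9!#$&^_.+-]*$/;
  return pattern.test(v) ? v : undefined;
}

type Case = {
  name: string;
  input: string | undefined;
  expected: string | undefined;
};

// One row per category named in the test plan.
const cases: Case[] = [
  { name: "standard type", input: "image/png", expected: "image/png" },
  { name: "whitespace and case", input: "  Text/Plain ", expected: "text/plain" },
  {
    name: "fullwidth Unicode",
    // "image/png" spelled with fullwidth code points
    input: "\uFF49\uFF4D\uFF41\uFF47\uFF45\uFF0F\uFF50\uFF4E\uFF47",
    expected: "image/png",
  },
  { name: "trailing content", input: "image/png<script>", expected: undefined },
  { name: "invalid value", input: "not a mime type", expected: undefined },
  { name: "missing value", input: undefined, expected: undefined },
];

// Collect the names of any rows whose result differs from the expectation.
function runCases(): string[] {
  return cases
    .filter((c) => sanitizeMimeTypeSketch(c.input) !== c.expected)
    .map((c) => c.name);
}

const failures = runCases();
console.log(failures.length === 0 ? "all cases pass" : `failed: ${failures.join(", ")}`);
```

In a Vitest suite each row would typically become an `it(...)` case or an entry passed to `it.each`.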

Greptile Overview

Greptile Summary

  • Updates MIME sanitization in src/media-understanding/apply.ts to NFKC-normalize before validation and anchors the regex to reject trailing content.
  • Exports sanitizeMimeType() and adds targeted Vitest coverage for standard types, whitespace/lowercasing, fullwidth Unicode normalization, and trailing-content rejection.
  • Updates normalizeMimeType() in src/media/input-files.ts to also apply NFKC normalization, with tests covering parameter/charset stripping and fullwidth normalization.
  • Change fits into the media ingestion pipeline by hardening MIME handling before allowing/denying extraction and before embedding MIME values into generated <file ...> blocks.
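
The `normalizeMimeType()` change described above could look roughly like the following. This is a sketch under assumptions: the name `normalizeMimeTypeSketch`, the order of operations, and the return type are illustrative and not taken from src/media/input-files.ts.

```typescript
// Hypothetical sketch of parameter stripping plus NFKC normalization;
// the real normalizeMimeType() may differ.
function normalizeMimeTypeSketch(value?: string): string | undefined {
  if (!value) return undefined;
  // Normalize first so a fullwidth semicolon (U+FF1B) or solidus (U+FF0F)
  // is treated the same as its ASCII counterpart.
  const folded = value.normalize("NFKC");
  // Keep only the type/subtype part, dropping parameters such as charset.
  const base = (folded.split(";")[0] ?? "").trim().toLowerCase();
  return base || undefined;
}

console.log(normalizeMimeTypeSketch("Text/HTML; charset=UTF-8"));
console.log(normalizeMimeTypeSketch("\uFF54\uFF45\uFF58\uFF54\uFF0F\uFF43\uFF53\uFF56"));
```

Normalizing before splitting is a design choice in this sketch: it ensures that compatibility forms of the separator characters cannot carry parameters past the stripping step.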

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk.
  • Changes are tightly scoped to MIME normalization/validation, include explicit regression tests for the intended security fixes (trailing-content rejection and fullwidth bypass prevention), and do not alter unrelated control flow in the media pipeline.
  • No files require special attention

…ass (#9791, #9795)

Apply NFKC normalization before MIME type validation to prevent
fullwidth Unicode homoglyphs from bypassing sanitization. Anchor
the regex at both ends to reject trailing content.

Fixes #9791, #9795
@openclaw-barnacle

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

@openclaw-barnacle (bot) added and removed the stale label ("Marked as stale due to inactivity") on Feb 21, 2026

@steipete (Contributor)

Closing as AI-assisted stale-fix triage.

Linked issue #9791 ("[Security] Fullwidth character markers bypass sanitization") was closed on 2026-02-15T05:06:14Z with state reason "completed".
Because that issue is closed, this fix PR is no longer needed in the active queue and is being closed as stale.

If this specific implementation is still needed on current main, please reopen #10257 (or open a new focused fix PR) and reference #9791 for fast re-triage.
