Media: reject spoofed input_image MIME payloads by vincentkoc · Pull Request #38289 · openclaw/openclaw

vincentkoc · 2026-03-06T19:33:30Z

Summary

Problem: normalizeInputImage trusted caller-declared non-HEIC image MIME types before the allowlist check, so a request could claim image/png while supplying concrete non-image bytes.
Why it matters: that let spoofed input_image payloads bypass the intended image MIME validation boundary on Gateway HTTP APIs.
What changed: image inputs now sniff bytes again before allowlist enforcement, reject concrete non-image detections for declared image/* payloads, and still keep HEIC/HEIF normalization scoped to actual HEIC inputs.
What did NOT change (scope boundary): this does not expand accepted MIME types, relax URL fetching policy, or change the existing HEIC -> JPEG normalization path.
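The validation order described above can be sketched as follows. This is a minimal standalone illustration, not the code in src/media/input-files.ts: `sniffMime`, `ALLOWED_IMAGE_MIMES`, and `checkDeclaredImage` are hypothetical names, the magic-number table is deliberately tiny, and letting an inconclusive sniff fall through to the allowlist is an assumption based on the phrase "concrete non-image detections".

```typescript
// Hypothetical stand-in for the repository's detectMime helper: checks a few
// well-known magic numbers and returns undefined when nothing matches.
function sniffMime(bytes: Uint8Array): string | undefined {
  const startsWith = (sig: number[]): boolean =>
    sig.every((b, i) => bytes[i] === b);
  if (startsWith([0x89, 0x50, 0x4e, 0x47])) return "image/png";
  if (startsWith([0xff, 0xd8, 0xff])) return "image/jpeg";
  if (startsWith([0x25, 0x50, 0x44, 0x46])) return "application/pdf"; // "%PDF"
  return undefined;
}

// Assumed allowlist, for illustration only.
const ALLOWED_IMAGE_MIMES = new Set(["image/png", "image/jpeg"]);

function checkDeclaredImage(declaredMime: string, bytes: Uint8Array): string {
  const detectedMime = sniffMime(bytes);
  // Reject only concrete non-image detections; an inconclusive sniff
  // (undefined) falls through to the allowlist check on the declared type.
  if (
    declaredMime.startsWith("image/") &&
    detectedMime !== undefined &&
    !detectedMime.startsWith("image/")
  ) {
    throw new Error(`declared ${declaredMime} but detected ${detectedMime}`);
  }
  if (!ALLOWED_IMAGE_MIMES.has(declaredMime)) {
    throw new Error(`unsupported MIME type: ${declaredMime}`);
  }
  // Non-HEIC images keep their declared MIME after validation.
  return declaredMime;
}
```

With this ordering, a PNG declared as image/png passes unchanged, while PDF bytes declared as image/png are rejected before the allowlist is consulted.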

Change Type (select all)

Scope (select all touched areas)

Linked Issue/PR

Closes #
Related: Gateway: normalize HEIC input_image sources #38122
Related: Gateway: follow up HEIC input image handling #38146

User-visible / Behavior Changes

Gateway input_image requests now reject spoofed non-image payloads even when the request declares an allowed image MIME type.

Security Impact (required)

New permissions/capabilities? (Yes/No) No
Secrets/tokens handling changed? (Yes/No) No
New/changed network calls? (Yes/No) No
Command/tool execution surface changed? (Yes/No) No
Data access scope changed? (Yes/No) No
If any Yes, explain risk + mitigation:

Repro + Verification

Environment

OS: macOS
Runtime/container: Node 22 / pnpm workspace
Model/provider: n/a
Integration/channel (if any): Gateway HTTP APIs
Relevant config (redacted): defaults

Steps

Send an input_image base64 or URL request that declares image/png.
Provide bytes that detect as application/pdf instead of an image.
Observe Gateway validation.
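A spoofed base64 payload for the steps above could be built roughly like this. The object shape is purely illustrative and is not the Gateway's request schema; `declaredMime` and `data` are hypothetical field names.

```typescript
// "%PDF-1.4\n" is how PDF files commonly begin, so byte sniffing would
// typically classify this payload as application/pdf rather than an image.
const pdfHeader = "%PDF-1.4\n";

// Illustrative shape only; not the Gateway's request schema.
const spoofedSource = {
  declaredMime: "image/png",
  data: btoa(pdfHeader), // base64 of the PDF header bytes
};

console.log(spoofedSource.data); // "JVBERi0xLjQK"
```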

Expected

Spoofed payloads are rejected before provider delivery.

Actual

The previously merged follow-up trusted declared non-HEIC MIME types and could accept spoofed payloads.

Evidence

Attach at least one:

Failing test/log before + passing after
Trace/log snippets
Screenshot/recording
Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

Verified scenarios: HEIC base64 normalization, HEIC URL normalization, spoofed base64 image rejection, spoofed URL image rejection, OpenAI chat completions gateway path, OpenResponses parity validation.
Edge cases checked: non-HEIC images keep declared MIME after validation; HEIC image budget tests remain green.
What you did not verify: install smoke locally in Docker.

Compatibility / Migration

Backward compatible? (Yes/No) Yes
Config/env changes? (Yes/No) No
Migration needed? (Yes/No) No
If yes, exact upgrade steps:

Failure Recovery (if this breaks)

How to disable/revert this change quickly: revert this PR.
Files/config to restore: src/media/input-files.ts, src/media/input-files.fetch-guard.test.ts, CHANGELOG.md
Known bad symptoms reviewers should watch for: valid image inputs getting rejected due to incorrect MIME detection.

Risks and Mitigations

Risk: MIME sniffing could reintroduce non-HEIC behavior drift.
- Mitigation: tests assert non-HEIC images keep their declared MIME after validation while concrete non-image detections are rejected.

AI Assistance

AI-assisted: yes
Testing: fully tested on the focused Gateway/media suites listed below

Verification:

pnpm vitest run src/media/input-files.fetch-guard.test.ts src/gateway/openai-http.test.ts src/gateway/openresponses-parity.test.ts src/gateway/openai-http.image-budget.test.ts
pnpm exec oxfmt --check src/media/input-files.ts src/media/input-files.fetch-guard.test.ts CHANGELOG.md

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ffeeba6938


chatgpt-codex-connector · 2026-03-06T19:37:49Z

+    (detectedMime && HEIC_INPUT_IMAGE_MIMES.has(detectedMime)) ||
+    (HEIC_INPUT_IMAGE_MIMES.has(declaredMime) && !detectedMime)
+      ? (detectedMime ?? declaredMime)


Honor detected non-HEIC image types before conversion

The new sourceMime selection ignores detectedMime whenever it is an image type other than HEIC/HEIF, so a payload declared as image/heic but sniffed as image/png or image/jpeg is still treated as HEIC and routed through convertHeicToJpeg. In practice (for both base64 and URL inputs with an incorrect mediaType or Content-Type), this can recompress valid non-HEIC images or fail on them outright, which regresses the prior behavior and contradicts the intent to scope HEIC normalization to actual HEIC bytes.


greptile-apps · 2026-03-06T20:03:33Z

Greptile Summary

This PR hardens normalizeInputImage in src/media/input-files.ts by unconditionally calling detectMime on all image inputs and rejecting payloads where the declared MIME is image/* but the detected bytes are a concrete non-image type. The security hardening goal is achieved by the guard on line 241, which correctly prevents non-image bytes from bypassing validation.

However, there is a logic regression in the sourceMime selection (lines 244–248). When declaredMime is "image/heic"/"image/heif" but detectedMime is a different image type (e.g., "image/jpeg"), neither ternary condition is satisfied and the code falls back to sourceMime = declaredMime. This causes convertHeicToJpeg to be called on non-HEIC image bytes, which will fail at runtime. The previous code used detectedMime ?? declaredMime for all HEIC-declared inputs, gracefully handling this case. A test case for this "declared HEIC + detected non-HEIC image" scenario is also missing.

The spoofing-rejection guard is correct and new tests verify the intended behaviour for spoofed base64 and URL payloads. The CHANGELOG.md update is clear and appropriately scoped.

Confidence Score: 2/5

The security hardening goal for spoofed MIME payloads is achieved, but a logic regression will cause runtime errors when HEIC is declared but non-HEIC image bytes are detected.
The spoofing-rejection guard (line 241) correctly prevents non-image bytes from slipping through, and new tests verify this behaviour. However, the sourceMime ternary (lines 244–248) has a regression: when declaredMime is HEIC but detectedMime is a different image type, the code incorrectly sets sourceMime = declaredMime and proceeds to call convertHeicToJpeg on non-HEIC bytes. This will cause runtime failures for valid images with mismatched type declarations. The fix is surgical but critical before merging.
src/media/input-files.ts (lines 244–248) requires the sourceMime ternary fix to handle the "declared HEIC + detected non-HEIC image" case. A test case for this scenario should also be added.

Last reviewed commit: ffeeba6

greptile-apps · 2026-03-06T20:03:37Z

+  const sourceMime =
+    (detectedMime && HEIC_INPUT_IMAGE_MIMES.has(detectedMime)) ||
+    (HEIC_INPUT_IMAGE_MIMES.has(declaredMime) && !detectedMime)
+      ? (detectedMime ?? declaredMime)
+      : declaredMime;


HEIC conversion triggered on non-HEIC image bytes when declared MIME is HEIC

When declaredMime is "image/heic" (or "image/heif") but detectedMime is a non-HEIC image type like "image/jpeg", the current logic sets sourceMime = declaredMime = "image/heic" and proceeds to call convertHeicToJpeg on non-HEIC bytes. This is a regression from the previous behavior.

Trace for declaredMime = "image/heic", detectedMime = "image/jpeg":

Guard check (line 241): both are image/* → no rejection (correct)

Part 1: "image/jpeg" && HEIC_INPUT_IMAGE_MIMES.has("image/jpeg") → false

Part 2: HEIC_INPUT_IMAGE_MIMES.has("image/heic") && !"image/jpeg" → false

Overall condition → false, takes the else branch

sourceMime = declaredMime = "image/heic"

Line 253: tries to convert JPEG bytes as HEIC

The old code handled this gracefully by using detectedMime ?? declaredMime for all HEIC-declared inputs, so a JPEG claimed as HEIC would be returned directly without conversion.

Fix: also use detectedMime when declaredMime is HEIC but a non-HEIC image was actually detected:

Suggested change

-  const sourceMime =
-    (detectedMime && HEIC_INPUT_IMAGE_MIMES.has(detectedMime)) ||
-    (HEIC_INPUT_IMAGE_MIMES.has(declaredMime) && !detectedMime)
-      ? (detectedMime ?? declaredMime)
-      : declaredMime;
+  const sourceMime =
+    (detectedMime && HEIC_INPUT_IMAGE_MIMES.has(detectedMime)) ||
+    (HEIC_INPUT_IMAGE_MIMES.has(declaredMime) && !detectedMime)
+      ? (detectedMime ?? declaredMime)
+      : HEIC_INPUT_IMAGE_MIMES.has(declaredMime) && detectedMime
+        ? detectedMime
+        : declaredMime;

This restores the old behaviour of using the detected MIME for HEIC-declared inputs while preserving all the new spoofing-rejection logic. A corresponding test case for "declared HEIC + detected non-HEIC image" is also missing.
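To make the difference concrete, the two selection expressions can be lifted into standalone functions and compared on the cases discussed in this review. This is a sketch under assumptions: the function names are hypothetical, and the contents of `HEIC_INPUT_IMAGE_MIMES` are assumed to be image/heic and image/heif, as the comment above implies.

```typescript
// Assumed contents; the real set lives in src/media/input-files.ts.
const HEIC_INPUT_IMAGE_MIMES = new Set(["image/heic", "image/heif"]);

// Selection as it appears in the reviewed diff.
function selectSourceMimeReviewed(
  declaredMime: string,
  detectedMime?: string,
): string {
  return (detectedMime && HEIC_INPUT_IMAGE_MIMES.has(detectedMime)) ||
    (HEIC_INPUT_IMAGE_MIMES.has(declaredMime) && !detectedMime)
    ? (detectedMime ?? declaredMime)
    : declaredMime;
}

// Selection with the extra branch from the suggested change.
function selectSourceMimeSuggested(
  declaredMime: string,
  detectedMime?: string,
): string {
  return (detectedMime && HEIC_INPUT_IMAGE_MIMES.has(detectedMime)) ||
    (HEIC_INPUT_IMAGE_MIMES.has(declaredMime) && !detectedMime)
    ? (detectedMime ?? declaredMime)
    : HEIC_INPUT_IMAGE_MIMES.has(declaredMime) && detectedMime
      ? detectedMime
      : declaredMime;
}

// Declared HEIC, detected JPEG: the only case where the two differ.
console.log(selectSourceMimeReviewed("image/heic", "image/jpeg")); // "image/heic"
console.log(selectSourceMimeSuggested("image/heic", "image/jpeg")); // "image/jpeg"
```

For the other combinations (non-HEIC declared with a non-HEIC image detected, HEIC detected under any declaration, HEIC declared with nothing detected), both functions return the same value, so the suggested branch changes only the mismatched-HEIC case.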



* Media: reject spoofed input image MIME types
* Media: cover spoofed input image MIME regressions
* Changelog: note input image MIME hardening

(cherry picked from commit 084dfd2)

vincentkoc added 3 commits March 6, 2026 14:32

Media: reject spoofed input image MIME types

4259bfb

Media: cover spoofed input image MIME regressions

6c9448c

Changelog: note input image MIME hardening

ffeeba6

vincentkoc self-assigned this Mar 6, 2026

openclaw-barnacle Bot added the size: S and maintainer (Maintainer-authored PR) labels Mar 6, 2026

vincentkoc marked this pull request as ready for review March 6, 2026 19:34

vincentkoc merged commit 084dfd2 into main Mar 6, 2026
29 of 30 checks passed

vincentkoc deleted the vincentkoc-code/input-image-mime-spoofing-fix branch March 6, 2026 19:34

chatgpt-codex-connector Bot reviewed Mar 6, 2026


vincentkoc mentioned this pull request Mar 6, 2026

Web: add HEIC media regression and doc fix #38294

Merged

18 tasks

greptile-apps Bot reviewed Mar 6, 2026


github-actions Bot mentioned this pull request Mar 6, 2026

📡 Upstream Digest — 2026-03-06 20:23 UTC curtismercier/openclaw-mods#195

Open

alexyyyander mentioned this pull request Mar 7, 2026

fix/gateway token mismatch 38617 #38676

Closed

vincentkoc merged 3 commits into main from vincentkoc-code/input-image-mime-spoofing-fix
