Skip to content

Media: sanitize Content-Disposition filenames from remote fetches#27818

Closed
afurm wants to merge 10 commits into
openclaw:mainfrom
afurm:fix/media-content-disposition-filename-sanitize
Closed

Media: sanitize Content-Disposition filenames from remote fetches#27818
afurm wants to merge 10 commits into
openclaw:mainfrom
afurm:fix/media-content-disposition-filename-sanitize

Conversation

@afurm

@afurm afurm commented Feb 26, 2026

Copy link
Copy Markdown
Contributor

Summary

Describe the problem and fix in 2–5 bullets:

  • Problem: fetchRemoteMedia parsed Content-Disposition filenames with path.basename(), which does not treat Windows \ separators as path separators on POSIX hosts.
  • Why it matters: a remote server can return filenames like ..\\..\\secret.pdf, and traversal-like segments can propagate into downstream attachment metadata/labels instead of being reduced to a safe basename.
  • What changed: added a filename sanitizer for Content-Disposition parsing that normalizes \ to /, strips control characters, and then takes a basename; applied to both filename= and RFC5987 filename*= paths; added regression tests.
  • What did NOT change (scope boundary): no changes to SSRF/network policy, file-write destinations, storage layout, or media fetch request behavior.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #
  • Related #

User-visible / Behavior Changes

  • Remote media downloads now sanitize malicious/invalid Content-Disposition filenames that use Windows-style traversal separators (for example ..\\..\\file.pdf becomes file.pdf).
  • No change for normal filenames.

Security Impact (required)

  • New permissions/capabilities? (Yes/No) No
  • Secrets/tokens handling changed? (Yes/No) No
  • New/changed network calls? (Yes/No) No
  • Command/tool execution surface changed? (Yes/No) No
  • Data access scope changed? (Yes/No) No
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: macOS (local dev machine)
  • Runtime/container: Node + pnpm + Vitest (no container)
  • Model/provider: N/A
  • Integration/channel (if any): N/A (unit test coverage on shared media helper)
  • Relevant config (redacted): None

Steps

  1. Mock fetchRemoteMedia() to receive Content-Disposition: attachment; filename="..\\..\\secret.pdf".
  2. Call fetchRemoteMedia() and inspect result.fileName.
  3. Repeat with filename*=UTF-8''..%5C..%5Creport-final.pdf.

Expected

  • Returned filename is reduced to a safe basename (secret.pdf / report-final.pdf), without traversal-like segments.

Actual

  • Before fix (pre-patch): POSIX path.basename() preserved backslash-delimited segments, so ..\\..\\secret.pdf could survive as-is.
  • After fix: both header forms are sanitized to the basename.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Test snippet (after patch):

pnpm test src/media/fetch.test.ts✓ src/media/fetch.test.ts (5 tests)

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: ran targeted unit tests for /Users/afurm/Development/openclaw/src/media/fetch.test.ts; confirmed new cases pass for both filename= and filename*= handling.
  • Edge cases checked: Windows-style backslash separators, percent-encoded backslashes in RFC5987 filename*.
  • What you did not verify: full repo test suite, live channel/integration end-to-end flows, Windows runtime behavior.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No
  • If yes, exact upgrade steps: N/A

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly: revert commit 521f3e428 (or restore the two touched files and redeploy).
  • Files/config to restore: /Users/afurm/Development/openclaw/src/media/fetch.ts, /Users/afurm/Development/openclaw/src/media/fetch.test.ts
  • Known bad symptoms reviewers should watch for: unexpected filename truncation/empty filenames from unusual servers (fallback naming should still apply downstream).

Risks and Mitigations

List only real risks for this PR. Add/remove entries as needed. If none, write None.

  • Risk: Some non-malicious servers may intentionally include backslashes/control characters in display filenames, and these will now be normalized/stripped.
  • Mitigation: Sanitization is limited to untrusted Content-Disposition filename parsing only; network behavior and storage paths are unchanged, and regression tests cover both standard and RFC5987 header formats.

@afurm afurm marked this pull request as ready for review February 26, 2026 18:07

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 521f3e4281

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/media/fetch.ts Outdated
@greptile-apps

greptile-apps Bot commented Feb 26, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

Adds filename sanitization to prevent Windows-style path traversal in Content-Disposition headers parsed on POSIX hosts.

  • Introduced sanitizeContentDispositionFileName helper that normalizes backslashes to forward slashes, strips control characters, and extracts basename using path.posix.basename
  • Applied to both standard filename= and RFC5987 filename*= header formats
  • Added regression tests covering Windows-style traversal (..\\..\\secret.pdf) and percent-encoded backslashes (..%5C..%5C)
  • No changes to network behavior, storage layout, or file-write destinations
  • Normal filenames without traversal segments or control characters pass through unchanged

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • Focused security hardening fix with clear implementation, comprehensive test coverage for both standard and RFC5987 Content-Disposition formats, proper Unicode handling, and no breaking changes to existing functionality
  • No files require special attention

Last reviewed commit: 521f3e4

@afurm

afurm commented Feb 26, 2026

Copy link
Copy Markdown
Contributor Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 47c9f247cd

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/media/fetch.ts Outdated
@afurm

afurm commented Feb 26, 2026

Copy link
Copy Markdown
Contributor Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: aa9f075748

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/media/fetch.ts
Comment thread src/media/fetch.ts
Comment on lines +77 to +79
if (char === "/") {
segment = "";
continue;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 Badge Preserve last segment when value ends with a separator

The new basename helper clears segment whenever it sees /, so inputs ending in a slash now sanitize to an empty string (for example filename="foo/bar/" or filename*=...foo%2Fbar%2F). Before this commit, path.basename returned bar, so this is a regression that can silently discard the server-provided filename and force fallback naming in fetchRemoteMedia.

Useful? React with 👍 / 👎.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2dca557b79

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/media/fetch.ts

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cb03c0f298

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/media/fetch.ts
Comment on lines +85 to +87
if (next !== '"' && next !== "'") {
segment = "";
continue;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve multiple escaped backslashes before quotes

basenameWithWindowsSeparatorHeuristic only keeps a backslash when the immediate next character is a quote, so after unescapeHttpQuotedPair a decoded sequence like \\\" (two literal backslashes before a quote) is mishandled: the first backslash is treated as a separator and clears the accumulated basename. In a valid quoted-string filename that contains two backslashes before a quote, this truncates the filename stem (e.g. foo is dropped and only \"bar.txt remains), which is a behavioral regression for compliant Content-Disposition values.

Useful? React with 👍 / 👎.

@openclaw-barnacle

Copy link
Copy Markdown

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

@openclaw-barnacle openclaw-barnacle Bot added the stale Marked as stale due to inactivity label Apr 6, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the stale Marked as stale due to inactivity label Apr 8, 2026
@openclaw-barnacle

Copy link
Copy Markdown

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

@openclaw-barnacle openclaw-barnacle Bot added the stale Marked as stale due to inactivity label Apr 13, 2026
@openclaw-barnacle

Copy link
Copy Markdown

Closing due to inactivity.
If you believe this PR should be revived, post in #pr-thunderdome-dangerzone on Discord to talk to a maintainer.
That channel is the escape hatch for high-quality PRs that get auto-closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size: S stale Marked as stale due to inactivity

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant