Skip to content

fix(gateway): support Windows drive-letter paths in MEDIA: tags and extract_local_files#30778

Open
fanyi-zhao wants to merge 1 commit into
NousResearch:mainfrom
fanyi-zhao:fix/windows-media-paths
Open

fix(gateway): support Windows drive-letter paths in MEDIA: tags and extract_local_files#30778
fanyi-zhao wants to merge 1 commit into
NousResearch:mainfrom
fanyi-zhao:fix/windows-media-paths

Conversation

@fanyi-zhao

@fanyi-zhao fanyi-zhao commented May 23, 2026

Copy link
Copy Markdown

What does this PR do?

Fixes Windows drive-letter path support for MEDIA: tags and extract_local_files.
On Windows, paths like MEDIA:D:\path\to\file.png and bare paths like D:\dev\...\screen.png were silently ignored because the regex patterns only matched Unix-style paths (/abs/path or ~/home/path).

Related Issue

N/A (no existing issue — discovered during Discord file attachment testing on Windows)

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)

Changes Made

  • gateway/platforms/base.pyextract_media(): (?:~/|/)(?:[A-Za-z]:[\\/]|~/|/)
  • gateway/platforms/base.pyextract_local_files(): prefix + /[\\/] path separator
  • gateway/run.py_TOOL_MEDIA_RE (×2): (?:/|~\/)(?:[A-Za-z]:[\\/]|/|~\/)
  • tests/gateway/test_extract_local_files.py — 5 new Windows path tests
  • tests/gateway/test_media_extraction.py — 6 new Windows/Unix/URL tests

How to Test

  1. On Windows with a Discord bot configured, send MEDIA:D:\path\to\image.png via send_message
  2. Verify the file is delivered as a native attachment (not raw text)
  3. Verify bare paths like D:\dev\...\file.pdf are detected in responses
  4. Run pytest tests/gateway/test_extract_local_files.py tests/gateway/test_media_extraction.py

Checklist

Code

Documentation & Housekeeping

  • I've considered cross-platform impact (Windows, macOS) — N/A (no regression)
  • I've updated tool descriptions/schemas — N/A

…tract_local_files

On Windows, MEDIA:D:\path\to\file.png tags and bare local file paths
(D:\dev\...\screen.png) were not detected because the regex patterns
only matched Unix-style paths (/abs/path or ~/home/path).

Changes:
- extract_media: add [A-Za-z]:[\\/] for Windows drive-letter prefix
- extract_local_files: add drive-letter prefix + [\\/] path separator
- run.py _TOOL_MEDIA_RE: add drive-letter prefix (×2 occurrences)

Tests added:
- test_extract_local_files: Windows backslash, forward slash, bare drive,
  UNC path, text cleanup
- test_media_extraction: Windows path extraction, Unix path preservation,
  HTTP URL rejection

Safety: all paths validated with os.path.exists()/isfile(); no Linux
regression — [\\/] matches both \ and /.
@fanyi-zhao fanyi-zhao force-pushed the fix/windows-media-paths branch from 66e4109 to 412157c Compare May 23, 2026 06:45
@alt-glitch alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/gateway Gateway runner, session dispatch, delivery labels May 23, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Note: this PR enters a crowded space with several competing fixes for the same Windows drive-letter MEDIA: path bug (#28989, #24032):

Maintainers should pick one and close the rest to avoid merge conflicts.

@fanyi-zhao fanyi-zhao changed the title fix: support Windows drive-letter paths in MEDIA: tags and extract_local_files fix(gateway): support Windows drive-letter paths in MEDIA: tags and extract_local_files May 23, 2026
@fanyi-zhao

Copy link
Copy Markdown
Author

Note: this PR enters a crowded space with several competing fixes for the same Windows drive-letter MEDIA: path bug (#28989, #24032):

Maintainers should pick one and close the rest to avoid merge conflicts.

@alt-glitch Thanks for surfacing the full landscape — really helpful context. I agree maintainers should consolidate.

Here's how the three active PRs compare:

PR extract_media extract_local_files Extras
#30778 None
#24049 +spaced paths, +.json/.xml/.html extensions
#28991 None

#30778 is the only one that covers both entry points in a single, minimal change (each regex just gains a [A-Za-z]:[\\/] alternative). No extension list changes, no unrelated additions — pure bug fix.

If instead the plan is to merge #24049 + #28991, I think #24049 has two concerns worth flagging:

  1. Overly permissive regex: uses [A-Za-z]: rather than [A-Za-z]:[\\/]. This would match C:Users\file.pdf — a valid Windows relative path that shouldn't be treated as absolute. fix(gateway): support Windows drive-letter paths in MEDIA: tags and extract_local_files #30778's stricter [A-Za-z]:[\\/] avoids that false positive.

  2. Extension list creep under the "GIS" label: adds .json, .xml, and .html to the MEDIA tag allowlist. These aren't GIS-specific — they're ubiquitous general formats. Adding them means any .json/.xml/.html path in an agent response gets routed through os.path.isfile(), increasing overhead and noise. If those formats are desired they deserve their own PR with dedicated discussion, not a sidecar to a Windows path fix.

Suggestion: merge #30778 as the canonical fix, close #24049 and #28991. The spaced-path support and any extension additions can be evaluated as standalone follow-ups.

@fanyi-zhao

Copy link
Copy Markdown
Author

@teknium1 @OutThisLife @alt-glitch Could you please review this pull request when you get a chance? Your expertise in this area would be greatly appreciated. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/gateway Gateway runner, session dispatch, delivery P2 Medium — degraded but workaround exists type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants