Skip to content

fix(gateway): support Windows absolute paths in all MEDIA regex locations (#34632)#35161

Closed
Tranquil-Flow wants to merge 1 commit into
NousResearch:mainfrom
Tranquil-Flow:fix/34632-media-windows
Closed

fix(gateway): support Windows absolute paths in all MEDIA regex locations (#34632)#35161
Tranquil-Flow wants to merge 1 commit into
NousResearch:mainfrom
Tranquil-Flow:fix/34632-media-windows

Conversation

@Tranquil-Flow

Copy link
Copy Markdown
Contributor

What does this PR do?

The MEDIA_TAG_CLEANUP_RE, extract_local_files path regex, and two
_TOOL_MEDIA_RE patterns in gateway/run.py all used (?:~/|/) to
anchor paths, matching only Unix-style absolute and home-relative paths.
Windows absolute paths (C:\Users\..., D:/...) were silently ignored,
causing MEDIA directive delivery to fail on Windows.

Fix: Add [A-Za-z]:[/\\] as a third anchor alternative in all four
regex locations. Also update path separators in extract_local_files from
/ to [/\\] so it can traverse Windows directory trees.

This covers all 4 locations that competing PR #30778 covers, without
hardcoding the extension list (uses the shared _MEDIA_EXT_ALTERNATION
constant where available, and identical inline extension patterns in the
_TOOL_MEDIA_RE cases where the constant is out of scope).

Related Issue

Fixes #34632

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)

Changes Made

  • gateway/platforms/base.py: Add [A-Za-z]:[/\\] anchor to
    MEDIA_TAG_CLEANUP_RE and extract_local_files path regex. Revert
    accidental + quantifier in lookahead (unrelated cosmetic change).
  • gateway/run.py: Add [A-Za-z]:[/\\] anchor to both _TOOL_MEDIA_RE
    patterns — history scanning (L17223) and result scanning (L17549). These
    were the locations missed by the original build.
  • tests/gateway/test_platform_base.py: 5 existing tests for extract_media
    Windows path recognition.
  • tests/gateway/test_run_tool_media_re.py: New — 31 tests covering
    Windows backslash/forward-slash, mixed separators, multiple extensions,
    case-insensitive drive letters/extensions, multi-tag extraction, negative
    cases, and fail-without-fix proof using pre-fix pattern.

How to Test

  1. Run python3 -m pytest tests/gateway/test_run_tool_media_re.py -v (31 new tests)
  2. Run python3 -m pytest tests/gateway/test_platform_base.py -v (existing tests, no regressions)
  3. Verify MEDIA:C:\Users\test\image.png is recognized
  4. Verify MEDIA:D:/data/report.pdf is recognized
  5. Verify Unix paths (/tmp/file.png, ~/docs/report.pdf) still work

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)

…xtract_local_files (NousResearch#34632)

The MEDIA_TAG_CLEANUP_RE and extract_local_files path regex both used
(?:~/|/) to anchor paths, which only matches Unix-style absolute and
home-relative paths. Two additional _TOOL_MEDIA_RE patterns in run.py
had the same limitation. Windows absolute paths (C:\Users\..., D:/...)
were silently ignored, causing MEDIA directive delivery to fail.

Add [A-Za-z]:[/\\] as a third anchor alternative in all four regex
locations (base.py x2, run.py x2). Also update path separators in
extract_local_files from / to [/\\] so it can traverse Windows
directory trees.

Revert accidental + quantifier in MEDIA_TAG_CLEANUP_RE lookahead
that changed match-one to match-one-or-more (unrelated to fix).

Fixes: NousResearch#34632
@alt-glitch alt-glitch added type/bug Something isn't working comp/gateway Gateway runner, session dispatch, delivery P2 Medium — degraded but workaround exists labels May 30, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Competing fix for #28989/#24032 (Windows MEDIA paths). More comprehensive than #35162 — also patches _TOOL_MEDIA_RE in gateway/run.py. Other open competing PRs: #30778, #24049, #28991, #34021.

@tonydwb tonydwb left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Summary

Verdict: ✅ Approved

Overview

Thorough fix for Windows absolute path support across all 4 MEDIA regex locations in the gateway code. 194 additions, 5 deletions, excellent test coverage.

What's Right

  • Complete coverage: All 4 regex locations updated — MEDIA_TAG_CLEANUP_RE, extract_local_files path_re, and both _TOOL_MEDIA_RE patterns in run.py. No misses.
  • Test quality: 31 tests in test_run_tool_media_re.py covering Windows backslash, forward-slash, mixed separators, case-insensitivity, multi-tag, negative cases, AND a pre-fix pattern proof that the fix is necessary.
  • Regression prevention: 5 existing tests extended in test_platform_base.py plus the Unix path tests prove existing behavior is preserved.
  • Clean diff: No unrelated changes, no test deletions. Just the bug fix + tests.
  • Documentation: Comments updated at all regex locations explaining the anchors.
  • Commit message: Conventional Commits format, references the issue.

Reviewed by Hermes Agent (cron)

@teknium1

Copy link
Copy Markdown
Contributor

Merged via #35388. Your commit was cherry-picked onto current main with your authorship preserved in git log (51d165a). We added two test follow-ups: a stale POSIX-only Windows-path test that needed updating to assert the new matching behavior, and a raw-docstring fix to silence an escape warning. Thanks for the fix!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/gateway Gateway runner, session dispatch, delivery P2 Medium — degraded but workaround exists type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: MEDIA directive silently fails on Windows

4 participants