[Bug]: Uppercase extensions in MEDIA: tags produce empty messages #30526

@AnthonyFirth

Bug Description

MEDIA: tags whose file path has an uppercase extension (e.g. .JPG, .PNG, .MP4) are silently dropped. The file is never delivered, and the raw MEDIA:… text is later stripped from the message body, so if the response consisted only of the MEDIA: tag, the user receives an empty message.

Steps to Reproduce

  1. Place a real file at e.g. /Users/me/photo.JPG (note the uppercase extension)
  2. Get the agent to emit MEDIA:/Users/me/photo.JPG in a response, e.g. by asking it directly to send that file by absolute path on any gateway platform.
  3. Observe the message the user receives.

Repeat with /Users/me/photo.jpg to confirm that delivery works when the extension is lowercase.

Expected Behavior

The file is uploaded natively to the platform, identically to the .jpg case. The extractor should treat the extension list as case-insensitive.

Actual Behavior

No file is delivered. The MEDIA:/Users/me/photo.JPG text is also stripped from the response body, so the user sees an empty message (or surrounding prose with no file attachment and no indication anything was supposed to be sent). Nothing is logged at a level that surfaces the silent drop.

Affected Component

Gateway

Messaging Platform

All gateway platforms. The bug is in the shared base class (gateway/platforms/base.py), not in any platform adapter.

Root Cause Analysis

At gateway/platforms/base.py:2161 the re.compile(...) call for media_pattern has no flags argument. The extension alternation is:

(?:png|jpe?g|gif|webp|mp4|mov|avi|mkv|webm|ogg|opus|mp3|wav|m4a|flac|epub|pdf|zip|rar|7z|docx?|xlsx?|pptx?|txt|csv|apk|ipa)

which matches only lowercase extensions. Paths with an uppercase extension never match, so extract_media returns an empty media list and leaves the response text unchanged.
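A minimal sketch of the miss, using only the extension alternation quoted above. The names EXTENSIONS, case_sensitive and has_known_extension are hypothetical and exist only for this illustration; the real media_pattern in base.py is longer and also handles quoting.

```python
import re

# Extension alternation copied from the issue; everything else is illustrative.
EXTENSIONS = (
    r"(?:png|jpe?g|gif|webp|mp4|mov|avi|mkv|webm|ogg|opus|mp3|wav|m4a|flac"
    r"|epub|pdf|zip|rar|7z|docx?|xlsx?|pptx?|txt|csv|apk|ipa)"
)

# Compiled without flags, mirroring how the issue describes media_pattern.
case_sensitive = re.compile(r"\." + EXTENSIONS + r"$")


def has_known_extension(path: str) -> bool:
    """Hypothetical helper: True if the path ends in a listed extension."""
    return case_sensitive.search(path) is not None


print(has_known_extension("/Users/me/photo.jpg"))  # True
print(has_known_extension("/Users/me/photo.JPG"))  # False
```

The second call returns False only because of letter case, which is the whole of the defect described here.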

The miss is then masked by two downstream cleanups that strip MEDIA:\s*\S+ unconditionally (no extension list, no case constraint):

  • gateway/platforms/base.py:3175: text_content = re.sub(r"MEDIA:\s*\S+","", text_content).strip()
  • gateway/stream_consumer.py:645: _MEDIA_RE = re.compile(r'''[`"']?MEDIA:\s*\S+[`"']?''')

Because these strips succeed where the extractor failed, the user is left with no file and no visible tag, and therefore no signal that anything was lost.
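The masking effect can be sketched as follows. This is a simplified stand-in, not the gateway's actual code path: extract_re is a shortened, hypothetical version of the extractor (unquoted absolute paths, four extensions only), deliver is a hypothetical helper, and only strip_re reuses the cleanup regex quoted from base.py:3175.

```python
import re

# Hypothetical, shortened extractor: lowercase-only extension list, no flags.
extract_re = re.compile(r"MEDIA:\s*(?P<path>/\S+\.(?:png|jpe?g|gif|mp4))(?=\s|$)")

# Cleanup regex as quoted in the issue: no extension list, no case constraint.
strip_re = re.compile(r"MEDIA:\s*\S+")


def deliver(response: str) -> tuple[list[str], str]:
    """Hypothetical pipeline: collect media paths, then strip all MEDIA: tags."""
    media = [m.group("path") for m in extract_re.finditer(response)]
    text = strip_re.sub("", response).strip()
    return media, text


print(deliver("MEDIA:/Users/me/photo.jpg"))  # (['/Users/me/photo.jpg'], '')
print(deliver("MEDIA:/Users/me/photo.JPG"))  # ([], '')
```

In the uppercase case both the media list and the text come back empty, so nothing downstream can tell that a file was requested.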

For reference, the companion extract_local_files at gateway/platforms/base.py:2227 (which detects bare local paths in agent output) is already compiled with re.IGNORECASE, so it doesn't exhibit this issue.

Proposed Fix

Add re.IGNORECASE to the re.compile() call at gateway/platforms/base.py:2161:

media_pattern = re.compile(
    r'''[`"']?MEDIA:\s*(?P<path>`[^`\n]+`|"[^"\n]+"|'[^'\n]+'|(?:~/|/)\S+(?:[^\S\n]+\S+)*?\.(?:png|jpe?g|gif|webp|mp4|mov|avi|mkv|webm|ogg|opus|mp3|wav|m4a|flac|epub|pdf|zip|rar|7z|docx?|xlsx?|pptx?|txt|csv|apk|ipa)(?=[\s`"',;:)\]}]|$))[`"']?''',
    re.IGNORECASE,
)
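The effect of the flag can be checked in isolation with a sketch like the one below. The pattern text is copied from the proposed fix; MEDIA_PATTERN_SOURCE, before, after and extract_path are hypothetical names for this illustration, and the real extract_media does more than a single search.

```python
import re

# Pattern text copied verbatim from the proposed fix.
MEDIA_PATTERN_SOURCE = r'''[`"']?MEDIA:\s*(?P<path>`[^`\n]+`|"[^"\n]+"|'[^'\n]+'|(?:~/|/)\S+(?:[^\S\n]+\S+)*?\.(?:png|jpe?g|gif|webp|mp4|mov|avi|mkv|webm|ogg|opus|mp3|wav|m4a|flac|epub|pdf|zip|rar|7z|docx?|xlsx?|pptx?|txt|csv|apk|ipa)(?=[\s`"',;:)\]}]|$))[`"']?'''

before = re.compile(MEDIA_PATTERN_SOURCE)                # current behaviour
after = re.compile(MEDIA_PATTERN_SOURCE, re.IGNORECASE)  # proposed fix


def extract_path(pattern, text):
    """Hypothetical helper: return the captured path, or None if no tag matches."""
    match = pattern.search(text)
    return match.group("path") if match else None


for tag in (
    "MEDIA:/Users/me/photo.jpg",
    "MEDIA:/Users/me/photo.JPG",
    "MEDIA:/Users/me/clip.Mp4",
):
    print(tag, "| before:", extract_path(before, tag), "| after:", extract_path(after, tag))
```

One side effect to be aware of: re.IGNORECASE applies to the whole pattern, so a lowercase media: prefix would also match. If that is not wanted, Python's scoped inline flag, (?i:...), around the extension group alone is an alternative.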

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR

Metadata

    Labels

    • P2 (Medium: degraded but workaround exists)
    • comp/gateway (Gateway runner, session dispatch, delivery)
    • type/bug (Something isn't working)
