fix(gateway): add .md to MEDIA attachment extension whitelist #33127

Closed
potatogim wants to merge 1 commit into NousResearch:main from potatogim:fix/media-md-extension

@potatogim

Problem

The MEDIA:<path> tag extraction regex silently drops .md files. When an agent (or skill) emits MEDIA:/path/to/file.md, the gateway never routes it to send_document() — the path is stripped from the response and the file is never delivered to the user.

This affects all platform adapters (Telegram, Discord, Slack, etc.) since the filtering happens in the shared gateway layer.

Root Cause

Three locations define the allowed file-extension regex:

| File | Lines |
|---|---|
| gateway/run.py | ~16855 |
| gateway/run.py | ~17161 |
| gateway/platforms/base.py | ~2416 |

All three include txt and csv as plain-text formats but omit md. This appears to be an oversight from commit ea49b38, which tightened the regex: the existing extension list was copied without adding Markdown.

Fix

Add md alongside txt and csv in all three regex patterns.

Diff

- r'txt|csv|apk|ipa))',
+ r'txt|csv|md|apk|ipa))',

3 insertions, 3 deletions — no behavioral change for any other format.
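
To make the effect of the one-token change concrete, here is a minimal sketch. It does not reproduce the patterns in gateway/run.py; the shortened extension lists, `build_media_pattern`, and `extract_media_paths` are hypothetical stand-ins that only mirror the `txt|csv|apk|ipa` fragment shown in the diff.

```python
import re

# Hypothetical, shortened extension lists. The real patterns contain more
# formats; only the tail visible in the diff is mirrored here.
OLD_EXTS = "txt|csv|apk|ipa"
NEW_EXTS = "txt|csv|md|apk|ipa"


def build_media_pattern(exts: str) -> re.Pattern:
    """Match MEDIA:<path> where the path ends in an allowed extension."""
    return re.compile(r"MEDIA:\s*(\S+\.(?:" + exts + r"))(?!\w)", re.IGNORECASE)


def extract_media_paths(text: str, pattern: re.Pattern) -> list:
    """Return every path captured by the pattern, in order of appearance."""
    return pattern.findall(text)


reply = "Report ready. MEDIA:/tmp/test.md and MEDIA:/tmp/data.csv"

# With the old list the .md path never matches, so it is never delivered.
old_paths = extract_media_paths(reply, build_media_pattern(OLD_EXTS))

# With md added, both paths are captured.
new_paths = extract_media_paths(reply, build_media_pattern(NEW_EXTS))
```

Under these assumptions the old pattern captures only the .csv path, which matches the silent-drop behavior described under Problem.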

Testing

  • Manual: MEDIA:/tmp/test.md now triggers send_document() on Telegram
  • Existing test suite: ./scripts/run_tests.sh passes (no tests depend on the extension list being exclusive)

Checklist

  • I have read the CONTRIBUTING.md document
  • My code follows the code style of this project
  • All new and existing tests passed
  • I have added documentation where necessary

The MEDIA: path extraction regex in gateway/run.py (2 locations) and
gateway/platforms/base.py (1 location) was missing the "md" extension,
causing Markdown files referenced via MEDIA:<path>.md tags to be silently
ignored instead of being sent as document attachments.

This commit adds "md" alongside the existing "txt" and "csv" text formats.

Signed-off-by: Jihyeon Gim <potatogim@gluesys.com>
alt-glitch added the type/bug, P3, comp/gateway, and duplicate labels on May 27, 2026
@alt-glitch

Collaborator

Duplicate of #29609 (open, preferred approach) which dynamically derives extensions from SUPPORTED_DOCUMENT_TYPES instead of hardcoding. Part of the saturated MEDIA regex extension whitelist cluster — also see #33089, #30588, #32995, #30106.
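
For readers comparing the two approaches, deriving the alternation from a single mapping might look roughly like the sketch below. The shape of `SUPPORTED_DOCUMENT_TYPES` (extension with leading dot mapped to a MIME type) and the helper `build_media_pattern` are assumptions for illustration; #29609 may structure this differently.

```python
import re

# Assumed shape: extension (with leading dot) -> MIME type.
# The real SUPPORTED_DOCUMENT_TYPES in the project may differ.
SUPPORTED_DOCUMENT_TYPES = {
    ".txt": "text/plain",
    ".csv": "text/csv",
    ".md": "text/markdown",
    ".pdf": "application/pdf",
}


def build_media_pattern(doc_types: dict) -> re.Pattern:
    """Build the MEDIA: regex from the mapping instead of a hardcoded list."""
    # Longest extensions first so a shorter one cannot shadow a longer one.
    exts = sorted((ext.lstrip(".") for ext in doc_types), key=len, reverse=True)
    alternation = "|".join(re.escape(ext) for ext in exts)
    return re.compile(
        r"MEDIA:\s*(\S+\.(?:" + alternation + r"))(?!\w)", re.IGNORECASE
    )


MEDIA_PATTERN = build_media_pattern(SUPPORTED_DOCUMENT_TYPES)

# .md is supported by the mapping, .xyz is not.
paths = MEDIA_PATTERN.findall("MEDIA:/tmp/notes.md MEDIA:/tmp/app.xyz")
```

With one source of truth, adding a format to the mapping widens every extractor at once, so the three hardcoded lists cannot drift apart again.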

@teknium1

Contributor

Superseded by #34844, which consolidates this cluster.

This PR widens the extract_media extension allowlist, which is the right direction — but on its own it leaves the unconditional MEDIA:\s*\S+ strip in place, so a MEDIA: tag with any extension still outside the (now wider) list keeps getting deleted from the body before extract_local_files can pick up the bare path. #34844 fixes both halves: it unifies the two extractors onto a single shared extension set (MEDIA_DELIVERY_EXTS) AND replaces the loose strip with an extension-anchored one, so an unknown-extension path survives in the text instead of vanishing.
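
The difference between the two strip strategies can be sketched as follows. Only the loose pattern `MEDIA:\s*\S+` and the name `MEDIA_DELIVERY_EXTS` come from the comment above; the contents of the set and the anchored pattern are hypothetical, not the code in #34844.

```python
import re

# Hypothetical contents; the real MEDIA_DELIVERY_EXTS is defined in #34844.
MEDIA_DELIVERY_EXTS = ("txt", "csv", "md")

_EXT_ALT = "|".join(re.escape(ext) for ext in MEDIA_DELIVERY_EXTS)

# Loose strip: deletes any MEDIA: token regardless of its extension.
LOOSE_STRIP = re.compile(r"MEDIA:\s*\S+")

# Extension-anchored strip: deletes only tokens the gateway can deliver.
ANCHORED_STRIP = re.compile(
    r"MEDIA:\s*\S+\.(?:" + _EXT_ALT + r")(?!\w)", re.IGNORECASE
)

body = "See MEDIA:/tmp/report.md and MEDIA:/tmp/trace.log"

# The loose strip removes both tags, so the .log path vanishes from the body.
loose = LOOSE_STRIP.sub("", body)

# The anchored strip removes only the .md tag; the .log path stays in the
# text, where a later pass such as extract_local_files could still see it.
anchored = ANCHORED_STRIP.sub("", body)
```

In this sketch the unknown-extension path is preserved only by the anchored variant, which is the second half of the fix described above.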

Closing as superseded — thanks for surfacing and helping pin down this bug; it was part of getting the full fix right. See #34844.

teknium1 closed this on May 29, 2026