fix(gateway): recognize markdown files in MEDIA tags #29374

Closed
Rocky-Y008 wants to merge 1 commit into NousResearch:main from Rocky-Y008:fix/media-markdown-attachments

Conversation

@Rocky-Y008

Summary

  • Add .md to the MEDIA: attachment extension allowlist.
  • Add a regression test that verifies markdown files are extracted as native media attachments instead of remaining in visible text.

Why this matters

Today, MEDIA:/path/to/report.md is not recognized by the MEDIA: parser even though Hermes documents MEDIA: as supporting local files and the gateway document path can upload markdown through send_document().

Without this fix, a generated markdown report is sent as literal text like:

MEDIA:/path/to/report.md

For users, that is awkward: the artifact exists locally, but it does not arrive as a downloadable file. The practical workaround is to manually package the markdown into a .zip because .zip is already in the allowlist. That adds friction for a common agent workflow: generate a report as markdown and send it back through a messaging gateway.

This change makes .md behavior consistent with the other document extensions already supported by MEDIA:, such as .txt, .pdf, .docx, and .zip.
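To make the effect of the allowlist concrete, here is a minimal sketch of allowlist-based extraction as the description above presents it. It is not the gateway's actual code: the names `MEDIA_EXTS`, `MEDIA_TAG`, and `extract_media_sketch`, and the exact contents of the set, are assumptions for illustration only.

```python
import os
import re

# Hypothetical allowlist for illustration; the real gateway set may differ.
MEDIA_EXTS = {".txt", ".pdf", ".docx", ".zip", ".md"}

# "MEDIA:" followed by a whitespace-free path (assumed tag shape).
MEDIA_TAG = re.compile(r"MEDIA:\s*(\S+)")


def extract_media_sketch(text, allowed=MEDIA_EXTS):
    """Return (attachment_paths, remaining_visible_text)."""
    attachments = []

    def _replace(match):
        path = match.group(1)
        ext = os.path.splitext(path)[1].lower()
        if ext in allowed:
            attachments.append(path)
            return ""           # tag becomes an attachment, not visible text
        return match.group(0)   # unrecognized extension: tag stays in the text

    remaining = MEDIA_TAG.sub(_replace, text).strip()
    return attachments, remaining


message = "Here is the report.\nMEDIA:/path/to/report.md"

# With .md in the allowlist, the path is extracted as an attachment.
with_md = extract_media_sketch(message)

# Without .md, the tag is left behind as literal text.
without_md = extract_media_sketch(message, allowed={".txt", ".pdf", ".docx", ".zip"})
```

Under these assumptions, the only difference between the two calls is the one-entry change to the allowlist, which is the scope of this PR.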

Test Plan

  • python -m pytest tests/gateway/test_base_media_document_extensions.py -q -o 'addopts='
  • git diff --check

alt-glitch added the type/bug (Something isn't working), P3 (Low: cosmetic, nice to have), and comp/gateway (Gateway runner, session dispatch, delivery) labels on May 20, 2026
@alt-glitch

Collaborator

This is a strict subset of #24384 which already adds .md (along with .html, .svg, .json, .py, and others) to the MEDIA extraction allowlist, and also fixes two additional regex bugs (prefix-glue across newlines, space-truncation). Consider closing in favor of #24384.

@teknium1

Contributor

Superseded by #34844, which consolidates this cluster.

This PR widens the extract_media extension allowlist, which is the right direction — but on its own it leaves the unconditional MEDIA:\s*\S+ strip in place, so a MEDIA: tag with any extension still outside the (now wider) list keeps getting deleted from the body before extract_local_files can pick up the bare path. #34844 fixes both halves: it unifies the two extractors onto a single shared extension set (MEDIA_DELIVERY_EXTS) AND replaces the loose strip with an extension-anchored one, so an unknown-extension path survives in the text instead of vanishing.
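The difference between the two strip strategies can be sketched as follows. This is an illustration only, not the code from #34844: `DELIVERY_EXTS` is a hypothetical stand-in for `MEDIA_DELIVERY_EXTS` (whose real contents are not shown in this thread), and the anchored pattern is one plausible way to write an extension-anchored strip.

```python
import re

# Hypothetical stand-in for the shared extension set; contents are assumed.
DELIVERY_EXTS = (".md", ".txt", ".pdf", ".docx", ".zip")

# Loose strip as quoted above: removes any MEDIA: tag, whatever its extension.
LOOSE_STRIP = re.compile(r"MEDIA:\s*\S+")

# Extension-anchored strip (assumed form): only removes tags whose path
# ends in one of the known extensions.
_ext_alt = "|".join(re.escape(ext) for ext in DELIVERY_EXTS)
ANCHORED_STRIP = re.compile(rf"MEDIA:\s*\S+(?:{_ext_alt})(?!\S)", re.IGNORECASE)


def strip_loose(text):
    return LOOSE_STRIP.sub("", text).strip()


def strip_anchored(text):
    return ANCHORED_STRIP.sub("", text).strip()


unknown = "done MEDIA:/tmp/data.xyz"
known = "done MEDIA:/tmp/report.md"

loose_unknown = strip_loose(unknown)        # tag deleted even though nothing delivered it
anchored_unknown = strip_anchored(unknown)  # tag and path survive in the text
anchored_known = strip_anchored(known)      # known extension is still stripped
```

With the loose pattern, a path with an extension outside the set is removed before any later pass can see it; with the anchored pattern, it stays in the body for a later extractor to handle.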

Closing as superseded — thanks for surfacing and helping pin down this bug; it was part of getting the full fix right. See #34844.

teknium1 closed this on May 29, 2026