fix: add md, markdown, json, yaml, yml, toml, log to MEDIA: file regex #32398

Closed
zywgit1 wants to merge 1 commit into NousResearch:main from zywgit1:fix/add-md-to-media-regex

@zywgit1 commented on May 26, 2026

Problem

The MEDIA: file-attachment regex in gateway/platforms/base.py only matched a limited set of file extensions (images, videos, audio, archives, office docs, txt). Common text-based formats like .md, .json, .yaml, .toml, and .log were silently dropped — the gateway reported success but the file was never actually sent.

Fix

Added md|markdown|json|yaml|yml|toml|log to the media extension regex pattern.
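
A minimal sketch of why an extension allowlist drops `.md` paths and how widening it changes the result. The extension lists, the pattern shape, and the helper name `build_media_pattern` are assumptions for illustration; the actual regex in gateway/platforms/base.py is not shown in this PR description and may differ.

```python
import re

# Hypothetical extension lists; the real pattern in gateway/platforms/base.py
# is not reproduced here and likely covers more formats.
OLD_EXTS = ["png", "jpg", "mp4", "mp3", "zip", "pdf", "docx", "txt", "csv"]
NEW_EXTS = OLD_EXTS + ["md", "markdown", "json", "yaml", "yml", "toml", "log"]


def build_media_pattern(exts):
    """Compile a MEDIA:<path> matcher limited to the given extensions."""
    alternation = "|".join(re.escape(e) for e in exts)
    # (?!\w) stops "md" from matching the start of a longer extension like "mdx".
    return re.compile(rf"MEDIA:\s*(\S+\.(?:{alternation}))(?!\w)", re.IGNORECASE)


old_pattern = build_media_pattern(OLD_EXTS)
new_pattern = build_media_pattern(NEW_EXTS)

text = "Here is the report. MEDIA:/tmp/report.md"
old_paths = old_pattern.findall(text)  # nothing extracted, so nothing is sent
new_paths = new_pattern.findall(text)  # the .md path is now extracted
```

With the narrower list the path is never extracted, which matches the reported symptom of a successful response with no attachment delivered.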

Testing

  • Verified that .md files are now correctly sent via Feishu send_message
  • Confirmed existing extensions still match

Changed

  • gateway/platforms/base.py L2416: Extended the media file extension regex

The MEDIA:<path> extraction regex in extract_media() only recognized
.txt and .csv as plain-text document extensions. This meant files with
extensions like .md, .json, .yaml, .toml, and .log could never be
delivered as file attachments — the regex silently dropped them.

Added support for: md, markdown, json, yaml, yml, toml, log

@alt-glitch added labels on May 26, 2026: type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), comp/gateway (Gateway runner, session dispatch, delivery)

@alt-glitch (Collaborator)

Duplicate of #29609 which dynamically derives the regex extension set from SUPPORTED_DOCUMENT_TYPES rather than adding extensions one-by-one. See also #22492, #32358, #32294 — this is a saturated cluster with 5+ competing PRs for the same fix.
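
As a rough illustration of the approach described for #29609, the pattern can be built from a single mapping instead of a hand-maintained alternation. The shape of `SUPPORTED_DOCUMENT_TYPES` (dotted extension to MIME type), its contents, and the helper name `media_pattern_from` are assumptions here, not the code from that PR.

```python
import re

# Assumed shape: extension (with leading dot) -> MIME type. The real
# SUPPORTED_DOCUMENT_TYPES in the gateway may be structured differently.
SUPPORTED_DOCUMENT_TYPES = {
    ".txt": "text/plain",
    ".md": "text/markdown",
    ".json": "application/json",
    ".yaml": "application/yaml",
}


def media_pattern_from(types):
    """Build the MEDIA:<path> matcher from whatever extensions are registered."""
    # Longest first so a short extension never shadows a longer one.
    exts = sorted((e.lstrip(".") for e in types), key=len, reverse=True)
    alternation = "|".join(re.escape(e) for e in exts)
    return re.compile(rf"MEDIA:\s*(\S+\.(?:{alternation}))(?!\w)", re.IGNORECASE)


pattern = media_pattern_from(SUPPORTED_DOCUMENT_TYPES)

# Registering a new document type is enough; the regex follows automatically.
SUPPORTED_DOCUMENT_TYPES[".toml"] = "application/toml"
updated = media_pattern_from(SUPPORTED_DOCUMENT_TYPES)
```

The design benefit is that the extraction regex and the delivery code cannot drift apart, because both read the same table.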

@teknium1 (Contributor)

Superseded by #34844, which consolidates this cluster.

This PR widens the extract_media extension allowlist, which is the right direction — but on its own it leaves the unconditional MEDIA:\s*\S+ strip in place, so a MEDIA: tag with any extension still outside the (now wider) list keeps getting deleted from the body before extract_local_files can pick up the bare path. #34844 fixes both halves: it unifies the two extractors onto a single shared extension set (MEDIA_DELIVERY_EXTS) AND replaces the loose strip with an extension-anchored one, so an unknown-extension path survives in the text instead of vanishing.
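
The difference between the two strip strategies can be sketched as follows. The contents of `MEDIA_DELIVERY_EXTS`, the pattern shapes, and the helper names `strip_loose` and `strip_anchored` are illustrative assumptions; this is not the code from #34844.

```python
import re

# Illustrative subset; MEDIA_DELIVERY_EXTS is named in the comment above, but
# its real contents are not shown in this thread.
MEDIA_DELIVERY_EXTS = ("png", "pdf", "txt", "md", "json")
_ALT = "|".join(re.escape(e) for e in MEDIA_DELIVERY_EXTS)

# Loose strip: deletes any MEDIA: tag, whatever the extension.
LOOSE_STRIP = re.compile(r"MEDIA:\s*\S+")
# Anchored strip: deletes only tags whose path ends in a known extension.
ANCHORED_STRIP = re.compile(rf"MEDIA:\s*\S+\.(?:{_ALT})(?!\w)", re.IGNORECASE)


def strip_loose(text):
    return LOOSE_STRIP.sub("", text).strip()


def strip_anchored(text):
    return ANCHORED_STRIP.sub("", text).strip()


unknown = "Done. MEDIA:/tmp/data.parquet"
known = "Done. MEDIA:/tmp/notes.md"

loose_unknown = strip_loose(unknown)        # path vanishes from the body
anchored_unknown = strip_anchored(unknown)  # path survives in the body
anchored_known = strip_anchored(known)      # deliverable tag is still removed
```

Under the loose strip, the `.parquet` path is gone before any later extractor sees it; under the anchored strip it stays in the text, so a later pass still has the path to work with.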

Closing as superseded — thanks for surfacing and helping pin down this bug; it was part of getting the full fix right. See #34844.

@teknium1 closed this on May 29, 2026