
fix(gateway): extract markdown media attachments #31754

Closed
hclsys wants to merge 1 commit into NousResearch:main from hclsys:fix/media-markdown-attachments-31560


hclsys commented May 25, 2026


Fixes #31560.

Summary

  • include .md, .html, and .htm in shared MEDIA: extraction
  • make MEDIA: extension matching case-insensitive so paths like REPORT.MD are delivered
  • keep gateway tool-result MEDIA scanners aligned with the shared extractor
  • add HTML MIME entries to the shared document type map
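A minimal sketch of the case-insensitive extension matching described in the summary. This is not the code from `gateway/platforms/base.py`: the extension list, the regex, and the `extract_media_paths` name are all assumptions for illustration. It scopes case-insensitivity to the extension only, so `REPORT.MD` matches while the `MEDIA:` prefix itself stays case-sensitive; the real extractor may differ on that point.

```python
import re
from typing import List

# Hypothetical extension list; the real one lives in gateway/platforms/base.py.
MEDIA_EXTENSIONS = ("png", "jpg", "jpeg", "pdf", "md", "html", "htm")

# Longest extensions first so "html" is tried before "htm".
_ALTERNATION = "|".join(sorted(MEDIA_EXTENSIONS, key=len, reverse=True))

# (?i:...) applies IGNORECASE to the extension group only.
_MEDIA_RE = re.compile(
    rf"MEDIA:\s*(?P<path>\S+?\.(?i:{_ALTERNATION}))(?=\s|$)"
)


def extract_media_paths(text: str) -> List[str]:
    """Return every path that follows a MEDIA: tag and ends in a known extension."""
    return [m.group("path") for m in _MEDIA_RE.finditer(text)]
```

With this sketch, `extract_media_paths("done MEDIA:/tmp/REPORT.MD")` picks up the upper-case markdown path, while a `.txt` path is ignored because it is not in the assumed list.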

Pre-implement audit

  • Existing-helper check: reused BasePlatformAdapter.extract_media() and SUPPORTED_DOCUMENT_TYPES; no new extractor/helper added.
  • Shared-helper caller check: reviewed gateway/tool callers of extract_media() and aligned the two gateway/run.py tool-result scanners.
  • Broader-fix rival scan: no open PR found for "Gateway MEDIA tags do not recognize markdown attachments" (#31560) or for the markdown/HTML MEDIA extraction gap.
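One way to keep the shared extractor and the tool-result scanners from drifting apart is to derive the pattern from the document type map, so every caller sees the same extensions. The sketch below assumes that approach; `DOCUMENT_TYPES` is a hypothetical stand-in for `SUPPORTED_DOCUMENT_TYPES` (whose real contents are not shown here), and `build_media_pattern`, `mime_for`, and `scan_tool_result` are illustrative names, not functions from `gateway/run.py`.

```python
import re
from pathlib import Path
from typing import Iterable, List, Tuple

# Hypothetical stand-in for SUPPORTED_DOCUMENT_TYPES (extension -> MIME type).
DOCUMENT_TYPES = {
    ".pdf": "application/pdf",
    ".md": "text/markdown",
    ".html": "text/html",
    ".htm": "text/html",
}


def build_media_pattern(extensions: Iterable[str]) -> "re.Pattern[str]":
    """Compile one MEDIA: pattern from a set of dotted extensions."""
    names = sorted((e.lstrip(".") for e in extensions), key=len, reverse=True)
    alternation = "|".join(re.escape(n) for n in names)
    # Case-insensitivity is scoped to the extension group.
    return re.compile(rf"MEDIA:\s*(\S+?\.(?i:{alternation}))(?=\s|$)")


# Single compiled pattern that the extractor and any scanner could import.
MEDIA_PATTERN = build_media_pattern(DOCUMENT_TYPES)


def mime_for(path: str) -> str:
    """Look up the MIME type by lower-cased suffix, with a generic fallback."""
    return DOCUMENT_TYPES.get(Path(path).suffix.lower(), "application/octet-stream")


def scan_tool_result(result: str) -> List[Tuple[str, str]]:
    """Return (path, mime) pairs for each MEDIA: tag found in a tool result."""
    return [(p, mime_for(p)) for p in MEDIA_PATTERN.findall(result)]
```

Because the regex and the MIME lookup read from the same mapping, adding an extension in one place would update both, which is the alignment the caller check is after.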

Tests

  • python -m pytest -o addopts='' tests/gateway/test_platform_base.py::TestExtractMedia -q
  • python -m pytest -o addopts='' tests/gateway/test_platform_base.py::TestExtractMedia tests/gateway/test_document_cache.py -q
  • python -m py_compile gateway/platforms/base.py gateway/run.py
  • git diff --check

Note

tests/gateway/test_telegram_documents.py::TestSendDocument could not run in my local environment because pytest reports async tests need a plugin such as pytest-asyncio; this patch does not change Telegram send_document behavior.

alt-glitch added labels on May 25, 2026:

  • type/bug (Something isn't working)
  • P2 (Medium: degraded but workaround exists)
  • comp/gateway (Gateway runner, session dispatch, delivery)
  • duplicate (This issue or pull request already exists)
@alt-glitch

Collaborator

Duplicate of #31561 which implements the same markdown/HTML MEDIA extraction fix for #31560 (adds .md, .html, .htm to extract_media() regex, case-insensitive matching).


hclsys commented May 25, 2026

Author

Closing in favor of #31561 by @evgyur, which landed first and covers the same .md/.MD/.html/.htm + case-insensitive extraction for #31560. One thing worth folding into #31561 if it isn't already: the gateway tool-result MEDIA scanner in gateway/run.py uses the same extractor, so the case-insensitive fix should flow through there too — worth a quick check. Thanks @evgyur.



