Bug: extract_media() doesn't support .html/.htm extensions — inconsistent with extract_local_files()
Summary
After the "deliverable mode" PR (#27813, commit f2fdb9a1), MEDIA:/path/to/file.html no longer delivers the file as a native attachment. The path is left as literal text in the response instead.
Root Cause
The PR removed a |\S+ fallback catch-all from the extract_media() regex in gateway/platforms/base.py to reduce false positives. However, it did not add .html/.htm to the explicit extension list.
Meanwhile, the same PR did add .html/.htm to extract_local_files() (the bare-path detector). This means:
| Method | Input example | .html supported? |
|---|---|---|
| extract_media() | MEDIA:/home/user/report.html | ❌ No |
| extract_local_files() | /home/user/report.html | ✅ Yes |
Both methods ultimately route through the same send_document() call, so there's no technical reason to exclude .html from extract_media().
Additionally, gateway/run.py has two inline _TOOL_MEDIA_RE regex patterns (around lines 16562 and 16858) used to scan tool results for MEDIA tags — these also lack .html/.htm.
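The regression can be illustrated with a minimal sketch. The two patterns below are heavily simplified stand-ins, not the actual media_pattern from base.py: the extension list EXTS is abbreviated and the names before, after, and text are hypothetical. They only show why dropping the |\S+ fallback without extending the list makes .html paths stop matching.

```python
import re

# Assumption: abbreviated extension list; the real list in base.py is longer.
EXTS = r"png|jpe?g|pdf|csv|apk|ipa"

# Simplified "before" pattern: explicit extensions plus the |\S+ catch-all.
before = re.compile(rf"MEDIA:\s*(\S+\.(?:{EXTS})(?=\s|$)|\S+)")

# Simplified "after" pattern: explicit extensions only, catch-all removed.
after = re.compile(rf"MEDIA:\s*(\S+\.(?:{EXTS})(?=\s|$))")

text = "Here is the report MEDIA:/home/user/report.html"

# The catch-all used to pick up the .html path even though it is not listed.
before_paths = before.findall(text)

# Without the catch-all, nothing matches, so the tag stays in the reply text.
after_paths = after.findall(text)

print(before_paths, after_paths)
```

Under these assumptions, the first pattern captures the path through the fallback branch, while the second finds nothing and leaves the MEDIA: tag untouched.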
Steps to Reproduce
- Have Hermes running with Telegram connected.
- Ask the agent to send an .html file using a MEDIA: tag, e.g.: MEDIA:/home/user/report.html
- Observe: the response text contains the literal MEDIA:/home/user/report.html string, and no file is delivered.
Expected Behavior
The .html file should be delivered as a native document attachment, consistent with how extract_local_files() already handles bare .html paths.
Affected Files
gateway/platforms/base.py — extract_media() regex (media_pattern)
gateway/run.py — two _TOOL_MEDIA_RE inline regex patterns
Suggested Fix
Add html? to the extension alternatives in all three locations:
base.py — extract_media() (~line 2161):
- ...csv|apk|ipa)(?=...
+ ...csv|apk|ipa|html?)(?=...
run.py — _TOOL_MEDIA_RE (~lines 16563 and 16860):
- ...csv|apk|ipa))',
+ ...csv|apk|ipa|html?))',
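A small regression check along these lines could accompany the fix. This is a sketch only: the pattern is a simplified, hypothetical stand-in for the real regexes (abbreviated extension list, assumed lookahead), and extract is an illustrative helper, not a function from the codebase. It shows that the html? alternative covers both .html and .htm without matching longer, unlisted extensions.

```python
import re

# Assumption: abbreviated extension list with the proposed html? alternative.
EXTS_FIXED = r"png|jpe?g|pdf|csv|apk|ipa|html?"

# Simplified stand-in for the fixed pattern; the lookahead requires the
# extension to be followed by whitespace or end of string.
fixed = re.compile(rf"MEDIA:\s*(\S+\.(?:{EXTS_FIXED})(?=\s|$))")


def extract(text: str) -> list[str]:
    """Return every MEDIA: path in text that ends in a listed extension."""
    return fixed.findall(text)


# Both spellings of the HTML extension are now recognised.
html_paths = extract("MEDIA:/home/user/report.html")
htm_paths = extract("see MEDIA:/home/user/page.htm for details")

# An unlisted extension that merely starts with "html" is still rejected.
other_paths = extract("MEDIA:/home/user/notes.htmlx")

print(html_paths, htm_paths, other_paths)
```

The same three inputs could be run against each of the three real patterns to confirm they stay in sync.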
Environment
- Hermes Agent commit: edb2d9105 (present in origin/main as of 2026-05-23)
- Introduced in: commit f2fdb9a1 ("feat(gateway): deliverable mode", 2026-05-18)