fix: add html to extract_media extension list for native file delivery #28973

Closed
sebastiannicolas-analytia wants to merge 1 commit into NousResearch:main from sebastiannicolas-analytia:fix/extract-media-lookahead-regex

@sebastiannicolas-analytia

Summary

One-line fix that adds html to the extension list in the BasePlatformAdapter.extract_media() regex.

Problem

The extract_media static method detects MEDIA:<path> tags in agent responses and extracts file paths for native platform delivery. The bare-path alternative in the regex matches files by extension, but html was missing from the alternation list.

This meant that when an agent generated artifacts like MEDIA:/path/to/dashboard.html, the file was never detected for native delivery.

Fix

Add html| at the start of the extension alternation in gateway/platforms/base.py:2162:

- \.(?:png|jpe?g|gif|webp|...
+ \.(?:html|png|jpe?g|gif|webp|...

Verification

Tested with the Python re module — the fixed regex now correctly matches paths like MEDIA:/path/to/file.html at end-of-string.
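A check along these lines can be reproduced with a small script. The sketch below is only an approximation: the pattern is a simplified stand-in for the bare-path alternative, not the actual regex in gateway/platforms/base.py, and the names OLD_EXTS, NEW_EXTS, build_pattern and find_media are hypothetical helpers introduced for illustration.

```python
import re

# Shortened extension lists for illustration only; the real alternation in
# gateway/platforms/base.py is longer and is not reproduced here.
OLD_EXTS = r"png|jpe?g|gif|webp"
NEW_EXTS = r"html|" + OLD_EXTS


def build_pattern(exts: str) -> re.Pattern:
    """Build a simplified MEDIA:<path> matcher (an assumption, not the real regex).

    The path must end in one of the listed extensions and be followed by
    whitespace or end-of-string.
    """
    return re.compile(r"MEDIA:(\S+?\.(?:" + exts + r"))(?=\s|$)", re.IGNORECASE)


def find_media(pattern: re.Pattern, text: str) -> list[str]:
    """Return every captured path found in the text."""
    return pattern.findall(text)


text = "Report ready. MEDIA:/path/to/dashboard.html"
old_hits = find_media(build_pattern(OLD_EXTS), text)
new_hits = find_media(build_pattern(NEW_EXTS), text)

print(old_hits)  # []
print(new_hits)  # ['/path/to/dashboard.html']
```

With the shortened list, the .html path is skipped before the change and captured after it, while image paths such as MEDIA:/tmp/a.png match in both cases.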

The extract_media regex in BasePlatformAdapter detects MEDIA:<path> tags
in agent responses to deliver files as native platform attachments. The
extension list in the bare-path alternative was missing 'html', so paths
like MEDIA:/path/to/file.html were not captured for native delivery.

Add 'html|' to the extension alternation so the regex matches .html
files alongside the existing image/video/audio/document extensions.
@alt-glitch added the labels type/bug (Something isn't working), P3 (Low: cosmetic, nice to have), and comp/gateway (Gateway runner, session dispatch, delivery) on May 19, 2026
@alt-glitch

Collaborator

Duplicate of #22492, which already adds html (along with md/json/xml/log/yaml/yml/toml) to the extract_media() regex. Also overlaps with #24384 and #24049 which include html in broader regex fixes.

@sebastiannicolas-analytia

Author

Closing as duplicate of #22492 / #24384 / #24049 — thanks for the heads-up!
