Problem
extract_media() scans the full response text with a single regex pass
(MEDIA_TAG_CLEANUP_RE) without distinguishing live delivery tags from
example paths mentioned in prose.
This causes false positives when:
- A skill description, error message, or doc string contains a literal
example like inside a code block or quote
- The agent explains how to use the MEDIA tag in its reply (e.g.
"include MEDIA:/path/to/file in your response")
- A tool returns output that happens to contain a path matching the regex
Effect: The matching text is stripped from the user-visible response
and the path is added to the media list. validate_media_delivery_path
then either rejects it (silent drop) or — if the path happens to exist —
delivers an unintended file.
Minimal Reproduction
Ask the agent to explain the MEDIA delivery syntax. Its reply will likely
contain something like:
To send an image, include in your response.
The backtick-wrapped example matches MEDIA_TAG_CLEANUP_RE, gets stripped
from the text, and an attempt is made to deliver /path/to/image.jpg.
Root Cause
extract_media() in gateway/platforms/base.py (~line 2577):
for match in media_pattern.finditer(content):
path = match.group("path").strip()
...
media.append((os.path.expanduser(path), has_voice_tag))
The scan is context-blind — it does not skip:
- Fenced code blocks (
...)
- Inline code spans (
MEDIA:...)
- Blockquotes (
> ...)
- Tool output embedded in the response
Proposed Fix Direction
Before running MEDIA_TAG_CLEANUP_RE, mask content inside fenced code
blocks, inline code spans, and blockquotes (replace protected spans with
equal-length whitespace to preserve match offsets). This keeps the cleanup
substitution correct while eliminating false positives.
Happy to submit a PR if the direction looks right — we have a working
patch in our deployment.
Environment
- macOS 15, launchd-managed gateway
- Feishu platform adapter
- Triggered by: skills that document MEDIA syntax, tool error messages
containing file paths
Problem
extract_media()scans the full response text with a single regex pass(
MEDIA_TAG_CLEANUP_RE) without distinguishing live delivery tags fromexample paths mentioned in prose.
This causes false positives when:
example like inside a code block or quote
"include
MEDIA:/path/to/filein your response")Effect: The matching text is stripped from the user-visible response
and the path is added to the media list.
validate_media_delivery_paththen either rejects it (silent drop) or — if the path happens to exist —
delivers an unintended file.
Minimal Reproduction
Ask the agent to explain the MEDIA delivery syntax. Its reply will likely
contain something like:
Root Cause
extract_media()ingateway/platforms/base.py(~line 2577):for match in media_pattern.finditer(content):
path = match.group("path").strip()
...
media.append((os.path.expanduser(path), has_voice_tag))
The scan is context-blind — it does not skip:
...)MEDIA:...)> ...)Proposed Fix Direction
Before running
MEDIA_TAG_CLEANUP_RE, mask content inside fenced codeblocks, inline code spans, and blockquotes (replace protected spans with
equal-length whitespace to preserve match offsets). This keeps the cleanup
substitution correct while eliminating false positives.
Happy to submit a PR if the direction looks right — we have a working
patch in our deployment.
Environment
containing file paths