Bug Description
The gateway MEDIA: extraction path can treat a MEDIA:/... string embedded inside serialized tool/search-result text as a real outbound attachment directive. In a Telegram gateway turn, this can cause an unrelated/stale image from prior session-search context to be sent as a native photo even though the final assistant reply did not intentionally attach anything.
This is distinct from existing MEDIA delivery issues such as:
- MEDIA: directives not delivering correctly
- MEDIA: extraction
- send_message validation path
Here the problem is the opposite direction: an internal/quoted MEDIA: occurrence can be over-extracted and delivered.
Observed Behavior
In a Telegram gateway session, after a long diagnostic turn involving session_search, the final visible answer contained no MEDIA: tag and no markdown image. However, the gateway still sent one photo attachment immediately after the answer.
Sanitized log shape:
response ready: platform=telegram ... response=1278 chars
Suppressing normal final send ... final delivery already confirmed (streamed=True ... content_delivered=True)
Skipping unsafe MEDIA directive path outside allowed roots
Skipping unsafe MEDIA directive path outside allowed roots
Skipping unsafe MEDIA directive path outside allowed roots
[Telegram] Sending media group of 1 photo(s) (chunk 1/1)
The stored final assistant message contained no MEDIA: and no markdown image:
MEDIA occurrences []
markdown images []
But one of the tool results in the same turn contained a serialized historical search hit with an old media-delivery line, e.g.:
{"content":"... MEDIA:/Users/example/.hermes/media/generated/old-result.png\\n ..."}
That stale path was then interpreted as an attachment candidate and delivered if it passed media-path validation.
Minimal Reproduction Shape
The current extractor pattern is too permissive: it can match MEDIA: inside serialized JSON or other embedded text, not just standalone directives in the final response.
from gateway.platforms.base import BasePlatformAdapter
content = r'{"content":"previous reply MEDIA:/Users/example/.hermes/media/generated/stale.png\\nnot an attachment"}'
media, cleaned = BasePlatformAdapter.extract_media(content)
print(media)
Actual Behavior
extract_media() treats the embedded text as a real attachment directive and returns a media tuple for stale.png.
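The over-match can be illustrated without importing the gateway. The pattern below is a hypothetical stand-in for an unanchored extractor, not the actual regex used by BasePlatformAdapter.extract_media(); it only shows why a bare MEDIA: search fires inside a serialized payload.

```python
import re

# Hypothetical permissive pattern: matches MEDIA: anywhere, with no left boundary.
# Illustration only; this is NOT the gateway's real regex.
PERMISSIVE_MEDIA = re.compile(r"MEDIA:(\S+)")

# Same serialized tool-result text as in the reproduction above.
content = r'{"content":"previous reply MEDIA:/Users/example/.hermes/media/generated/stale.png\\nnot an attachment"}'

# The search has no notion of "directive" vs. "quoted text", so it finds one hit.
paths = PERMISSIVE_MEDIA.findall(content)
print(paths)
```

The single hit begins with the stale path, even though the string is a JSON payload rather than a reply that asked for an attachment.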
Expected Behavior
Only explicit outbound attachment directives should be extracted. A MEDIA: occurrence embedded inside JSON/tool results/quoted historical session text should remain plain text and should not trigger native upload.
At minimum, MEDIA: should require a safe directive boundary (for example beginning of line or whitespace boundary) rather than matching in arbitrary serialized payloads. Ideally, media extraction should run only against the final user-visible response text, not against tool result payloads or streamed/internal transcript material.
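A minimal sketch of the two ideas above, under stated assumptions: the names LINE_START_MEDIA, extract_media_guarded, and media_for_turn are hypothetical, and the directive grammar (one path without spaces, alone on its line) is a simplification rather than the gateway's real one. With this simplified pattern a whitespace-only boundary would still match the sample, because MEDIA: follows a space inside the JSON string, so the sketch uses the stricter beginning-of-line rule.

```python
import re
from typing import List, Tuple

# Hypothetical guarded pattern: MEDIA: must be the first non-blank token on a
# line, and the directive must occupy the whole line. Assumption for
# illustration only; paths containing spaces are not handled here.
LINE_START_MEDIA = re.compile(r"^[ \t]*MEDIA:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def extract_media_guarded(text: str) -> Tuple[List[str], str]:
    """Return (paths, cleaned_text), extracting only standalone MEDIA: lines."""
    paths = LINE_START_MEDIA.findall(text)
    # Drop the directive lines from the visible text; other text is untouched.
    cleaned = LINE_START_MEDIA.sub("", text).strip()
    return paths, cleaned


def media_for_turn(final_text: str, tool_results: List[str]) -> List[str]:
    """Scan only the final user-visible reply; tool_results are deliberately ignored."""
    paths, _ = extract_media_guarded(final_text)
    return paths
```

Applied to the serialized sample from the reproduction, extract_media_guarded() returns no paths and leaves the text unchanged, while a reply that ends with its own MEDIA: line still yields that path.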
Impact
- Telegram users can receive unrelated stale images/files from prior search results.
- The assistant's final text can be correct while gateway media side-effects are wrong.
- This is surprising and potentially sensitive if a stale MEDIA: path points at a deliverable local artifact.
Local Mitigation Tested
A local guard was tested by requiring a left boundary before MEDIA: (no non-whitespace character immediately preceding it) and adding a regression test like:
from gateway.platforms.base import BasePlatformAdapter

def test_media_tag_ignores_json_escaped_tool_result_text():
    # MEDIA: embedded in serialized tool-result text must stay plain text.
    content = r'{"content":"previous reply MEDIA:/Users/example/.hermes/media/generated/stale.png\\nnot an attachment"}'
    media, cleaned = BasePlatformAdapter.extract_media(content)
    assert media == []
    assert "MEDIA:/Users/example/.hermes/media/generated/stale.png" in cleaned
Focused test run:
pytest tests/gateway/test_platform_base.py::TestExtractMedia -q -n 0 --tb=short
15 passed