Skip to content

fix(gateway): deliver script and config MEDIA attachments#32294

Closed
adam91holt wants to merge 1 commit into
NousResearch:mainfrom
adam91holt:fix/gateway-media-script-attachments
Closed

fix(gateway): deliver script and config MEDIA attachments#32294
adam91holt wants to merge 1 commit into
NousResearch:mainfrom
adam91holt:fix/gateway-media-script-attachments

Conversation

@adam91holt

Copy link
Copy Markdown
Contributor

Summary

  • teach gateway MEDIA tag extraction to recognise script/code/config/document extensions like .sh, .py, .md, .yaml, .toml, and .log
  • add legacy ~/.hermes/media_cache to the safe delivery roots so files agents already write there are uploaded instead of silently skipped
  • add regression tests for both behaviours

Test Plan

  • python -m pytest tests/gateway/test_platform_base.py -q -o 'addopts='
  • live smoke: send_message to Telegram home with MEDIA:/home/adam/.hermes/media_cache/hermes-media-delivery-smoke.txt returned success/message_id

@hclsys hclsys left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extension-list expansion is reasonable (delivering .py/.sh/.md/configs as file attachments is a legit gap). But I'd flag the new safe-root as a security-surface widening worth a second look, because of how the allowlist interacts with the credential denylist.

MEDIA_DELIVERY_SAFE_ROOTS is allowlist-first — the code comments are explicit that "the cache-dir allowlist still beats [the denylist]" (gateway/platforms/base.py:857) and that "the Hermes home itself contains credentials (auth.json, .env) — only the cache subdirectories under it are explicitly allowlisted" (:927-928). So anything under an allowlisted root is deliverable even if it'd otherwise hit the credential deny.

The existing allowlisted roots (image_cache, audio_cache, video_cache) are type-scoped: each has a dedicated writer (cache_image_from_bytes(ext=".jpg"), cache_audio_from_bytes(ext=".ogg"), …) that only ever writes media there. This PR adds media_cache, but I don't see any in-repo writer for it — per the PR description it's a "legacy" dir that "agents already write" to, i.e. an agent-controlled location with no enforced content type.

Combined with the newly-deliverable config-shaped extensions (.json, .yaml, .toml, .ini, .cfg), this means: if an agent ever writes a config/secret-shaped file into media_cache, it's now both (a) under an allowlist root that beats the credential deny, and (b) a deliverable extension — so it could be uploaded to a chat. That's a broader exfil surface than the deliberate "only purpose-scoped media caches are allowlisted" design.

Two options worth considering:

  1. Keep media_cache allowlisted but document/enforce that it's media-only (or route agent media writes through a typed cache_*_from_bytes-style helper like the other caches), so arbitrary configs don't land there.
  2. Or don't add it to the allowlist-that-beats-deny; instead rely on the recency-trust path for it (so the credential denylist still applies), if that satisfies the "agents already write here" use case.

If media_cache is in practice only ever written by media-producing code, this is low-risk — but since there's no in-repo writer pinning that invariant, it's worth making explicit. The extension-list change itself is fine.

@teknium1

Copy link
Copy Markdown
Contributor

Superseded by #34844, which consolidates this cluster.

This PR widens the extract_media extension allowlist, which is the right direction — but on its own it leaves the unconditional MEDIA:\s*\S+ strip in place, so a MEDIA: tag with any extension still outside the (now wider) list keeps getting deleted from the body before extract_local_files can pick up the bare path. #34844 fixes both halves: it unifies the two extractors onto a single shared extension set (MEDIA_DELIVERY_EXTS) AND replaces the loose strip with an extension-anchored one, so an unknown-extension path survives in the text instead of vanishing.

Closing as superseded — thanks for surfacing and helping pin down this bug; it was part of getting the full fix right. See #34844.

@teknium1 teknium1 closed this May 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/gateway Gateway runner, session dispatch, delivery P2 Medium — degraded but workaround exists type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants