Skip to content

fix(gateway): diagnosable MEDIA rejections + canonical cache roots + null-path guard#34485

Merged
teknium1 merged 1 commit into
mainfrom
hermes/hermes-7339404b
May 29, 2026
Merged

fix(gateway): diagnosable MEDIA rejections + canonical cache roots + null-path guard#34485
teknium1 merged 1 commit into
mainfrom
hermes/hermes-7339404b

Conversation

@teknium1

@teknium1 teknium1 commented May 29, 2026

Copy link
Copy Markdown
Contributor

Summary

Salvages the three real fixes from #33251 in 35 LOC of production change. After this, operators can see which MEDIA path was dropped, generated artifacts under the canonical ~/.hermes/cache/{images,...} layout deliver, and a crafted ~\x00 path no longer aborts the whole attachment batch.

Changes

  • gateway/platforms/base.py:
    • MEDIA_DELIVERY_SAFE_ROOTS: add canonical cache/{images,audio,videos,documents,screenshots} alongside the legacy *_cache dirs (Generated images under cache/images may not deliver via Telegram MEDIA, while image_cache works #31733).
    • filter_media_delivery_paths / filter_local_delivery_paths: log the rejected path (was a blind "outside allowed roots") via _log_safe_path, which strips control chars + Unicode line separators (NEL/LS/PS) so a model-emitted path can't forge a second log line.
    • validate_media_delivery_path + extract_media: guard os.path.expanduser so a ~\x00 path returns None / is skipped instead of raising ValueError("embedded null byte") and dropping every other attachment.

Why slimmed from #33251

@GodsBoy's PR (780 LOC net) was correct but gold-plated. Dropped here:

  • 11-constant _MEDIA_REASON_* taxonomy + _validate_..._with_reason split — the logged path is the diagnostic; the reason vocabulary was for a deferred media_rejected[] return field that doesn't exist yet.
  • _redact_path_for_log parts-elision engine → one-line _log_safe_path.
  • _FINAL_RESPONSE_*_EXTS module hoist — unrelated test-convenience refactor.

Also closes a gap the original version missed: the extract_media ingestion site (upstream of the filter loops) was unguarded.

Validation

Before After
Rejected path in log blind "outside allowed roots" path + line-break hardening
~\x00 in a batch aborts whole batch offending item skipped, rest deliver
Canonical cache/images dropped in strict mode deliverable

scripts/run_tests.sh tests/gateway/test_platform_base.py → 116 passed, 0 failed (5 new behavioral tests, no reason-string snapshots).

Closes #33251. Original work and all three findings by @GodsBoy (Dewaldt Huysamen). The merge commit's Co-authored-by trailer was incorrectly written as @wysie during salvage — that was our error; full credit belongs to @GodsBoy.

Infographic

diagnosable-media-delivery

…null-path guard

Operators can now see which MEDIA path was dropped and why, generated
artifacts under the canonical ~/.hermes/cache/{images,...} layout deliver,
and a crafted ~\x00 path no longer aborts the whole attachment batch.

- MEDIA_DELIVERY_SAFE_ROOTS: add canonical cache/{images,audio,videos,
  documents,screenshots} alongside the legacy *_cache dirs (#31733).
- filter_media/local_delivery_paths: log the rejected path (was a blind
  "outside allowed roots") via _log_safe_path, which strips control chars
  and Unicode line separators so a model-emitted path can't forge a log line.
- validate_media_delivery_path + extract_media: guard os.path.expanduser
  so a ~\x00 path returns None / is skipped instead of raising and dropping
  every other attachment in the response.

Salvaged and slimmed from #33251 (780 LOC -> 35): the reason-tag taxonomy,
the parts-eliding redactor, and the extension-partition hoist are dropped in
favor of logging the path directly. All three findings were verified and
reproduced by the contributor.

Co-authored-by: wysie <wysie@users.noreply.github.com>
@teknium1 teknium1 merged commit e28a668 into main May 29, 2026
21 of 24 checks passed
@teknium1 teknium1 deleted the hermes/hermes-7339404b branch May 29, 2026 08:23
Comment thread gateway/platforms/base.py
safe_media.append((safe_path, bool(is_voice)))
else:
logger.warning("Skipping unsafe MEDIA directive path outside allowed roots")
logger.warning("Skipping unsafe MEDIA directive path: %s", _log_safe_path(raw))
Comment thread gateway/platforms/base.py
safe_paths.append(safe_path)
else:
logger.warning("Skipping unsafe local file path outside allowed roots")
logger.warning("Skipping unsafe local file path: %s", _log_safe_path(raw))
@alt-glitch alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/gateway Gateway runner, session dispatch, delivery labels May 29, 2026
@GodsBoy

GodsBoy commented May 29, 2026

Copy link
Copy Markdown
Contributor

Small attribution correction: #33251 was authored by me as @GodsBoy, not @wysie.

Thanks for salvaging and slimming the fix. Just noting it here because the PR body and squash commit currently credit @wysie, while the original PR and commits were from @GodsBoy dhuysamen@gmail.com.

@teknium1

Copy link
Copy Markdown
Contributor Author

@GodsBoy you're right, and thanks for flagging it. The original work in #33251 — all three findings — is yours (Dewaldt Huysamen, dhuysamen@gmail.com). The Co-authored-by: wysie trailer on the salvage commit was our mistake during the salvage; there was no @wysie involvement. I've corrected the PR description above to credit you. The merge commit's trailer is already baked into history and isn't worth a force-push to rewrite, but the public record here now reflects that this was your contribution. Appreciate the fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/gateway Gateway runner, session dispatch, delivery P2 Medium — degraded but workaround exists type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants