fix(gateway): derive MEDIA attachment extensions#32472
Conversation
|
Addressed @alt-glitch’s point in |
|
Confirmed on current main. I reproduced this both with a minimal code check and in a live gateway profile. So this isn't just a theoretical regex mismatch — it becomes a real user-facing delivery failure where the assistant says a markdown file was attached but the platform only receives text. I also hit this with the file already placed under a gateway-allowed document cache dir, so safe-root validation was passing. The failure was specifically the The derivation approach in this PR looks like the right fix. +1 for getting this merged. |
…sResearch#31733) Builds on the direction in NousResearch#31764 with a broader cut at the same bug surface that surfaces in operator reports as "MEDIA tag did not attach": * MEDIA_DELIVERY_SAFE_ROOTS now lists the canonical ~/.hermes/cache/{images,audio,videos,documents,screenshots} explicitly alongside the legacy ~/.hermes/{type}_cache/ entries. get_hermes_dir returns the legacy path when it exists, so IMAGE_CACHE_DIR alone drops the canonical layout on co-existing installs (root cause of NousResearch#31733). * validate_media_delivery_path now delegates to a private _validate_media_delivery_path_with_reason helper that returns (resolved_or_None, kebab_reason). filter_media_delivery_paths and filter_local_delivery_paths log the reason plus a redacted path (_redact_path_for_log), so operators chasing rejected MEDIA tags can tell allowlist-miss from stale-mtime from denied-prefix from does-not-resolve without rebuilding the bot with debug logging. * _redact_path_for_log neutralises control characters (newline, NUL, carriage return) so a model-emitted path containing an embedded newline cannot forge a second log line. * Hoisted _FINAL_RESPONSE_IMAGE_EXTS / _FINAL_RESPONSE_VIDEO_EXTS to module scope so the regression test imports the same partition the final-response dispatch actually uses, instead of snapshotting a local copy that could drift. * Regression coverage added in tests/gateway/test_platform_base.py: - default safe-roots tuple contains both legacy and canonical layouts - legacy and canonical image caches co-exist (the strengthened failure-shape test NousResearch#31764 review asked for) - PDF deliverable from canonical ~/.hermes/cache/documents and legacy ~/.hermes/document_cache - filter_*_delivery_paths logs the redacted path and the kebab reason tag (using the imported _MEDIA_REASON_* constants, not raw strings) for stale-mtime, no-recency, does-not-resolve, denied-prefix, and newline-injection cases - validate_media_delivery_path delegation invariant: public API matches the resolved path returned by the reason-bearing helper - send_message tool and final-response paths share extract_media + filter_media_delivery_paths (asserted via source introspection so a future split surfaces here) - PDF MEDIA tag routes to send_document via the production extension partition (no image / video misroute) Scope: gateway media delivery only. Does not duplicate NousResearch#32472's extension-derivation refactor or NousResearch#31561 / NousResearch#33206's markdown attachment recognition work; if any of those land first, the diagnosable-logging contract here still applies and the canonical-root entries become a clean merge against NousResearch#31764. Deferred follow-up (out of scope for this PR): - Surfacing the rejection reason to the calling agent (send_message tool currently returns {success: true} regardless of dropped attachments) - Unifying the three dispatch-site extension sets across _process_message_background, _deliver_media_from_response, and the send_message tool path - Pre-existing residual: symlink-swap of a cache root before image_generate writes; TOCTOU between resolve() and upload; /var/tmp / /var/cache not in denylist
|
Superseded by #34844, which consolidates this cluster. This PR widens the Closing as superseded — thanks for surfacing and helping pin down this bug; it was part of getting the full fix right. See #34844. |
Summary
MEDIA:attachment extension matching fromSUPPORTED_DOCUMENT_TYPESplus known media/mobile/archive formats instead of maintaining a stale hardcoded regexgateway.auto_attach_local_paths/attachments [on|off|status](/attach) so each gateway bot/profile can toggle opportunistic local path attachment behavior independentlySUPPORTED_DOCUMENT_TYPESTest Plan
python -m pytest tests/gateway/test_extract_local_files.py tests/gateway/test_platform_base.py tests/gateway/test_send_image_file.py tests/gateway/test_tts_media_routing.py tests/gateway/test_send_multiple_images.py tests/gateway/test_attachments_command.py tests/cron/test_scheduler.py tests/hermes_cli/test_kanban_notify.py tests/e2e/test_platform_commands.py tests/gateway/test_session_race_guard.py -qFixes #31560