fix(gateway): classify document attachments as DOCUMENT on Signal, Email, and SimpleX #44695
Merged
…fallback

Widen the salvaged #12851 fix to match the established classification pattern (WhatsApp/Slack/BlueBubbles/Mattermost): video/* -> VIDEO, and any remaining MIME type falls through to DOCUMENT instead of TEXT, so exotic types still trigger run.py's document-context injection.
Email cached document attachments and placed them in media_urls, but msg_type only flipped on image attachments — documents stayed TEXT and run.py's document-context injection (gated on MessageType.DOCUMENT) silently dropped them. Same bug class as Signal #12845. DOCUMENT wins over PHOTO for mixed attachments since image handling keys off per-path mime types while document injection gates strictly on message_type.
SimpleX tagged unknown files application/octet-stream in media_types but classification only handled audio/image, leaving msg_type TEXT — run.py never injected the document context. Same bug class as #12845.
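The catch-all pattern these commit messages describe can be sketched roughly as follows. This is a minimal illustration, not the adapters' actual code: the enum values, the `classify_attachment` helper, and the image/audio branches are assumptions made for the example. Only the video/* -> VIDEO branch and the DOCUMENT fallback come from the text above.

```python
from enum import Enum


class MessageType(Enum):
    # Member names follow the PR text; the string values are assumed.
    TEXT = "text"
    PHOTO = "photo"
    AUDIO = "audio"
    VIDEO = "video"
    DOCUMENT = "document"


def classify_attachment(mime_type: str) -> MessageType:
    """Map an attachment MIME type to a message type (hypothetical helper)."""
    mime = (mime_type or "").lower()
    if mime.startswith("image/"):
        return MessageType.PHOTO
    if mime.startswith("audio/"):
        return MessageType.AUDIO
    if mime.startswith("video/"):
        return MessageType.VIDEO
    # Catch-all: anything else, including application/octet-stream or an
    # empty MIME type, is a document, so it never falls back to TEXT.
    return MessageType.DOCUMENT
```

With a fallback like this, an attachment tagged application/octet-stream is classified as DOCUMENT, which is what the document-context injection gate requires.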
✅ Verified — document attachment classification on Signal, Email, and SimpleX Reviewed the diff for
The fix is correct and consistent. The catch-all
This was referenced Jun 12, 2026
teknium1 added a commit that referenced this pull request on Jun 12, 2026
…CUMENT (#44778)

The Teams adapter only handled image/* attachments. Documents (the application/vnd.microsoft.teams.file.download.info consent-free download payload and any direct-URL non-image attachment) never reached media_urls at all, so run.py's document-context injection had nothing to surface. Completes the class-wide sweep from PR #44695 (Signal/Email/SimpleX).

- download.info attachments: fetch the pre-authed SharePoint downloadUrl (SSRF-guarded, same guard chain as base.py cache_*_from_url) and route through cache_media_bytes
- direct-URL non-image attachments: same fetch + classify path
- skip Teams' text/html message-body mirror and adaptive-card attachments
- DOCUMENT > PHOTO > VIDEO > AUDIO precedence for mixed attachments, matching the Email precedence rationale from #44695
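The precedence rule for mixed attachments can be sketched as below. This is an illustration under stated assumptions: `resolve_message_type` and the use of plain strings for type names are hypothetical, and only the DOCUMENT > PHOTO > VIDEO > AUDIO ordering is taken from the commit message.

```python
# Highest-priority type first, per the commit message above.
PRECEDENCE = ("DOCUMENT", "PHOTO", "VIDEO", "AUDIO")


def resolve_message_type(attachment_types):
    """Pick one message type for a message carrying several attachments.

    `attachment_types` is an iterable of per-attachment type names.
    Hypothetical helper; the real adapters may structure this differently.
    """
    present = set(attachment_types)
    for kind in PRECEDENCE:
        if kind in present:
            return kind
    # No attachments at all: the message stays plain text.
    return "TEXT"
```

DOCUMENT ranks first because, as the Email commit notes, image handling keys off per-path MIME types while document injection gates strictly on the message type, so a mixed message classified as DOCUMENT still has its images handled.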
Summary
Inbound document attachments now classify as MessageType.DOCUMENT on Signal, Email, and SimpleX. Previously they were cached into media_urls but left as TEXT, so run.py's document-context injection (gated strictly on DOCUMENT) silently dropped the file and the agent never saw it. Fixes #12845.

Salvages PR #12851 by @kdunn926 (authorship preserved via cherry-pick) and widens the fix to the whole bug class per the contribution rubric.
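To make the failure mode concrete, here is a small sketch of a gate of the kind described. Everything in it is hypothetical: `InboundMessage`, `build_prompt`, the cache path, and the injected text format are invented for illustration and do not reflect run.py's actual code. It only shows why a cached file attached to a TEXT message never reaches the agent.

```python
from dataclasses import dataclass, field
from enum import Enum


class MessageType(Enum):
    TEXT = "text"
    DOCUMENT = "document"


@dataclass
class InboundMessage:
    # Hypothetical event shape; field names mirror those mentioned in the PR.
    text: str
    message_type: MessageType = MessageType.TEXT
    media_urls: list = field(default_factory=list)


def build_prompt(msg: InboundMessage) -> str:
    """Append document context only when the message is typed DOCUMENT."""
    if msg.message_type is MessageType.DOCUMENT and msg.media_urls:
        listing = ", ".join(msg.media_urls)
        return f"{msg.text}\n[attached documents: {listing}]"
    # Gate not satisfied: media_urls is ignored, even if it is populated.
    return msg.text


# Before the fix: the file is cached, but the type is still TEXT.
before = InboundMessage("see attached", MessageType.TEXT, ["/cache/report.pdf"])
# After the fix: same cached file, now classified as DOCUMENT.
after = InboundMessage("see attached", MessageType.DOCUMENT, ["/cache/report.pdf"])
```

In this sketch `build_prompt(before)` returns only the message text, while `build_prompt(after)` includes the cached path.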
Changes
- gateway/platforms/signal.py: application/* and text/* -> DOCUMENT (@kdunn926's fix), widened to video/* -> VIDEO and catch-all -> DOCUMENT, matching the WhatsApp/Slack/BlueBubbles/Mattermost pattern
- gateway/platforms/email.py: document attachments -> DOCUMENT (was image-only classification); DOCUMENT wins for mixed image+doc emails since image handling keys off per-path mime types while doc injection gates on message_type
- plugins/platforms/simplex/adapter.py: non-image/non-audio files -> DOCUMENT (was audio/image only)

Cross-platform audit
Validation
scripts/run_tests.sh tests/gateway/test_signal.py tests/gateway/test_email.py tests/gateway/test_simplex_plugin.py: 200 passed, 0 failed.