fix(telegram): treat image documents (PNG/JPG/WebP/GIF) as images by stabruriss · Pull Request #16710 · NousResearch/hermes-agent

stabruriss · 2026-04-27T21:03:06Z

Summary

Telegram messages where the user uploads a PNG/JPG/WebP/GIF as a file (instead of as a photo) arrive as msg.document, not msg.photo. The document handler currently rejects them with Unsupported document type '.png' because the image extensions aren't in SUPPORTED_DOCUMENT_TYPES. The vision tool never sees the image, so the agent replies that it can't see the file.
Route image-typed documents through the same image-cache path used for native photos: download → cache_image_from_bytes → MessageType.PHOTO → _queue_media_group_event / _enqueue_photo_event. This means albums and bursts of mixed photo/image-document messages are also buffered correctly.
Re-implements the idea from the previously closed PR #13215 ("fix: treat Telegram image documents as images").
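
To illustrate the album buffering described above, here is a minimal sketch. It is not the adapter's code: `PhotoEvent`, `MediaGroupBuffer`, and their fields are hypothetical stand-ins for whatever `_queue_media_group_event` does internally. It only shows the intended outcome, where a native photo and an image document that share a media group ID end up in a single event.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PhotoEvent:
    """Hypothetical stand-in for a gateway photo event."""
    media_group_id: Optional[str]
    image_paths: List[str] = field(default_factory=list)


class MediaGroupBuffer:
    """Collects photo events by media group ID until they are flushed."""

    def __init__(self) -> None:
        self._pending: Dict[str, PhotoEvent] = {}

    def add(self, event: PhotoEvent) -> Optional[PhotoEvent]:
        """Buffer grouped events; hand ungrouped events straight back."""
        if event.media_group_id is None:
            return event
        merged = self._pending.setdefault(
            event.media_group_id, PhotoEvent(event.media_group_id)
        )
        merged.image_paths.extend(event.image_paths)
        return None

    def flush(self, media_group_id: str) -> Optional[PhotoEvent]:
        """Return the merged event for a group, or None if nothing is pending."""
        return self._pending.pop(media_group_id, None)
```

In this sketch the caller would invoke `flush` once the album is considered complete (for example after a short timer); how the real adapter decides that is not stated in this PR.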

Changes

gateway/platforms/base.py: add SUPPORTED_IMAGE_TYPES (.png/.jpg/.jpeg/.webp/.gif).
gateway/platforms/telegram.py: in the msg.document branch, after video-document handling, route image documents through the photo pipeline. MIME-type fallback when filename is missing. 20 MB size cap consistent with other document types.
tests/gateway/test_telegram_documents.py: 6 new tests — PNG, JPEG, WebP, MIME-only resolution, oversized rejection, and a mixed-album case where a native photo and a PNG-as-document get buffered into one event.
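
The detection rule in the `telegram.py` change (extension first, MIME type as fallback, 20 MB cap) can be sketched roughly as follows. `SUPPORTED_IMAGE_TYPES` mirrors the constant named above; `MIME_TO_EXT`, `MAX_DOCUMENT_BYTES`, `resolve_image_ext`, and `should_route_as_photo` are hypothetical names for illustration, and treating 20 MB as 20 * 1024 * 1024 bytes is an assumption.

```python
from pathlib import PurePosixPath
from typing import Optional

SUPPORTED_IMAGE_TYPES = {".png", ".jpg", ".jpeg", ".webp", ".gif"}

# Assumed mapping for the MIME fallback; the PR does not spell it out.
MIME_TO_EXT = {
    "image/png": ".png",
    "image/jpeg": ".jpg",
    "image/webp": ".webp",
    "image/gif": ".gif",
}

# Assumed byte value for the 20 MB cap mentioned in the PR.
MAX_DOCUMENT_BYTES = 20 * 1024 * 1024


def resolve_image_ext(file_name: Optional[str], mime_type: Optional[str]) -> Optional[str]:
    """Return an image extension from the filename, else from the MIME type."""
    if file_name:
        ext = PurePosixPath(file_name).suffix.lower()
        if ext in SUPPORTED_IMAGE_TYPES:
            return ext
    if mime_type:
        return MIME_TO_EXT.get(mime_type.lower())
    return None


def should_route_as_photo(file_name: Optional[str], mime_type: Optional[str], file_size: int) -> bool:
    """True when the document looks like an image and fits under the size cap."""
    if resolve_image_ext(file_name, mime_type) is None:
        return False
    return file_size <= MAX_DOCUMENT_BYTES
```

For example, `should_route_as_photo(None, "image/webp", 2048)` is true under these assumptions, which corresponds to the MIME-only test case listed above.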

Test plan

`pytest tests/gateway/test_telegram_documents.py -q` — 42 passed (36 pre-existing + 6 new)
`python -m py_compile gateway/platforms/base.py gateway/platforms/telegram.py tests/gateway/test_telegram_documents.py`
Manual: send a PNG via Telegram with "send as file" checked; confirm the agent describes the image instead of replying "unsupported"

The upstream Dockerfile sets USER hermes before ENTRYPOINT, which prevents docker/entrypoint.sh from running its gosu-based chown block when Railway mounts a root-owned Volume at /opt/data. Starting as root lets the entrypoint fix ownership, then drop privileges via gosu.

The Telegram/Discord messaging gateway reads model.default only from config.yaml (see gateway/run.py _resolve_gateway_model), so the HERMES_MODEL env var has no effect on those gateways. Patch the entrypoint to rewrite config.yaml's model.default from HERMES_MODEL when set, making the env var the single source of truth in container deployments.

PyYAML writes scalar strings without quotes by default, so the previous regex (which required double-quoted values) silently no-op'd on real config.yaml files. Switch to a yaml.safe_load/safe_dump round trip — robust to either quoting style and to wholly missing model sections.
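
A rough sketch of the round trip described in this commit message, assuming the entrypoint delegates to a small Python helper. `set_model_default` and `sync_model_from_env` are hypothetical names, and the real entrypoint may differ; the sketch only shows why parsing the YAML avoids the quoting problem a regex has, and how a missing `model` section is handled.

```python
import os
from typing import Any, Dict, Optional


def set_model_default(config: Optional[Dict[str, Any]], model: str) -> Dict[str, Any]:
    """Return a copy of config with model.default set, creating the section if needed."""
    updated = dict(config or {})
    section = updated.get("model")
    # A missing or non-mapping "model" entry is replaced with a fresh mapping.
    section = dict(section) if isinstance(section, dict) else {}
    section["default"] = model
    updated["model"] = section
    return updated


def sync_model_from_env(path: str) -> bool:
    """Rewrite model.default in an existing config file from HERMES_MODEL, if set."""
    model = os.environ.get("HERMES_MODEL")
    if not model:
        return False
    import yaml  # PyYAML, imported lazily so set_model_default has no dependency

    with open(path, encoding="utf-8") as fh:
        loaded = yaml.safe_load(fh)
    updated = set_model_default(loaded if isinstance(loaded, dict) else None, model)
    with open(path, "w", encoding="utf-8") as fh:
        yaml.safe_dump(updated, fh, sort_keys=False)
    return True
```

One trade-off of this approach is that a `safe_load`/`safe_dump` round trip does not preserve comments or original formatting in the file.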

When a user uploaded a PNG via Telegram with "send as file" (so it arrives as msg.document, not msg.photo), the document handler rejected it with "Unsupported document type '.png'" because PNG is not in SUPPORTED_DOCUMENT_TYPES. The vision tool never saw the image. Route image-typed documents through the same image cache path used for native photos so the agent can actually see them. Album and burst buffering work the same as native photos. Re-implements the idea from closed PR NousResearch#13215.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

laydros · 2026-05-02T03:24:11Z

I hit this too and filed #18620 with repro details. This looks like the right PR for the fix.

teknium1 · 2026-06-10T09:26:16Z

This is already implemented on current main by the later Telegram image-document salvage work. Thanks for the report and patch; the prior discussion was useful, including @laydros confirming the same repro in #18620.

Automated hermes-sweeper review evidence:

gateway/platforms/base.py:1150 defines SUPPORTED_IMAGE_DOCUMENT_TYPES for .jpg, .jpeg, .png, .webp, and .gif.
gateway/platforms/telegram.py:5562 detects image documents by extension or image/* MIME, downloads the bytes, caches via cache_image_from_bytes, and sets MessageType.PHOTO.
gateway/platforms/telegram.py:5582 routes those image-document events through _queue_media_group_event / _enqueue_photo_event, matching the native photo batching path requested here.
tests/gateway/test_telegram_documents.py:264 covers a PNG sent as a Telegram document and asserts it is enqueued as a photo instead of going through the unsupported-document path.
The fix is present in commit 77c4675a50db7abbfd191d4fba4746b4f3e1559e ("fix(telegram): route image documents (.png/.jpg/.webp/.gif) through vision pipeline"), and the superseding salvage PR #28519, which carries the same title, was merged.

stabruriss and others added 5 commits April 20, 2026 14:59

Remove VOLUME directive for Railway compatibility

4c062b5

alt-glitch added labels Apr 27, 2026: type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), comp/gateway (Gateway runner, session dispatch, delivery), platform/telegram (Telegram bot adapter), tool/vision (Vision analysis and image generation)

This was referenced May 1, 2026

fix(gateway): add image types to SUPPORTED_DOCUMENT_TYPES #18142

Closed

[Bug]: Telegram incoming image files treated as document uploads #18620

Closed

kiranvk-2011 mentioned this pull request May 13, 2026

fix(telegram): route image documents (.png/.jpg/.webp/.gif) through vision pipeline #24815

Closed

banditburai mentioned this pull request May 30, 2026

fix(vision): unify vision_analyze image-source resolution through one resolver #35362

Open

teknium1 closed this Jun 10, 2026

teknium1 added the sweeper:implemented-on-main Sweeper: behavior already present on current main label Jun 10, 2026

stabruriss wants to merge 5 commits into NousResearch:main from stabruriss:stabruriss/tg-png-support
