feat(tui_gateway): add image.attach_bytes and pdf.attach methods #21908
Closed
ccook1963 wants to merge 1 commit into
Adds two JSON-RPC methods so clients (e.g. dashboards) can attach images and PDFs to a session in a single round-trip, without first transferring the file onto the host filesystem via SCP/SFTP.

image.attach_bytes: Accepts base64-encoded image bytes (with optional data: URL prefix or filename hint), writes to ~/.hermes/images/, and queues into the session's attached_images list. 25 MB cap, magic-byte sniffing for PNG/JPEG/GIF/WebP/BMP, defense-in-depth extension allowlist. Same response shape as image.attach.

pdf.attach: Accepts an on-disk PDF path or base64 PDF bytes, runs pdftoppm at 150 DPI to render each page to PNG, and queues each page image into attached_images. Anthropic's vision pipeline accepts images, not PDFs, so this fills the gap for clients that want 'drop a PDF into a chat' semantics. 50 MB PDF cap, 25 pages per call cap. Validates %PDF- magic bytes for base64 input.

Both methods reuse the existing _image_meta(), _IMAGE_EXTENSIONS allowlist, _hermes_home paths, and image_counter logic, so the existing prompt.submit native-attach handling picks them up unchanged.

Tests: 7 new pytest cases in tests/test_tui_gateway_server.py covering empty input, invalid base64, oversized payload, magic-byte sniffing, data URL prefix, on-disk PDF rendering, non-PDF rejection, page-range cap. pdf.attach tests skip cleanly when pdftoppm is not installed.
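The magic-byte sniffing mentioned above can be illustrated with a minimal sketch. This is not the PR's code: the function names `sniff_image_ext` and `looks_like_pdf` and the signature table are assumptions for illustration, built only from the well-known leading bytes of each format.

```python
from typing import Optional

# Leading-byte signatures for the formats the description lists.
# Hypothetical table; the gateway's real helper may differ.
_SIGNATURES = (
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"\xff\xd8\xff", ".jpg"),
    (b"GIF87a", ".gif"),
    (b"GIF89a", ".gif"),
    (b"BM", ".bmp"),
)


def sniff_image_ext(data: bytes) -> Optional[str]:
    """Return a file extension inferred from the payload's first bytes, or None."""
    for magic, ext in _SIGNATURES:
        if data.startswith(magic):
            return ext
    # WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
    if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return ".webp"
    return None


def looks_like_pdf(data: bytes) -> bool:
    """Check the %PDF- header that pdf.attach is described as validating."""
    return data.startswith(b"%PDF-")
```

Sniffing the bytes rather than trusting the filename means a mislabeled upload is either stored with the right extension or rejected, which is presumably why the description pairs it with an extension allowlist.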
teknium1 added a commit that referenced this pull request on Jun 7, 2026
…splay gateway images over the network
Desktop connected to a remote gateway can now attach images and PDFs and
display agent-written images. Previously the desktop passed a LOCAL file path
to image.attach; on a remote gateway that path doesn't exist, so the image was
silently dropped ("skipped unreadable path") and the vision model never saw it.
The reverse direction was also broken — images the agent wrote on the gateway
rendered as dead links in the remote client.
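To make the fix concrete, here is a hedged client-side sketch of sending bytes instead of a local path. The method name and the `session_id`, `content_base64`, and `filename` parameters come from the PR description; the `jsonrpc` and `id` envelope fields follow JSON-RPC 2.0 and are an assumption about what the gateway expects, and `build_attach_bytes_request` is a hypothetical helper name.

```python
import base64
from typing import Any, Dict, Optional


def build_attach_bytes_request(
    session_id: str,
    image_bytes: bytes,
    filename: Optional[str] = None,
    request_id: int = 1,
) -> Dict[str, Any]:
    """Build an image.attach_bytes request carrying the image inline as base64."""
    params: Dict[str, Any] = {
        "session_id": session_id,
        "content_base64": base64.b64encode(image_bytes).decode("ascii"),
    }
    if filename:
        # Optional hint; the description says it drives the extension.
        params["filename"] = filename
    return {
        "jsonrpc": "2.0",  # assumed envelope, per JSON-RPC 2.0
        "id": request_id,  # assumed envelope, per JSON-RPC 2.0
        "method": "image.attach_bytes",
        "params": params,
    }
```

Because the payload travels in the request itself, the path never has to exist on the gateway host, which is the failure mode described above.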
Gateway (tui_gateway/server.py):
- image.attach_bytes: base64 byte upload written into the gateway's own images
dir and queued via the existing native-image-attach pipeline. Magic-byte
extension sniffing, data-URL prefix + whitespace tolerance, 25 MB cap,
structured error codes. Accepts content_base64/filename (canonical) and
data/ext (older-desktop aliases).
- pdf.attach: renders each page to PNG via pdftoppm (poppler-utils) at 150 DPI
and queues the pages as images; 50 MB / 25-page caps. Accepts host path or
base64 upload.
- Shared helpers (_decode_attach_base64, _sniff_image_ext, _queue_attached_image)
so the two methods and the existing image.attach don't duplicate logic.
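The decode step described in these bullets (data-URL prefix, whitespace tolerance, size cap) might look roughly like the sketch below. It is not the actual `_decode_attach_base64`; the names `decode_attach_base64` and `AttachError` are hypothetical, codes 4017 and 4018 are taken from the original PR description, and using 4017 for an empty payload is an assumption.

```python
import base64
import binascii
import re

MAX_IMAGE_BYTES = 25 * 1024 * 1024  # 25 MB cap from the description

# Matches a leading "data:<mime>;base64," prefix.
_DATA_URL_PREFIX = re.compile(r"^data:[^,]*;base64,", re.IGNORECASE)


class AttachError(Exception):
    """Hypothetical structured error carrying a numeric code."""

    def __init__(self, code: int, message: str) -> None:
        super().__init__(message)
        self.code = code


def decode_attach_base64(payload: str, max_bytes: int = MAX_IMAGE_BYTES) -> bytes:
    """Strip an optional data-URL prefix and whitespace, then decode strictly."""
    text = _DATA_URL_PREFIX.sub("", payload.strip(), count=1)
    text = re.sub(r"\s+", "", text)
    if not text:
        # Assumption: empty input is reported with the invalid-base64 code.
        raise AttachError(4017, "empty payload")
    try:
        data = base64.b64decode(text, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise AttachError(4017, "invalid base64") from exc
    if len(data) > max_bytes:
        raise AttachError(4018, "payload exceeds size cap")
    return data
```

Passing `validate=True` matters here: without it, `base64.b64decode` silently discards characters outside the alphabet, so malformed input could decode to unexpected bytes instead of failing.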
Gateway (hermes_cli/web_server.py):
- GET /api/media: returns a gateway-local image as a base64 data URL so remote
clients can display it. Auth-gated like every /api route, extension
allowlist + size cap, AND confined to the gateway's own media roots
(images/screenshots/cache, resolved symlink-safe) so an authed caller can't
read image-extension files anywhere on disk.
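The root-confinement check can be sketched as follows. This is an illustration under assumptions, not the endpoint's code: `resolve_media_path` and `ALLOWED_EXTS` are hypothetical names, and the sketch covers only the extension allowlist and the symlink-safe containment test (no auth or size cap).

```python
from pathlib import Path
from typing import Iterable, Optional

# Hypothetical allowlist mirroring the image formats named in this PR.
ALLOWED_EXTS = {".png", ".jpg", ".jpeg", ".gif", ".webp", ".bmp"}


def resolve_media_path(requested: str, roots: Iterable[Path]) -> Optional[Path]:
    """Return the resolved path if it is an allowed image under a media root."""
    # resolve() collapses ".." segments and follows symlinks, so the
    # containment test below runs against the real on-disk location.
    candidate = Path(requested).resolve()
    if candidate.suffix.lower() not in ALLOWED_EXTS:
        return None
    for root in roots:
        if candidate.is_relative_to(root.resolve()):  # Python 3.9+
            return candidate
    return None
```

Resolving before comparing is the important design choice: a plain string-prefix check on the requested path would accept `images/../../secret.png` or a symlink inside the images directory that points elsewhere on disk.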
Desktop (apps/desktop):
- syncImageAttachmentsForSubmit uploads bytes via image.attach_bytes when the
connection mode is 'remote'; the local fast path is unchanged.
- media.ts gains isRemoteGateway() + gatewayMediaDataUrl(); directive-text and
markdown-text fetch images over /api/media in remote mode.
Consolidates the competing remote-media PRs (#38876, #40317, #21908, #39437)
into one coherent implementation, taking the strongest parts of each and adding
shared-helper cleanup plus the /api/media root-confinement hardening on top.
The per-profile gateway switching from #38876 is intentionally left out as a
separable feature. TUI file uploads (#40492) remain a separate surface.
Tested: 11 new tui_gateway tests + 5 /api/media endpoint tests + desktop
media.remote unit tests; full tui_gateway + web_server suites green (472
passed); tsc -b clean; E2E verified the full attach→disk→queue and
gateway-path→data-URL display round-trip plus the out-of-root security block.
Co-authored-by: Max Mitcham <maxmitcham@mac.home>
Co-authored-by: Justlrnal4 <Justlrnal4@users.noreply.github.com>
Co-authored-by: Chris Cook <ccook@nvms.com>
Co-authored-by: Thomas Paquette <thomas.paquette@gmail.com>
Merged via #41336 (commit 16786f3 on […] The remote-media work from this cluster was consolidated into one coherent implementation […] Verified live end-to-end over the real dashboard stack (real […]
changman pushed a commit to changman/hermes-agent that referenced this pull request on Jun 10, 2026
…splay gateway images over the network
alt-glitch pushed a commit that referenced this pull request on Jun 14, 2026
…splay gateway images over the network
Summary
Adds two JSON-RPC methods to the TUI gateway so clients (web dashboards, mobile apps, etc.) can attach images and PDFs to a session in a single round-trip without first transferring the file onto the host filesystem via SCP/SFTP.
- image.attach_bytes: base64 image upload
- pdf.attach: PDF → per-page PNGs via pdftoppm, queued as image attachments

Both methods reuse the existing _image_meta(), _IMAGE_EXTENSIONS allowlist, _hermes_home paths, and image_counter logic. The next prompt.submit picks the queued attachments up via the existing native-image-attach pipeline, with no changes to attachment delivery downstream.

Why
Today, the only way for a remote client to attach an image to a Hermes session is image.attach, which expects a path on the host filesystem. That requires SCP/SFTP first. For a web dashboard the round-trip is awkward: every user upload becomes POST → server-side SCP → JSON-RPC image.attach. Hermie's dashboard hit this and we ended up writing a local patch.

PDFs have a related gap: Anthropic's vision pipeline accepts images, not PDFs. A user can't drop a PDF into chat and have Claude read it without page-rendering server-side. pdf.attach does that with pdftoppm (poppler-utils), 150 DPI per page, which is readable for vision without being absurdly large.

Method shapes
image.attach_bytes

    {
      "method": "image.attach_bytes",
      "params": {
        "session_id": "<sid>",
        "content_base64": "iVBORw0KGgo...",  // raw OR data:image/png;base64,...
        "filename": "screenshot.png"         // optional, drives extension
      }
    }
    // → { attached, path, count, remainder, text, bytes, name, width, height, token_estimate }

- Tolerates a data:image/...;base64, prefix and embedded whitespace
- Writes the image to the gateway's images directory (~/.hermes/images/)

pdf.attach

    {
      "method": "pdf.attach",
      "params": {
        "session_id": "<sid>",
        "path": "/path/to.pdf",               // OR
        "content_base64": "JVBERi0xLjQK...",  // OR data:application/pdf;base64,...
        "filename": "report.pdf",             // optional, for display
        "first_page": 1,                      // optional
        "last_page": 10                       // optional, default = first + 24
      }
    }
    // → { attached, filename, pages_attached, pages: [{path, page, name, width, height, token_estimate}, ...], count, text }

- Validates %PDF- magic bytes for base64 input
- Returns a 5028 error if pdftoppm is not on PATH
- Queues each page image into attached_images so prompt.submit picks them up

Tests
7 new pytest cases in tests/test_tui_gateway_server.py:
- test_image_attach_bytes_accepts_raw_base64
- test_image_attach_bytes_strips_data_url_prefix
- test_image_attach_bytes_rejects_invalid_base64 (4017)
- test_image_attach_bytes_rejects_oversized (4018)
- test_pdf_attach_renders_pages_when_pdftoppm_available (skips if poppler not installed)
- test_pdf_attach_rejects_non_pdf_payload
- test_pdf_attach_rejects_oversized_page_range (4019)

All 9 image_attach/pdf_attach tests pass on Python 3.12 with poppler-utils 24.02.0.

Out-of-band testing
This patch has been running on a 5-VPS production fleet for ~1 hour as of this PR. No regressions observed.
Dependencies
- pdf.attach requires pdftoppm (apt install poppler-utils on Debian/Ubuntu, brew install poppler on macOS)
- Otherwise standard library only (base64, re, subprocess, tempfile, shutil)
- No changes to the AGENTS.md slash-command registry, gateway hooks, or platform adapters

Error codes
Reuses the existing convention from image.attach and clipboard.paste. New codes:
- 4017: invalid base64 / not a PDF
- 4018: payload exceeds size cap
- 4019: page range exceeds cap
- 5028: pdftoppm not installed / failed / timed out
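As a rough illustration of where 4019 and 5028 would arise, the sketch below resolves the page range with the documented default (last = first + 24), enforces the 25-page cap, and builds a pdftoppm invocation at 150 DPI. The helper names and the `PdfAttachError` class are hypothetical; the `-png`, `-r`, `-f`, and `-l` flags are standard pdftoppm options, but the exact command the PR runs is not shown in the source.

```python
import shutil
from typing import List, Optional, Tuple

MAX_PAGES_PER_CALL = 25  # per-call cap from the description
RENDER_DPI = 150         # render resolution from the description


class PdfAttachError(Exception):
    """Hypothetical structured error carrying one of the codes listed above."""

    def __init__(self, code: int, message: str) -> None:
        super().__init__(message)
        self.code = code


def resolve_page_range(
    first_page: Optional[int] = None, last_page: Optional[int] = None
) -> Tuple[int, int]:
    """Apply the defaults and reject ranges wider than the cap (4019)."""
    first = 1 if first_page is None else first_page
    last = first + MAX_PAGES_PER_CALL - 1 if last_page is None else last_page
    if first < 1 or last < first:
        # The source does not say how a malformed range is reported.
        raise ValueError("page range must satisfy 1 <= first <= last")
    if last - first + 1 > MAX_PAGES_PER_CALL:
        raise PdfAttachError(4019, "page range exceeds cap")
    return first, last


def find_pdftoppm() -> str:
    """Locate pdftoppm, or raise 5028 if it is not on PATH."""
    exe = shutil.which("pdftoppm")
    if exe is None:
        raise PdfAttachError(5028, "pdftoppm is not installed")
    return exe


def build_pdftoppm_command(
    exe: str, pdf_path: str, out_prefix: str, first: int, last: int
) -> List[str]:
    """Argument list that renders pages first..last to PNG files."""
    return [
        exe, "-png", "-r", str(RENDER_DPI),
        "-f", str(first), "-l", str(last),
        pdf_path, out_prefix,
    ]
```

A caller would pass the resulting list to `subprocess.run` with a timeout and map a failure or timeout to 5028, matching the "not installed / failed / timed out" wording above.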