feat(desktop+gateway): remote media relay — attach images/PDFs & display gateway images over the network#41203

Closed
teknium1 wants to merge 1 commit into main from hermes/hermes-e2178603


Conversation

@teknium1 commented Jun 7, 2026 (Contributor)

Summary

Desktop connected to a remote gateway can now attach images and PDFs and display agent-written images — the whole remote-media round-trip works for the first time.

Previously the desktop passed a local file path to image.attach. On a remote gateway (VPS) that path doesn't exist, so the upload was silently dropped ("skipped unreadable path") and the vision model never saw the image. The reverse direction was broken too: images the agent wrote on the gateway rendered as dead links in the remote client.

This consolidates the competing remote-media PRs (#38876, #40317, #21908, #39437) into one coherent implementation, taking the strongest part of each and adding shared-helper cleanup + a security hardening of /api/media on top.

Changes

Gateway — tui_gateway/server.py

  • image.attach_bytes: base64 byte upload, written into the gateway's own images/ dir and queued via the existing native-image-attach pipeline. Magic-byte extension sniffing, data-URL prefix + whitespace tolerance, 25 MB cap, structured error codes. Accepts content_base64/filename (canonical) and data/ext (older-desktop aliases).
  • pdf.attach: renders each page to PNG via pdftoppm (poppler-utils) at 150 DPI and queues the pages as images; 50 MB / 25-page caps. Accepts host path or base64 upload.
  • Shared helpers (_decode_attach_base64, _sniff_image_ext, _queue_attached_image) so the two new methods and the existing image.attach don't duplicate logic.
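
The decode-and-sniff steps described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the function names, the `AttachError` class, and the string error codes are hypothetical stand-ins (the real helpers are `_decode_attach_base64` and `_sniff_image_ext`, whose signatures are not shown in this PR text), and the set of recognized formats is an assumption.

```python
import base64
import binascii
import re
from typing import Optional

MAX_IMAGE_BYTES = 25 * 1024 * 1024  # 25 MB cap, as stated in the PR description

# Matches an optional "data:<mime>;base64," prefix (assumed tolerance rule).
_DATA_URL_PREFIX = re.compile(r"^data:[^,]*;base64,", re.IGNORECASE)

# Leading magic bytes for a few common formats (illustrative subset).
_MAGIC = (
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"\xff\xd8\xff", ".jpg"),
    (b"GIF87a", ".gif"),
    (b"GIF89a", ".gif"),
)


class AttachError(ValueError):
    """Hypothetical structured error: a machine-readable code plus a message."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(message)
        self.code = code


def decode_attach_base64(payload: str, max_bytes: int = MAX_IMAGE_BYTES) -> bytes:
    """Decode an upload, tolerating a data-URL prefix and stray whitespace."""
    text = _DATA_URL_PREFIX.sub("", payload.strip(), count=1)
    text = "".join(text.split())  # drop newlines/spaces inside the payload
    if not text:
        raise AttachError("empty_payload", "no base64 data supplied")
    # Reject oversized uploads before decoding: n bytes encode to 4*ceil(n/3) chars.
    if len(text) > (max_bytes // 3 + 1) * 4:
        raise AttachError("too_large", f"payload exceeds {max_bytes} bytes")
    try:
        raw = base64.b64decode(text, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise AttachError("invalid_base64", str(exc)) from exc
    if len(raw) > max_bytes:
        raise AttachError("too_large", f"payload exceeds {max_bytes} bytes")
    return raw


def sniff_image_ext(raw: bytes) -> Optional[str]:
    """Pick an extension from the file's leading bytes, ignoring any claimed name."""
    for magic, ext in _MAGIC:
        if raw.startswith(magic):
            return ext
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if raw[:4] == b"RIFF" and raw[8:12] == b"WEBP":
        return ".webp"
    return None
```

Sniffing the extension from the bytes rather than trusting the client-supplied filename means a mislabeled upload still lands on disk with an extension that matches its content.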

Gateway — hermes_cli/web_server.py

  • GET /api/media: returns a gateway-local image as a base64 data URL so remote clients can display it. Auth-gated like every /api route, extension allowlist + size cap, and confined to the gateway's own media roots (images/screenshots/cache, resolved symlink-safe) so an authed caller can't read image-extension files anywhere on disk.
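
A symlink-safe root check of the kind described above might look like the sketch below. It is an assumption-laden illustration: `resolve_within_roots` is a hypothetical name, and the PR's own logic lives behind `_media_serve_roots()` in hermes_cli/web_server.py, which is not reproduced here.

```python
from pathlib import Path
from typing import Iterable, Optional


def resolve_within_roots(requested: str, roots: Iterable[Path]) -> Optional[Path]:
    """Return the resolved path if it lies under an allowed root, else None.

    Both sides are resolved first, so ".." segments and symlinks that point
    outside the roots are judged by where they actually lead.
    """
    candidate = Path(requested).resolve()
    for root in roots:
        root_resolved = root.resolve()
        if candidate == root_resolved:
            continue  # the root directory itself is not a servable file
        # Component-wise comparison (Python 3.9+): "/media/images_evil" does
        # not count as being inside "/media/images".
        if candidate.is_relative_to(root_resolved):
            return candidate
    return None
```

A caller would map a `None` result to the 403 mentioned in the validation notes and only open the returned, already-resolved path, never the raw request string.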

Desktop — apps/desktop

  • syncImageAttachmentsForSubmit uploads bytes via image.attach_bytes when the connection mode is 'remote'; the local fast path is unchanged.
  • media.ts gains isRemoteGateway() + gatewayMediaDataUrl(); directive-text and markdown-text fetch images over /api/media in remote mode.

scripts/release.py — AUTHOR_MAP entries for the salvaged contributors.

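What's intentionally out of scope

  • Per-profile gateway switching from #38876: left out as a separable feature.
  • TUI file uploads (#40492): remain a separate surface.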

Validation

| | Before | After |
|---|---|---|
| Remote image attach | local path → dropped, vision blind | bytes uploaded → gateway disk → queued |
| Remote PDF attach | not possible | rendered to per-page PNGs, queued |
| Remote image display | dead "Open …png" link | fetched via /api/media as data URL |
| /api/media scope | n/a | root-confined, ext-allowlisted, size-capped, auth-gated |

  • 11 new tui_gateway tests + 5 /api/media endpoint tests + desktop media.remote unit tests.
  • Full tui_gateway + web_server suites green: 472 passed.
  • tsc -b clean on the desktop app.
  • E2E verified the full attach → gateway disk → queue and gateway-path → data-URL display round-trip (real base64, real file I/O), plus the out-of-root security block (403) and invalid-base64 rejection (4017).

Credit

Consolidates the work of @maxtrigify (#38876), @Justlrnal4 (#40317), @ccook1963 (#21908), and @RyTsYdUp (#39437), preserved via Co-authored-by trailers on the commit. Each PR will be closed pointing here.

Infographic

[Image: remote-media-relay]

…splay gateway images over the network

Desktop connected to a remote gateway can now attach images and PDFs and
display agent-written images. Previously the desktop passed a LOCAL file path
to image.attach; on a remote gateway that path doesn't exist, so the image was
silently dropped ("skipped unreadable path") and the vision model never saw it.
The reverse direction was also broken — images the agent wrote on the gateway
rendered as dead links in the remote client.

Gateway (tui_gateway/server.py):
- image.attach_bytes: base64 byte upload written into the gateway's own images
  dir and queued via the existing native-image-attach pipeline. Magic-byte
  extension sniffing, data-URL prefix + whitespace tolerance, 25 MB cap,
  structured error codes. Accepts content_base64/filename (canonical) and
  data/ext (older-desktop aliases).
- pdf.attach: renders each page to PNG via pdftoppm (poppler-utils) at 150 DPI
  and queues the pages as images; 50 MB / 25-page caps. Accepts host path or
  base64 upload.
- Shared helpers (_decode_attach_base64, _sniff_image_ext, _queue_attached_image)
  so the two methods and the existing image.attach don't duplicate logic.

Gateway (hermes_cli/web_server.py):
- GET /api/media: returns a gateway-local image as a base64 data URL so remote
  clients can display it. Auth-gated like every /api route, extension
  allowlist + size cap, AND confined to the gateway's own media roots
  (images/screenshots/cache, resolved symlink-safe) so an authed caller can't
  read image-extension files anywhere on disk.
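
Once a path has passed the root check, turning it into a data URL is short. The sketch below is a guess at the shape of that step: the allowlist contents, the 10 MB limit, and the name `file_to_data_url` are all assumptions, since the commit message gives neither the allowed extensions nor the size cap for /api/media.

```python
import base64
from pathlib import Path

# Assumed allowlist mapping extension to MIME type.
ALLOWED_MEDIA = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".gif": "image/gif",
    ".webp": "image/webp",
}
MAX_MEDIA_BYTES = 10 * 1024 * 1024  # assumed cap, not taken from the PR


def file_to_data_url(path: Path, max_bytes: int = MAX_MEDIA_BYTES) -> str:
    """Encode an allowlisted image file as a base64 data URL."""
    mime = ALLOWED_MEDIA.get(path.suffix.lower())
    if mime is None:
        raise ValueError(f"extension not allowed: {path.suffix!r}")
    if path.stat().st_size > max_bytes:
        raise ValueError("file exceeds size cap")
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

Returning a data URL lets the remote client render the image directly in an `<img>` tag without needing a second authenticated fetch for the raw bytes.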

Desktop (apps/desktop):
- syncImageAttachmentsForSubmit uploads bytes via image.attach_bytes when the
  connection mode is 'remote'; the local fast path is unchanged.
- media.ts gains isRemoteGateway() + gatewayMediaDataUrl(); directive-text and
  markdown-text fetch images over /api/media in remote mode.

Consolidates the competing remote-media PRs (#38876, #40317, #21908, #39437)
into one coherent implementation, taking the strongest parts of each and adding
shared-helper cleanup plus the /api/media root-confinement hardening on top.
The per-profile gateway switching from #38876 is intentionally left out as a
separable feature. TUI file uploads (#40492) remain a separate surface.

Tested: 11 new tui_gateway tests + 5 /api/media endpoint tests + desktop
media.remote unit tests; full tui_gateway + web_server suites green (472
passed); tsc -b clean; E2E verified the full attach→disk→queue and
gateway-path→data-URL display round-trip plus the out-of-root security block.

Co-authored-by: Max Mitcham <maxmitcham@mac.home>
Co-authored-by: Justlrnal4 <Justlrnal4@users.noreply.github.com>
Co-authored-by: Chris Cook <ccook@nvms.com>
Co-authored-by: Thomas Paquette <thomas.paquette@gmail.com>

github-actions (bot) commented Jun 7, 2026 (Contributor)

🔎 Lint report: hermes/hermes-e2178603 vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 10017 on HEAD, 10001 on base (🆕 +16)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 5189 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@alt-glitch added the type/feature, P3, comp/gateway, and tool/vision labels on Jun 7, 2026

@OutThisLife commented (Collaborator)

Already landed on main as 16786f3 (same commit title), with teknium's and the four co-authors' authorship preserved via Co-authored-by. The PR stayed open only because it merged as a re-authored squash rather than through the GitHub button, and its branch sits on a months-old main (hence the conflict flag).

Verified the whole round-trip is live on main: image.attach_bytes and pdf.attach plus the shared helpers in tui_gateway/server.py, GET /api/media confined to _media_serve_roots() in hermes_cli/web_server.py, the remote byte-upload in use-prompt-actions.ts, and gateway-image display via gatewayMediaDataUrl/isRemoteGateway in directive-text.tsx and markdown-text.tsx. The companion file.attach path for non-image files (documents, CSVs) is on main too via dbbd1d4. Closing as merged.

@OutThisLife closed this Jun 9, 2026
OutThisLife added a commit that referenced this pull request Jun 10, 2026
Finder/OS drops became `@file:/Users/...` refs that only resolve when the
gateway shares the local disk, so on a remote gateway non-image files
(PDF/CSV/Markdown/...) never reached the agent. Route OS drops through the
file.attach / image.attach_bytes upload pipeline — in-app project-tree and
gutter drags stay inline workspace-relative refs — across every drop surface:
the conversation area, the composer form, the contenteditable input, and the
message-edit composer (which still reproduced the bug).

Also:
- upload dropped files eagerly when a session exists, so the card shows a
  spinner instead of stalling the send (images stay submit-time to avoid
  racing their thumbnail write);
- round the attachment card and drop the monospace detail;
- render image previews from the bytes we already hold, so a pasted/dropped
  screenshot shows its thumbnail and previews even when its only on-disk copy
  is a transient path (the data URL is not persisted to localStorage).

Supersedes #38615, #41203.

Co-authored-by: LeonSGP <154585401+LeonSGP43@users.noreply.github.com>
Co-authored-by: Teknium <127238744+teknium1@users.noreply.github.com>
wachoo pushed a commit to wachoo/hermes-agent that referenced this pull request Jun 10, 2026
changman pushed a commit to changman/hermes-agent that referenced this pull request Jun 10, 2026
