Problem
When Hermes runs in a sandboxed terminal backend (Docker, SSH, Modal, Singularity), the agent creates files inside the sandbox but has no way to send them to the user. Similarly, when users send file attachments on messaging platforms, those files are not injected into the sandbox.
This is the #1 missing capability for production sandboxed deployments.
What We Already Have
The foundational pieces exist but are not connected:
1. Base64 File Transfer (environments/tool_context.py)
Working upload_file() / download_file() with chunked base64 piping between host and sandbox. Currently only used by RL ToolContext — NOT exposed as an agent tool.
2. MEDIA: Tag System (gateway/platforms/base.py)
Agent includes MEDIA:/path/to/file in response text. BasePlatformAdapter.extract_media() detects these tags and routes files by extension (images → send_image, audio → send_voice, documents → send_document). Only works for host-local files.
3. Platform Send Methods
| Platform | send_document | send_image_file | send_video | Status |
| --- | --- | --- | --- | --- |
| Telegram | ❌ MISSING | ❌ MISSING | ❌ MISSING | Falls back to "📎 File: /path" text |
| Discord | ❌ MISSING | ❌ MISSING | ❌ MISSING | Same fallback |
| Slack | ❌ MISSING | ❌ MISSING | ❌ MISSING | Same fallback |
| WhatsApp | ✅ | ✅ | ✅ | Full support via bridge |
Platform adapters are the #1 blocker. Even if send_file existed today, files can't be delivered on 3/4 messaging platforms.
Research
Investigated 50+ sources across cloud IDEs, container orchestration, CI/CD artifact systems, AI APIs, notebook environments, novel transfer protocols, and agent codebases.
How Other Agents Handle This
| Agent/Platform | Approach |
| --- | --- |
| OpenHands | Workspace SDK with file_upload()/file_download(), Docker cp, download workspace as ZIP |
| Codex / Claude Code | OS-level sandboxing on local FS; avoids the problem entirely |
| Cursor Cloud | Git as file transfer protocol (push to branch, pull from GitHub) |
| E2B | sandbox.files.read()/write() + pre-signed URLs (gold standard) |
| Composio | Schema-annotated file_uploadable/file_downloadable + S3 intermediary |
| Modal | Volume sync, sb.open() Filesystem API, CloudBucketMounts |
| Coder | REST API with tar/zip upload to content-addressed store + Mutagen sync |
| Pi-Mono | Runtime bridge with returnDownloadableFile() (web-specific) |
| Jupyter | Contents REST API (base64-in-JSON) + /files/ static serving |
| Colab | files.download() triggers browser download + Drive mount for persistence |
How AI APIs Handle Files
| API | Pattern |
| --- | --- |
| OpenAI Responses | Container → file_id → client.containers.files.content(cntr, file_id) |
| Anthropic Claude | Files API → file_id reference → download for tool-created files only |
| Google Gemini | Upload → file_id → 48-hour auto-expiry → no download (input only) |
| E2B | Direct SDK read/write + sandbox.downloadUrl() with presigned URLs |
Container File Transfer Patterns
| Method | Mechanism | Pros | Cons | Best For |
| --- | --- | --- | --- | --- |
| Docker Archive API | get_archive/put_archive (tar stream) | Zero base64 overhead, binary-safe, no container deps | Requires Docker socket | Docker sandboxes |
| docker cp | Wraps Archive API | Simple CLI | Same as above | Quick transfers |
| SCP/SFTP | Direct file transfer over SSH | Fast, resumable, proven | Requires SSH setup | SSH sandboxes |
| Base64 over exec | base64 < file \| decode | Works on ANY exec channel | 33% overhead, slow, unreliable for large binary | Universal fallback |
| kubectl cp | tar stream over exec API | Built into k8s | Requires tar in container | Kubernetes |
| Volume mounts | Shared filesystem | Zero-copy, real-time | Must configure at start | Known paths |
| Presigned URLs | Upload to S3/storage, share URL | Secure, scales, works in browsers | Requires object storage | Large files, messaging |
| transfer.sh | Self-hosted HTTP file hosting | Dead simple, one curl to upload | Requires HTTP outbound | Ephemeral sharing |
Key Insight: Docker Archive API > Base64
The current base64-over-exec approach is the worst-performing option in the table above (33% size overhead, slow, unreliable for large binaries). Docker's Archive API is what docker cp uses internally:
# Download: tar stream, no base64 overhead, handles binary perfectly
bits, stat = container.get_archive("/workspace/report.pdf")
# Upload: accepts tar archive bytes
container.put_archive("/workspace/", tar_data)
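A minimal sketch of handling the tar stream on the host side, assuming docker-py, where get_archive() returns the archive as an iterable of byte chunks plus a stat dict. The helper names extract_single_file and make_tar are hypothetical and not part of Hermes or docker-py.

```python
import io
import tarfile


def extract_single_file(chunks):
    """Join tar chunks (as yielded by get_archive) and return (name, bytes)
    for the first regular file in the archive."""
    buf = io.BytesIO(b"".join(chunks))
    with tarfile.open(fileobj=buf, mode="r:*") as tar:
        for member in tar:
            if member.isfile():
                return member.name, tar.extractfile(member).read()
    raise FileNotFoundError("archive contains no regular file")


def make_tar(name, data):
    """Wrap raw bytes in an in-memory tar archive, the shape put_archive expects."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()
```

Under these assumptions, a download would pass the first value returned by container.get_archive(...) to extract_single_file(), and an upload would pass make_tar(...) to container.put_archive(...). Buffering the whole archive in memory is only reasonable for small files; larger files would need a streaming variant.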
Key Insight: Compression Before Base64
Inspired by the Kitty terminal file transfer protocol: gzip before base64 reduces payload 50-80% for text files. This should be the default for the base64 fallback path.
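A minimal sketch of the compress-then-encode idea for the fallback path. The function names pack and unpack are hypothetical; the actual chunking in environments/tool_context.py is not shown here and may differ.

```python
import base64
import gzip


def pack(data: bytes) -> str:
    """gzip first, then base64, so the text-safe payload is as small as possible."""
    return base64.b64encode(gzip.compress(data)).decode("ascii")


def unpack(payload: str) -> bytes:
    """Reverse of pack(): base64-decode, then gunzip."""
    return gzip.decompress(base64.b64decode(payload))
```

Inside the sandbox, an equivalent payload could likely be produced with `gzip -c file | base64`, assuming both tools are installed there. Already-compressed formats (ZIP, JPEG, most video) gain little from the gzip step.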
Messaging Platform File Limits
| Platform | Upload Limit | Notes |
| --- | --- | --- |
| Telegram | 50 MB | Self-hosted Bot API server raises the limit to 2 GB |
| Discord | 25 MB | 500 MB with Nitro |
| Slack | 1 GB | Paid plans |
| WhatsApp | 100 MB | Documents; 16 MB for images |
For files exceeding these limits → upload to temp storage, send download URL.
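The rule above can be sketched as a simple lookup. This is an illustration only: the PLATFORM_LIMITS dict and delivery_mode function are hypothetical names, the limits are the default values listed in this section, and 1 MB is assumed to mean 1024 × 1024 bytes.

```python
MB = 1024 * 1024

# Default upload limits as listed in this section (assumed, not queried from the platforms).
PLATFORM_LIMITS = {
    "telegram": 50 * MB,
    "discord": 25 * MB,
    "slack": 1024 * MB,
    "whatsapp": 100 * MB,
}


def delivery_mode(platform: str, size_bytes: int) -> str:
    """Return "direct" if the file fits the platform limit, otherwise "url"."""
    limit = PLATFORM_LIMITS.get(platform.lower())
    if limit is None:
        raise ValueError(f"unknown platform: {platform}")
    return "direct" if size_bytes <= limit else "url"
```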
Proposed Solution
Architecture
┌─────────────────────────────────────────────────────┐
│                   send_file Tool                    │
│             send_file(path, [message])              │
└──────────────────────┬──────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────┐
│          File Transfer Layer (per-backend)          │
│  Docker:      get_archive API (tar stream)          │
│  SSH:         SCP/SFTP via ControlMaster connection │
│  Modal:       sb.open() Filesystem API              │
│  Singularity: bind mount or gzip+base64 over exec   │
│  Local:       direct filesystem (no transfer)       │
└──────────────────────┬──────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────┐
│            Delivery Layer (per-frontend)            │
│  CLI:      copy to CWD, print path                  │
│  Telegram: bot.send_document() (< 50 MB)            │
│  Discord:  discord.File() (< 25 MB)                 │
│  Slack:    files_upload_v2() (< 1 GB)               │
│  WhatsApp: send_document() via bridge               │
│  Fallback: presigned URL for oversized files        │
└─────────────────────────────────────────────────────┘
The send_file Tool
{
  "name": "send_file",
  "description": "Send a file from the terminal environment to the user. Works across all backends and platforms.",
  "parameters": {
    "path": {
      "type": "string",
      "description": "Path to the file inside the terminal environment"
    },
    "message": {
      "type": "string",
      "description": "Optional caption/message to send with the file"
    }
  }
}
Flow:
- Agent calls send_file(path="/workspace/report.pdf")
- Tool checks file exists + gets size via terminal
- Local: file already on host, get absolute path
- Docker: extract via get_archive API (tar stream, no base64)
- SSH: extract via SCP over existing ControlMaster
- Modal/Singularity: extract via gzip+base64 over exec
- Save to ~/.hermes/file_cache/<uuid>_<filename>
- CLI: copy to user CWD
- Gateway: return MEDIA:<host_path> → platform sends via send_document()
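The bookkeeping parts of this flow (choosing a transport, naming the cache file, building the MEDIA: tag) could look roughly like the sketch below. All names here (TRANSPORTS, pick_transport, cache_path, media_tag) are hypothetical, and the fallback to gzip+base64 for unknown backends is an assumption rather than something the flow specifies.

```python
import uuid
from pathlib import Path, PurePosixPath

CACHE_DIR = Path.home() / ".hermes" / "file_cache"

# Transport per backend, mirroring the flow steps (labels are illustrative).
TRANSPORTS = {
    "local": "direct",
    "docker": "docker_archive",
    "ssh": "scp",
    "modal": "gzip_base64",
    "singularity": "gzip_base64",
}


def pick_transport(backend: str) -> str:
    # Assumption: unknown backends use the universal exec-based path.
    return TRANSPORTS.get(backend, "gzip_base64")


def cache_path(sandbox_path: str) -> Path:
    """Host path of the form ~/.hermes/file_cache/<uuid>_<filename>."""
    filename = PurePosixPath(sandbox_path).name  # sandbox paths are POSIX-style
    return CACHE_DIR / f"{uuid.uuid4().hex}_{filename}"


def media_tag(host_path: Path) -> str:
    """Tag the gateway scans for in the agent's response text."""
    return f"MEDIA:{host_path}"
```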
User Upload → Sandbox (Receive Side)
- Gateway downloads user's file attachment → ~/.hermes/file_cache/
- Inject into sandbox via reverse of send_file transport
- Add context to conversation: [User uploaded: report.csv (42 KB) at /workspace/uploads/report.csv]
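A small sketch of how the context line could be generated. The helper names and the default upload directory argument are hypothetical; the output format follows the example in the list above. The filename is reduced to its basename so a crafted attachment name cannot point outside the upload directory.

```python
from pathlib import PurePosixPath


def human_size(n: float) -> str:
    """Format a byte count with binary units, rounded to whole numbers."""
    for unit in ("B", "KB", "MB", "GB"):
        if n < 1024 or unit == "GB":
            return f"{n:.0f} {unit}"
        n /= 1024


def upload_notice(filename: str, size_bytes: int,
                  upload_dir: str = "/workspace/uploads") -> str:
    """Build the line injected into the conversation after an upload."""
    safe_name = PurePosixPath(filename).name  # drop any directory components
    return (f"[User uploaded: {safe_name} ({human_size(size_bytes)}) "
            f"at {upload_dir}/{safe_name}]")
```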
Tiered Transfer Strategy
| Tier | Size | Strategy |
| --- | --- | --- |
| 1 | < 1 MB | gzip + base64 over exec (any backend) |
| 2 | 1-25 MB | Docker: get_archive. SSH: scp. Modal: base64. |
| 3 | 25-50 MB | Extract to host → send via platform API |
| 4 | > 50 MB | Extract to host → upload to temp storage → send presigned URL |
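The tier boundaries can be expressed as a single function. This is a sketch under two assumptions the table leaves open: a file of exactly 25 MB or 50 MB falls into the lower tier, and 1 MB means 1024 × 1024 bytes. The name transfer_tier is hypothetical.

```python
MB = 1024 * 1024


def transfer_tier(size_bytes: int) -> int:
    """Map a file size to tier 1-4 of the transfer strategy."""
    if size_bytes < 1 * MB:
        return 1  # gzip + base64 over exec
    if size_bytes <= 25 * MB:
        return 2  # backend-native transfer (get_archive, scp, ...)
    if size_bytes <= 50 * MB:
        return 3  # extract to host, send via platform API
    return 4      # extract to host, upload to temp storage, send URL
```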
Security
- File cache auto-cleanup (TTL-based, default 1 hour)
- Path traversal validation (realpath + prefix check)
- Configurable max file size (default 50 MB)
- MIME type detection via magic bytes (not extension)
- UUID-based cache filenames (original name in metadata only)
- Content-addressed storage (SHA-256) for dedup + integrity
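Two of these checks, sketched with the standard library only. The function names are hypothetical, and the signature table is a deliberately tiny sample rather than a full MIME database. Note that realpath resolves symlinks on the machine where the code runs, so for sandbox paths the equivalent check would have to run inside the sandbox.

```python
import os


def is_inside(root: str, candidate: str) -> bool:
    """realpath + prefix check: True only if candidate resolves under root."""
    root_real = os.path.realpath(root)
    cand_real = os.path.realpath(os.path.join(root_real, candidate))
    return os.path.commonpath([root_real, cand_real]) == root_real


# Leading bytes of a few common formats (illustrative subset).
MAGIC = (
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF8", "image/gif"),
    (b"PK\x03\x04", "application/zip"),
)


def sniff_mime(head: bytes) -> str:
    """Guess a MIME type from the first bytes of a file, ignoring its extension."""
    for magic, mime in MAGIC:
        if head.startswith(magic):
            return mime
    return "application/octet-stream"
```

Using commonpath instead of a plain string startswith avoids treating /workspace-evil as being inside /workspace.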
Implementation Plan
Phase 0: Platform Adapter Gaps (PREREQUISITE) — ~2h
Fix the #1 blocker. Add missing methods to platform adapters:
Files to modify:
- gateway/platforms/telegram.py: Add send_document() via bot.send_document()
- gateway/platforms/discord.py: Add send_document() via discord.File()
- gateway/platforms/slack.py: Add send_document() via files_upload_v2()
- Add send_image_file() and send_video() overrides to all three
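As a rough illustration of the Telegram case, the sketch below assumes python-telegram-bot v20 or later, where Bot.send_document() is a coroutine that accepts chat_id, document, filename and caption. It is written as a standalone function; the real method signature on BasePlatformAdapter is not shown in this issue and may differ.

```python
from pathlib import Path


async def send_document(bot, chat_id, path, caption=None):
    """Upload a host-local file through a bot exposing an async send_document()."""
    file_path = Path(path)
    with file_path.open("rb") as fh:
        # The file handle stays open until the upload coroutine has finished.
        return await bot.send_document(
            chat_id=chat_id,
            document=fh,
            filename=file_path.name,
            caption=caption,
        )
```

The Discord and Slack versions would follow the same shape, wrapping the path in discord.File() or passing it to files_upload_v2() respectively.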
Phase 1: send_file Tool (MVP) — ~4h
Files to create:
- tools/send_file_tool.py (~150 lines): The tool itself
Files to modify:
- tools/environments/base.py: Add download_file() to BaseEnvironment
- tools/environments/docker.py: Implement via Docker Archive API
- tools/environments/ssh.py: Implement via SCP/SFTP
- model_tools.py: Register the new tool
- toolsets.py: Add to appropriate toolsets (file toolset)
Phase 2: User Upload → Sandbox — ~4h
Files to modify:
- gateway/run.py: Detect user file attachments, download to cache
- tools/environments/base.py: Add upload_file() to BaseEnvironment
- Conversation injection: Add file context to user message
Phase 3: Large File Handling — ~4h (optional)
- Content-addressed file cache (~/.hermes/file_cache/{hash[:2]}/{hash})
- S3/MinIO presigned URL generation for oversized files
- Auto-cleanup daemon with configurable TTL
- Telegram file_id caching for dedup on re-sends
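A minimal sketch of the content-addressed layout, assuming SHA-256 as the hash (as named in the Security section). The function name store is hypothetical, and the whole file is read into memory for brevity, which a real implementation would likely replace with chunked hashing.

```python
import hashlib
import shutil
from pathlib import Path


def store(src: Path, cache_root: Path) -> Path:
    """Copy src to cache_root/{hash[:2]}/{hash}; identical content is stored once."""
    digest = hashlib.sha256(src.read_bytes()).hexdigest()
    dest = cache_root / digest[:2] / digest
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(src, dest)
    return dest
```

Because the path is derived from the content, re-sending the same file hits the existing entry, and the hash doubles as an integrity check after transfer.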
Estimated Effort
| Phase | Effort | Priority | New Code |
| --- | --- | --- | --- |
| Phase 0: Platform gaps | ~2h | PREREQUISITE | ~130 lines |
| Phase 1: send_file tool | ~4h | HIGH | ~150 lines |
| Phase 2: User uploads | ~4h | MEDIUM | ~100 lines |
| Phase 3: Large files | ~4h | LOW | ~150 lines |
| Total | ~14h | | ~530 lines |
Phases 0+1 deliver the core value. Phase 2 completes bidirectional flow. Phase 3 handles edge cases.
Design Doc
Full design document with backend-specific implementation notes: plans/file-transfer.md
Research Sources
Cloud IDEs: Gitpod, Coder, code-server, JetBrains Gateway, VS Code Remote
Notebooks: Jupyter, Colab, Kaggle, IPython FileLink
Containers: Docker Archive API, kubectl cp, Podman
CI/CD: GitHub Actions artifacts, GitLab CI, Jenkins stash/unstash
AI APIs: OpenAI Responses API, Anthropic Files API, Gemini File API, E2B
Novel tools: transfer.sh, croc, magic-wormhole, Kitty protocol, Mutagen, rclone
Protocols: tus (resumable uploads), WebDAV, 9P, ZMODEM, WebContainers
Agent codebases: OpenHands, Composio, Pi-Mono, Codex, Cline, OpenCode