You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Users on messenger platforms (Telegram, Discord, WhatsApp, Slack) currently have limited options for getting files to Hermes Agent. Small files can be sent directly as platform attachments (20MB Telegram limit, 25MB Discord), but these land in ephemeral cache directories that auto-clean after 24 hours. There's no way to upload large files, bulk upload multiple documents, or send files from a device that isn't running the messenger client.
This proposal introduces a /upload command that spins up an ephemeral upload website via a tunneled connection, gives the user a URL, and accepts file uploads through a browser. The tunnel auto-shuts down after a configurable timeout (default 5 minutes). This complements the workspace feature (see companion issue) by providing a gateway for remote users to populate their workspace.
The approach: run a lightweight HTTP upload server on localhost, expose it through a zero-config tunnel (Cloudflare Quick Tunnel — free, no account needed), send the URL to the user, accept uploads, save to workspace, tear down. Simple, ephemeral, secure.
No existing AI agent platform offers this workflow. Most assume local filesystem access or rely on platform-native attachment limits. This would be a genuinely novel capability.
Research Findings
Current File Handling in Hermes Agent
Incoming files from platforms (already works):
Telegram: bot.get_file() → download to document_cache/. 20MB hard limit via Bot API. Text files (.md, .txt ≤100KB) have content injected into message text.
Discord: attachment.save() → download from CDN. 25MB limit (50-100MB for boosted servers).
WhatsApp: Two-step Graph API download (GET media URL → download with auth token).
Slack: files.info → download with auth header. Three-step upload process for sending.
Limitations:
All files go to ephemeral cache dirs (UUID-prefixed, flat, 24h auto-cleanup)
No bulk upload capability
No cross-device upload (can't send from laptop to Telegram bot on phone)
No way to bypass platform size limits
No persistent file organization
Tunneling Solutions Evaluated
Solution
Config Required
Account Needed
HTTPS
Max Upload
Install
Cost
cloudflared
Zero
No
✓
100MB through CF
binary
Free
localtunnel
Zero
No
✓
Unlimited
npm/npx
Free
bore
Zero
No
✗ (TCP only)
Unlimited
cargo/brew
Free
ngrok
Auth token
Yes (free tier)
✓
Unlimited
binary
Freemium
Recommendation: cloudflared Quick Tunnel as primary, localtunnel as fallback.
Node.js dependency (already needed for WhatsApp bridge)
Self-hostable server available
Less reliable than cloudflared but good fallback
How Other Platforms Handle File Upload
No other AI agent/bot framework offers tunnel-based file upload. The closest patterns:
ChatGPT: Direct file upload in web UI (drag & drop), stored server-side
Telegram Bot API: Platform-native attachment handling (20MB limit, no workaround without custom server)
Discord bots: CDN-hosted attachments, downloaded by bot
Ephemeral file sharing services: transfer.sh (defunct), 0x0.st, file.io — upload via curl, get download URL. These are intermediaries, not tunnel-based.
The tunnel approach is novel: instead of routing files through a third-party service, we expose the agent's own upload endpoint directly. Files never leave the user→agent path.
Current State in Hermes Agent
What we have:
Platform adapters already download user-sent files (cache_document_from_bytes() in base.py)
MEDIA:/path protocol for sending files back to users
This is a core codebase change + bundled skill hybrid:
The /upload command and tunnel management are core (need gateway command integration)
The upload web server could be a standalone Python module
The upload page HTML/JS is a static asset
cloudflared detection and fallback logic needs to be in the core
Upload Flow
User: /upload (or /upload 30 for 30-minute window)
│
├─ Agent starts HTTP upload server on random localhost port
├─ Agent launches cloudflared quick tunnel → gets public URL
├─ Agent generates single-use upload token
├─ Agent sends URL + token to user:
│ "📤 Upload files here (link expires in 5 min):
│ https://random-words.trycloudflare.com/u/abc123
│ Max 100MB per file. Drag & drop or click to browse."
│
├─ User opens URL in browser
│ ├─ Clean upload page with drag-and-drop zone
│ ├─ File type indicators (📄 PDF, 📊 Excel, etc.)
│ ├─ Upload progress bar
│ └─ "Upload complete!" confirmation
│
├─ Files saved to ~/.hermes/workspace/uploads/
├─ Agent confirms in chat:
│ "✅ Received 3 files:
│ - quarterly-report.pdf (2.4 MB)
│ - meeting-notes.md (12 KB)
│ - data.csv (847 KB)
│ Saved to workspace. I can read these now!"
│
└─ Tunnel + server shut down (timeout or /upload stop)
What We'd Need
Upload HTTP server (tools/upload_server.py or gateway/upload.py) — lightweight async HTTP server (aiohttp or built-in http.server) with multipart file upload handling
Upload web page — single-page HTML/JS with drag-and-drop, progress bars, file type validation. Embedded as a string constant or bundled asset.
Tunnel manager — detect cloudflared availability, launch quick tunnel, parse URL from stdout, manage lifecycle, fallback to localtunnel
/upload command — gateway slash command with optional timeout argument
Upload token system — single-use or time-limited tokens for auth (prevent random internet users from uploading)
File validation — size limits, type allowlists, malware scanning (ClamAV if available)
Workspace integration — save uploaded files to ~/.hermes/workspace/uploads/, update manifest
Security Model
1. Upload token: random 32-char token, single-use or time-limited
2. URL structure: https://tunnel-url/u/{token} — token required for access
3. Size limit: 100MB per file (configurable), 500MB total per session
4. File type allowlist: documents, code, data, images, archives
Blocked: executables (.exe, .sh, .bat), system files
5. Rate limiting: max 20 files per upload session
6. Auto-shutdown: tunnel killed after timeout (default 5 min)
7. No directory traversal: filenames sanitized, saved to fixed directory
8. HTTPS only: cloudflared provides TLS (never expose plain HTTP)
Phased Rollout
Phase 1: Basic Upload with Cloudflared
/upload [minutes] command in gateway (all platforms) and CLI
Minimal HTTP upload server (Python stdlib http.server + multipart parsing)
Simple but functional upload page (HTML form + JS progress)
cloudflared dependency: Needs the cloudflared binary installed. Mitigated by fallback chain and clear error messages ("install cloudflared: brew install cloudflared")
Network exposure: Even with token auth, exposing a local server to the internet has inherent risk. Mitigated by short timeout, token validation, file type restrictions, size limits.
Cloudflare Quick Tunnel reliability: No SLA, intended for development. For a 5-minute upload window this is fine; for long-running sessions, less reliable.
Firewall/NAT issues: Some corporate networks block outbound tunnel connections. Fallback to direct platform upload for these cases.
Complexity: Adding an HTTP server + tunnel management + upload UI is non-trivial. Phase 1 should be minimal to prove the concept.
No upload from CLI: CLI users don't need tunnels — they have direct filesystem access. The /upload command should detect CLI and just print the workspace path.
Open Questions
cloudflared vs localtunnel as primary: cloudflared is more reliable and faster but requires a binary install. localtunnel is npm-based (already a dependency for WhatsApp). Recommendation: cloudflared primary, localtunnel fallback.
Upload page hosting: Embed HTML as a Python string constant, or serve from a static file? Recommendation: embed as constant for zero-dependency deployment, with option to override via ~/.hermes/upload-page.html.
Automatic vs manual tunnel: Should the agent auto-detect "user wants to send me a file" and offer upload? Or only on explicit /upload command? Recommendation: explicit command only in Phase 1; smart detection ("I have a PDF to share") in Phase 3.
CLI behavior: What should /upload do in CLI mode? Options: (a) print workspace path, (b) open file picker dialog, (c) start local server without tunnel (for LAN access). Recommendation: print workspace path + optional local server for remote CLI sessions.
Overview
Users on messenger platforms (Telegram, Discord, WhatsApp, Slack) currently have limited options for getting files to Hermes Agent. Small files can be sent directly as platform attachments (20MB Telegram limit, 25MB Discord), but these land in ephemeral cache directories that auto-clean after 24 hours. There's no way to upload large files, bulk upload multiple documents, or send files from a device that isn't running the messenger client.
This proposal introduces a
/uploadcommand that spins up an ephemeral upload website via a tunneled connection, gives the user a URL, and accepts file uploads through a browser. The tunnel auto-shuts down after a configurable timeout (default 5 minutes). This complements the workspace feature (see companion issue) by providing a gateway for remote users to populate their workspace.The approach: run a lightweight HTTP upload server on localhost, expose it through a zero-config tunnel (Cloudflare Quick Tunnel — free, no account needed), send the URL to the user, accept uploads, save to workspace, tear down. Simple, ephemeral, secure.
No existing AI agent platform offers this workflow. Most assume local filesystem access or rely on platform-native attachment limits. This would be a genuinely novel capability.
Research Findings
Current File Handling in Hermes Agent
Incoming files from platforms (already works):
bot.get_file()→ download todocument_cache/. 20MB hard limit via Bot API. Text files (.md,.txt≤100KB) have content injected into message text.attachment.save()→ download from CDN. 25MB limit (50-100MB for boosted servers).files.info→ download with auth header. Three-step upload process for sending.Limitations:
Tunneling Solutions Evaluated
Recommendation: cloudflared Quick Tunnel as primary, localtunnel as fallback.
Cloudflare Quick Tunnel details:
localtunnel details:
npx localtunnel --port 8080 # Output: https://random.loca.ltHow Other Platforms Handle File Upload
No other AI agent/bot framework offers tunnel-based file upload. The closest patterns:
The tunnel approach is novel: instead of routing files through a third-party service, we expose the agent's own upload endpoint directly. Files never leave the user→agent path.
Current State in Hermes Agent
What we have:
cache_document_from_bytes()inbase.py)MEDIA:/pathprotocol for sending files back to userssend_filetool for agent→user file transfer + upload via presigned URLs (different approach, complementary)What we don't have:
/uploadcommandImplementation Plan
Core Architecture: Classification
This is a core codebase change + bundled skill hybrid:
/uploadcommand and tunnel management are core (need gateway command integration)Upload Flow
What We'd Need
tools/upload_server.pyorgateway/upload.py) — lightweight async HTTP server (aiohttp or built-in http.server) with multipart file upload handling/uploadcommand — gateway slash command with optional timeout argument~/.hermes/workspace/uploads/, update manifestSecurity Model
Phased Rollout
Phase 1: Basic Upload with Cloudflared
/upload [minutes]command in gateway (all platforms) and CLIhttp.server+ multipart parsing)~/.hermes/workspace/uploads/(or~/.hermes/uploads/if workspace not yet implemented)/upload stopto manually tear downPhase 2: Enhanced UX + Fallback
/uploads list,/uploads delete)Phase 3: Smart Upload + Integration
Pros & Cons
Pros
Cons / Risks
cloudflaredbinary installed. Mitigated by fallback chain and clear error messages ("install cloudflared:brew install cloudflared")/uploadcommand should detect CLI and just print the workspace path.Open Questions
cloudflared vs localtunnel as primary: cloudflared is more reliable and faster but requires a binary install. localtunnel is npm-based (already a dependency for WhatsApp). Recommendation: cloudflared primary, localtunnel fallback.
Upload page hosting: Embed HTML as a Python string constant, or serve from a static file? Recommendation: embed as constant for zero-dependency deployment, with option to override via
~/.hermes/upload-page.html.Automatic vs manual tunnel: Should the agent auto-detect "user wants to send me a file" and offer upload? Or only on explicit
/uploadcommand? Recommendation: explicit command only in Phase 1; smart detection ("I have a PDF to share") in Phase 3.Integration with feat: File transfer between sandboxed environments and users (send_file tool) #466 (send_file): Issue feat: File transfer between sandboxed environments and users (send_file tool) #466 proposes presigned URLs for large file transfers. Should upload use the same mechanism? Recommendation: different mechanisms for different directions.
/uploaduses tunnels (user→agent),send_fileuses presigned URLs or platform APIs (agent→user).CLI behavior: What should
/uploaddo in CLI mode? Options: (a) print workspace path, (b) open file picker dialog, (c) start local server without tunnel (for LAN access). Recommendation: print workspace path + optional local server for remote CLI sessions.References
gateway/platforms/base.py(cache_document_from_bytes)