Inspired by the Pi coding agent (GitHub), this proposes adding an RPC (Remote Procedure Call) mode to Hermes Agent -- a JSON protocol over stdin/stdout that enables programmatic control of the agent from any language or IDE. Pi offers four integration modes (interactive, print, JSON streaming, and RPC), and the RPC mode is what enables its rich ecosystem of IDE plugins, web UIs, and embedded integrations.
Currently, Hermes Agent runs in two modes: CLI (interactive terminal) and Gateway (Telegram/Discord/WhatsApp/Slack). There's no way for external programs to programmatically start sessions, send messages, switch models, or observe agent state. An RPC mode would unlock IDE integrations (VS Code, Neovim, Emacs), custom web UIs, CI/CD agent automation, and embedding Hermes as a component in larger systems.
This also naturally enables mid-session model hot-swapping -- a frequently requested capability that Pi implements via the same RPC protocol.
Research Findings
How Pi's RPC Mode Works
Pi's RPC mode uses a bidirectional JSON Lines protocol over stdin/stdout. The host process spawns Pi with --mode rpc and communicates with it via structured messages, one JSON object per line.
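To make the framing concrete, here is a minimal sketch of JSON Lines encoding and decoding in Python. The helper names encode_message and decode_messages are hypothetical, chosen for illustration. The prompt and abort message types come from Pi's command list in this proposal, and an in-memory buffer stands in for the agent's stdin.

```python
import io
import json
from typing import Any, Iterable, Iterator


def encode_message(msg_type: str, **fields: Any) -> str:
    """Serialize one message as a single newline-terminated JSON line."""
    # json.dumps escapes embedded newlines, so one message never spans lines.
    return json.dumps({"type": msg_type, **fields}) + "\n"


def decode_messages(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed message per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Round trip through an in-memory buffer standing in for the agent's stdin.
wire = io.StringIO()
wire.write(encode_message("prompt", text="Fix the bug\nin main.py"))
wire.write(encode_message("abort"))
wire.seek(0)
received = list(decode_messages(wire))
```

Because each message occupies exactly one line, a host in any language can read the stream with an ordinary line reader and a JSON parser.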
What's Missing
No programmatic interface -- External programs can't control the agent
No streaming protocol -- Gateway platforms get complete responses; there is no progressive streaming to external clients
No model switching -- The model is set at session creation and fixed for the session
No state observability -- External systems can't query agent state (is it running? what tools is it using?)
No IDE integration path -- VS Code, Neovim, Emacs can't embed Hermes
Relevant Existing Code
cli.py -- HermesCLI class, could be adapted for RPC dispatch
run_agent.py -- AIAgent with callbacks (tool_progress, step, clarify)
gateway/run.py -- GatewayRunner with message handling, command parsing
gateway/session.py -- SessionStore for session management
Agent callbacks already exist for tool progress, clarification, and step tracking -- these map naturally to RPC events
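As a rough illustration of that mapping, the sketch below turns three callbacks into JSON Lines events. The class name RpcEventEmitter, the callback signatures, and the "clarify" value for kind are all assumptions made for this example; AIAgent's actual callback signatures may differ. The event shapes follow the Agent-to-Client events listed for Pi in this proposal.

```python
import io
import json
from typing import Any, TextIO


class RpcEventEmitter:
    """Turns agent callbacks into JSON Lines events on an output stream."""

    def __init__(self, out: TextIO) -> None:
        self.out = out

    def _emit(self, event_type: str, **fields: Any) -> None:
        self.out.write(json.dumps({"type": event_type, **fields}) + "\n")
        self.out.flush()  # hosts read line by line, so do not hold events back

    # Hypothetical callback signatures; AIAgent's real ones may differ.
    def on_tool_progress(self, name: str, args: dict) -> None:
        self._emit("tool_call", name=name, args=args)

    def on_step(self, state: str) -> None:
        self._emit("state", state=state)

    def on_clarify(self, request_id: str, question: str) -> None:
        # "clarify" is an assumed kind; the proposal only shows "confirm".
        self._emit("ui_request", kind="clarify", prompt=question, id=request_id)


# Usage: capture events in memory instead of writing to real stdout.
buf = io.StringIO()
emitter = RpcEventEmitter(buf)
emitter.on_step("running")
emitter.on_tool_progress("bash", {"command": "ls"})
events = [json.loads(line) for line in buf.getvalue().splitlines()]
```

Keeping the emitter separate from the agent means the CLI and Gateway paths would not need to change; only the RPC entry point would wire these callbacks in.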
Implementation Plan
Skill vs. Tool Classification
This is a core codebase change -- it adds a new runtime mode alongside CLI and Gateway, and it touches agent startup, message flow, and output handling at a fundamental level. It cannot be expressed as a skill or tool.
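To give a sense of what the new runtime mode involves, here is a simplified sketch of a command loop: read one JSON command per line, dispatch on its type, and write an event back. The function run_rpc_loop and the handler table are hypothetical, and the one-reply-per-command shape is a simplification, since a real runner would stream several events per prompt.

```python
import io
import json
from typing import Callable, Dict, TextIO

Handler = Callable[[dict], dict]


def run_rpc_loop(stdin: TextIO, stdout: TextIO, handlers: Dict[str, Handler]) -> None:
    """Read one JSON command per line, dispatch on "type", write one event back."""
    for line in stdin:
        if not line.strip():
            continue  # ignore blank lines between commands
        try:
            command = json.loads(line)
        except json.JSONDecodeError as exc:
            event = {"type": "error", "message": f"invalid JSON: {exc.msg}"}
        else:
            msg_type = command.get("type") if isinstance(command, dict) else None
            handler = handlers.get(msg_type) if isinstance(msg_type, str) else None
            if handler is None:
                event = {"type": "error", "message": f"unknown command: {msg_type!r}"}
            else:
                event = handler(command)
        stdout.write(json.dumps(event) + "\n")
        stdout.flush()


# Usage with in-memory streams and a single illustrative handler.
handlers = {"get_state": lambda cmd: {"type": "state", "state": "idle"}}
out = io.StringIO()
run_rpc_loop(io.StringIO('{"type": "get_state"}\nnot json\n{"type": "nope"}\n'), out, handlers)
replies = [json.loads(line) for line in out.getvalue().splitlines()]
```

Malformed or unknown commands produce error events rather than terminating the loop, so a misbehaving host cannot crash the session with a single bad line.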
Python SDK: HermesClient class wrapping subprocess + RPC protocol
from hermes import HermesClient
agent = HermesClient(model="anthropic/claude-sonnet-4-20250514")
for event in agent.prompt("Fix the bug in main.py"):
    if event.type == "message_delta":
        print(event.content, end="")
VS Code extension skeleton (TypeScript, spawns hermes RPC)
Neovim plugin skeleton (Lua, spawns hermes RPC)
HTTP/WebSocket adapter -- wrap RPC protocol for web UIs
Documentation: protocol spec, integration guide, example clients
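The Python SDK item could start from something like the following sketch, which wraps a subprocess and speaks JSON Lines to it. This is an illustration under assumptions, not the proposed API. The constructor takes an argv list (rather than model=...) so the example can run against a small stand-in agent, events are plain dicts, and the stub's behavior is invented for the demo.

```python
import json
import subprocess
import sys
from typing import Iterator, List

# Stand-in agent for demonstration: answers every command line with two deltas
# and a message_end. A real client would launch the hermes RPC entry point.
STUB_AGENT = (
    "import sys, json\n"
    "for line in sys.stdin:\n"
    "    for ev in ({'type': 'message_delta', 'content': 'Hel'},\n"
    "               {'type': 'message_delta', 'content': 'lo'},\n"
    "               {'type': 'message_end'}):\n"
    "        print(json.dumps(ev), flush=True)\n"
)


class HermesClient:
    """Hypothetical client: spawns an agent process and speaks JSON Lines to it."""

    def __init__(self, argv: List[str]) -> None:
        self.proc = subprocess.Popen(
            argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True, bufsize=1
        )

    def prompt(self, text: str) -> Iterator[dict]:
        """Send a prompt command and yield events up to and including message_end."""
        self.proc.stdin.write(json.dumps({"type": "prompt", "text": text}) + "\n")
        self.proc.stdin.flush()
        for line in self.proc.stdout:
            event = json.loads(line)
            yield event
            if event["type"] == "message_end":
                break

    def close(self) -> None:
        """Close the pipes and wait for the agent process to exit."""
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()


client = HermesClient([sys.executable, "-c", STUB_AGENT])
reply = "".join(
    e["content"] for e in client.prompt("Fix the bug in main.py") if e["type"] == "message_delta"
)
client.close()
```

Exposing prompt as a generator lets callers render partial output as it arrives, which is the same consumption pattern as the SDK snippet above.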
Pros & Cons
Pros
IDE integration -- VS Code, Neovim, Emacs, JetBrains can embed Hermes as a coding assistant
Custom UIs -- Anyone can build a web UI, desktop app, or mobile app around Hermes
CI/CD automation -- Programmatically run agent tasks in pipelines with structured output
Model hot-swapping -- Switch models mid-session based on task complexity (a cheap model for simple tasks, a strong one for hard tasks)
Composability -- Other agents/systems can use Hermes as a component via RPC
Language agnostic -- JSON Lines works from any language, not just Python
Leverages existing code -- AIAgent callbacks already provide the right abstraction layer
Cons / Risks
Protocol maintenance -- RPC protocol becomes a public API contract; breaking changes are costly
Complexity -- Third runtime mode adds surface area for bugs and testing
Streaming complexity -- Proper streaming with abort/steer requires careful state management
Security -- RPC mode inherits the spawning process's permissions; need clear documentation on trust boundaries
Model switching complexity -- Hot-swapping models mid-session may cause inconsistencies (different tokenizers, different tool calling formats)
Open Questions
Should RPC mode support authentication, or assume the spawning process is trusted?
Should the protocol support binary data (images, audio) or only text/JSON?
Should RPC mode be a separate entry point (hermes rpc) or a flag (hermes --mode rpc)?
How should model switching interact with prompt caching (cache invalidation on model change)?
Should we support WebSocket transport in addition to stdin/stdout for web UI use cases?
Should the Python SDK be a separate package or part of the main hermes-agent install?
Research Findings
How Pi's RPC Mode Works
Pi's RPC mode uses a bidirectional JSON Lines protocol over stdin/stdout. The host process spawns Pi with --mode rpc and communicates via structured messages:
Client-to-Agent Commands:
{"type": "prompt", "text": "Fix the bug in main.py"}
{"type": "steer", "text": "Actually, try a different approach"}
{"type": "follow_up", "text": "Now add tests"}
{"type": "abort"}
{"type": "set_model", "provider": "anthropic", "model": "claude-sonnet-4-20250514"}
{"type": "compact"}
{"type": "get_state"}
{"type": "get_messages", "from": 5}
{"type": "switch_session", "path": "path/to/session.jsonl"}
{"type": "fork"}
Agent-to-Client Events:
{"type": "state", "state": "idle|running|waiting_for_input"}
{"type": "message_start", "role": "assistant"}
{"type": "message_delta", "content": "partial text..."}
{"type": "message_end"}
{"type": "tool_call", "name": "bash", "args": {"command": "ls"}}
{"type": "tool_result", "name": "bash", "output": "file1.py..."}
{"type": "error", "message": "..."}
{"type": "compact_done", "summary": "..."}
Extension UI Forwarding:
When extensions request UI (dialogs, confirmations), the RPC protocol forwards these to the host:
{"type": "ui_request", "kind": "confirm", "prompt": "Delete this file?", "id": "req_123"}
// Host responds:
{"type": "ui_response", "id": "req_123", "value": true}
Key Design Decisions
The set_model command enables mid-session model switching without restarting.
Real-World RPC Integrations
Pi's RPC mode has enabled:
Current State in Hermes Agent
Existing Integration Modes
CLI (cli.py, HermesCLI) -- Interactive terminal with readline, streaming output, slash commands
Gateway (gateway/run.py, GatewayRunner) -- Platform adapters for Telegram, Discord, WhatsApp, Slack
How Messages Flow Currently
What We'd Need
hermes --mode rpc or hermes rpc
Phased Rollout
Phase 1: Core RPC Protocol
RpcRunner class that reads commands from stdin, dispatches to AIAgent, emits events to stdout
--mode rpc flag to CLI entry point
Phase 2: Advanced Control
steer command -- queue message to deliver after current tool (relates to message coalescing, Feature: Message Coalescing for Gateway Platforms #345)
follow_up command -- queue message for after agent completes
set_model command -- mid-session model switching
/model slash command for interactive mode too
compact command -- trigger context compression
get_sessions / switch_session commands for session management
ui_request events
Phase 3: SDK & Ecosystem
Python SDK: HermesClient class wrapping subprocess + RPC protocol