Feature: RPC Mode for Programmatic Integration & Model Hot-Swapping (inspired by Pi) #360

## Overview

Inspired by the [Pi coding agent](https://pi.dev/) ([GitHub](https://github.com/badlogic/pi-mono)), this issue proposes adding an RPC (Remote Procedure Call) mode to Hermes Agent -- a JSON protocol over stdin/stdout that enables programmatic control of the agent from any language or IDE. Pi offers four integration modes (interactive, print, JSON streaming, and RPC), and the RPC mode is what enables its rich ecosystem of IDE plugins, web UIs, and embedded integrations.

Currently, Hermes Agent runs in two modes: CLI (interactive terminal) and Gateway (Telegram/Discord/WhatsApp/Slack). There's no way for external programs to programmatically start sessions, send messages, switch models, or observe agent state. An RPC mode would unlock IDE integrations (VS Code, Neovim, Emacs), custom web UIs, CI/CD agent automation, and embedding Hermes as a component in larger systems.

This also naturally enables **mid-session model hot-swapping** -- a frequently requested capability that Pi implements via the same RPC protocol.

---

## Research Findings

### How Pi's RPC Mode Works

Pi's RPC mode uses a bidirectional JSON Lines protocol over stdin/stdout. The host process spawns Pi with `--mode rpc` and exchanges structured messages with it:

**Client-to-Agent Commands:**
```json
{"type": "prompt", "text": "Fix the bug in main.py"}
{"type": "steer", "text": "Actually, try a different approach"}
{"type": "follow_up", "text": "Now add tests"}
{"type": "abort"}
{"type": "set_model", "provider": "anthropic", "model": "claude-sonnet-4-20250514"}
{"type": "compact"}
{"type": "get_state"}
{"type": "get_messages", "from": 5}
{"type": "switch_session", "path": "path/to/session.jsonl"}
{"type": "fork"}
```

**Agent-to-Client Events:**
```json
{"type": "state", "state": "idle|running|waiting_for_input"}
{"type": "message_start", "role": "assistant"}
{"type": "message_delta", "content": "partial text..."}
{"type": "message_end"}
{"type": "tool_call", "name": "bash", "args": {"command": "ls"}}
{"type": "tool_result", "name": "bash", "output": "file1.py..."}
{"type": "error", "message": "..."}
{"type": "compact_done", "summary": "..."}
```

**Extension UI Forwarding:**
When extensions request UI (dialogs, confirmations), the RPC protocol forwards these to the host:
```json
{"type": "ui_request", "kind": "confirm", "prompt": "Delete this file?", "id": "req_123"}
// Host responds:
{"type": "ui_response", "id": "req_123", "value": true}
```
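
As a rough illustration of the wire format above, the sketch below shows how a host might frame commands and answer a forwarded UI request. It uses in-memory streams in place of a real Pi process. The helper names (`send_command`, `read_events`, `answer_ui_requests`) are hypothetical, and the message shapes are copied from the examples in this issue rather than from Pi's source.

```python
import io
import json
from typing import Callable, Iterator, TextIO


def send_command(stream: TextIO, command: dict) -> None:
    """Write one command as a single JSON line."""
    stream.write(json.dumps(command, separators=(",", ":")) + "\n")
    stream.flush()


def read_events(stream: TextIO) -> Iterator[dict]:
    """Yield one decoded event per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def answer_ui_requests(agent_out: TextIO, agent_in: TextIO,
                       decide: Callable[[dict], object]) -> list[dict]:
    """Reply to every ui_request; return all other events untouched."""
    other = []
    for event in read_events(agent_out):
        if event.get("type") == "ui_request":
            # The reply echoes the request id so the agent can match it up.
            send_command(agent_in, {"type": "ui_response",
                                    "id": event["id"],
                                    "value": decide(event)})
        else:
            other.append(event)
    return other


# Simulated agent output (stands in for the child process's stdout).
agent_stdout = io.StringIO(
    '{"type": "state", "state": "running"}\n'
    '{"type": "ui_request", "kind": "confirm", "prompt": "Delete this file?", "id": "req_123"}\n'
)
agent_stdin = io.StringIO()  # stands in for the child process's stdin

# This host policy declines every confirmation.
remaining = answer_ui_requests(agent_stdout, agent_stdin, decide=lambda req: False)
```

With a real agent, the two `StringIO` objects would be replaced by the `stdout` and `stdin` pipes of a `subprocess.Popen` child opened in text mode.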

### Key Design Decisions

1. **JSON Lines protocol** -- One JSON object per line, simple to parse in any language. The newline is the only framing, so there are no length prefixes and no binary protocol complexity.
2. **Streaming events** -- Message content streams as deltas, matching the LLM streaming pattern. Clients can render progressively.
3. **Bidirectional** -- Not just request/response. The agent can initiate UI requests, and the client can steer mid-generation.
4. **State machine** -- Agent reports state transitions (idle/running/waiting), enabling proper UI state management.
5. **Model control** -- `set_model` command enables mid-session model switching without restarting.
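
To make the streaming and state-reporting decisions concrete, here is a minimal sketch of how a client might fold the event stream into UI state. `ClientView` is a hypothetical name, and the event shapes follow the examples listed earlier in this issue. Ignoring unknown event types is an assumption made here for forward compatibility, not something Pi is known to require.

```python
from dataclasses import dataclass, field


@dataclass
class ClientView:
    """What a UI might track while consuming agent events."""
    state: str = "idle"
    current: list[str] = field(default_factory=list)   # deltas of the in-progress message
    messages: list[str] = field(default_factory=list)  # completed assistant messages

    def apply(self, event: dict) -> None:
        kind = event.get("type")
        if kind == "state":
            self.state = event["state"]
        elif kind == "message_start":
            self.current = []
        elif kind == "message_delta":
            # A real UI would also re-render the partial text here.
            self.current.append(event["content"])
        elif kind == "message_end":
            self.messages.append("".join(self.current))
            self.current = []
        # Any other event type is ignored in this sketch.


view = ClientView()
for ev in [
    {"type": "state", "state": "running"},
    {"type": "message_start", "role": "assistant"},
    {"type": "message_delta", "content": "Hel"},
    {"type": "message_delta", "content": "lo"},
    {"type": "message_end"},
    {"type": "state", "state": "idle"},
]:
    view.apply(ev)
```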

### Real-World RPC Integrations

Pi's RPC mode has enabled:
- **[pi-coding-agent (Emacs)](https://github.com/dnouri/pi-coding-agent)** -- Emacs package that populates Markdown buffers for chat
- **[OpenClaw/clawdbot](https://github.com/clawdbot/clawdbot)** -- Discord bot embedding Pi via SDK
- **Custom web UIs** -- Several community projects wrapping Pi in web interfaces
- **Pz (Zig port)** -- Uses RPC for language-agnostic integration

---

## Current State in Hermes Agent

### Existing Integration Modes

1. **CLI mode** (`cli.py`, `HermesCLI`) -- Interactive terminal with readline, streaming output, slash commands
2. **Gateway mode** (`gateway/run.py`, `GatewayRunner`) -- Platform adapters for Telegram, Discord, WhatsApp, Slack

### How Messages Flow Currently

```
User input -> Platform adapter -> GatewayRunner.handle_message()
           -> SessionStore.get_or_create_session()
           -> AIAgent.run_conversation()
           -> Tool calls loop
           -> Response -> Platform adapter -> User
```

### What's Missing

- **No programmatic interface** -- External programs can't control the agent
- **No streaming protocol** -- Gateway platforms get complete responses; no progressive streaming to external clients
- **No model switching** -- Model is set at session creation, fixed for the session
- **No state observability** -- External systems can't query agent state (is it running? what tools is it using?)
- **No IDE integration path** -- VS Code, Neovim, Emacs can't embed Hermes

### Relevant Existing Code

- `cli.py` -- `HermesCLI` class, could be adapted for RPC dispatch
- `run_agent.py` -- `AIAgent` with callbacks (tool_progress, step, clarify)
- `gateway/run.py` -- `GatewayRunner` with message handling, command parsing
- `gateway/session.py` -- `SessionStore` for session management
- Agent callbacks already exist for tool progress, clarification, and step tracking -- these map naturally to RPC events
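
A sketch of how those callbacks could map onto RPC events follows. The callback signatures are assumptions: this issue does not show the real signatures in `run_agent.py`, so `on_tool_progress`, `on_step`, and `on_clarify` are illustrative names only, and the `step` event and the `"input"` request kind are hypothetical additions to the event list.

```python
import io
import json
import sys
from typing import Any, TextIO


class RpcEventEmitter:
    """Turns agent callbacks into JSON Lines events on an output stream."""

    def __init__(self, out: TextIO = sys.stdout) -> None:
        self.out = out

    def emit(self, event_type: str, **fields: Any) -> None:
        self.out.write(json.dumps({"type": event_type, **fields}) + "\n")
        self.out.flush()  # flush per event so the host sees it immediately

    # Assumed callback shapes; adjust to match AIAgent's real signatures.
    def on_tool_progress(self, name: str, args: dict) -> None:
        self.emit("tool_call", name=name, args=args)

    def on_step(self, step: int) -> None:
        self.emit("step", index=step)  # hypothetical event type

    def on_clarify(self, question: str, request_id: str) -> None:
        self.emit("ui_request", kind="input", prompt=question, id=request_id)


# Demo against an in-memory stream instead of the process's stdout.
buffer = io.StringIO()
emitter = RpcEventEmitter(buffer)
emitter.on_tool_progress("bash", {"command": "ls"})
emitter.on_clarify("Which file did you mean?", "req_1")
emitted = [json.loads(line) for line in buffer.getvalue().splitlines()]
```

In RPC mode, anything else the agent prints to stdout would corrupt the stream, so logs and diagnostics would presumably need to go to stderr.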

---

## Implementation Plan

### Skill vs. Tool Classification

This is a **core codebase change** -- it adds a new runtime mode alongside CLI and Gateway. It touches the agent startup, message flow, and output handling at a fundamental level, so it cannot be expressed as a skill or tool.

### What We'd Need

- JSON Lines protocol specification (commands + events)
- RPC runner that wraps AIAgent with stdin/stdout protocol
- Streaming event emission during agent execution
- State machine for agent lifecycle (idle/running/waiting)
- Model hot-swapping support in AIAgent
- CLI flag: `hermes --mode rpc` or `hermes rpc`
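
The lifecycle state machine from the list above could look roughly like this. The transition table is an assumption, since this proposal only names the states. `AgentState`, `ALLOWED`, and `Lifecycle` are hypothetical names.

```python
from enum import Enum


class AgentState(str, Enum):
    IDLE = "idle"
    RUNNING = "running"
    WAITING = "waiting_for_input"


# Assumed transition table; the proposal only names the states.
ALLOWED = {
    AgentState.IDLE: {AgentState.RUNNING},
    AgentState.RUNNING: {AgentState.IDLE, AgentState.WAITING},
    AgentState.WAITING: {AgentState.RUNNING, AgentState.IDLE},
}


class Lifecycle:
    """Tracks the agent state and reports each change as a `state` event."""

    def __init__(self, on_change=None) -> None:
        self.state = AgentState.IDLE
        self.on_change = on_change or (lambda event: None)

    def transition(self, new_state: AgentState) -> None:
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"illegal transition {self.state.value} -> {new_state.value}"
            )
        self.state = new_state
        self.on_change({"type": "state", "state": new_state.value})


# Demo: collect the emitted state events in a list.
state_events = []
lifecycle = Lifecycle(on_change=state_events.append)
lifecycle.transition(AgentState.RUNNING)
lifecycle.transition(AgentState.WAITING)   # e.g. a clarify question is pending
lifecycle.transition(AgentState.RUNNING)
lifecycle.transition(AgentState.IDLE)
```

Rejecting illegal transitions early makes protocol bugs, such as a second `prompt` arriving while a turn is running, surface as explicit errors rather than as a confused client UI.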

### Phased Rollout

**Phase 1: Core RPC Protocol**
- Define JSON Lines protocol specification (commands: prompt, abort, get_state, get_messages; events: state, message_delta, tool_call, tool_result, error, done)
- Implement `RpcRunner` class that reads commands from stdin, dispatches to AIAgent, emits events to stdout
- Wire AIAgent callbacks (tool_progress, step) to RPC event emission
- Add `--mode rpc` flag to CLI entry point
- Basic state machine: idle -> running -> idle
- Integration test: Python subprocess spawning hermes in RPC mode
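
A minimal sketch of what the Phase 1 `RpcRunner` loop might look like is shown below. `EchoAgent` is a stand-in for `AIAgent`, whose real constructor and `run_conversation()` signature are not shown in this issue. The `on_delta` parameter and the `messages` reply event are assumptions made for illustration.

```python
import io
import json
from typing import Callable, TextIO


class EchoAgent:
    """Stand-in for AIAgent; the real class is not modeled here."""

    def __init__(self) -> None:
        self.messages: list[dict] = []

    def run_conversation(self, text: str, on_delta: Callable[[str], None]) -> str:
        self.messages.append({"role": "user", "content": text})
        reply = f"echo: {text}"
        on_delta(reply)  # a real agent would stream many small chunks
        self.messages.append({"role": "assistant", "content": reply})
        return reply


class RpcRunner:
    def __init__(self, agent, stdin: TextIO, stdout: TextIO) -> None:
        self.agent, self.stdin, self.stdout = agent, stdin, stdout
        self.state = "idle"

    def emit(self, event_type: str, **fields) -> None:
        self.stdout.write(json.dumps({"type": event_type, **fields}) + "\n")
        self.stdout.flush()

    def set_state(self, state: str) -> None:
        self.state = state
        self.emit("state", state=state)

    def handle(self, command: dict) -> None:
        kind = command.get("type")
        if kind == "prompt":
            self.set_state("running")
            try:
                self.agent.run_conversation(
                    command["text"],
                    on_delta=lambda chunk: self.emit("message_delta", content=chunk),
                )
                self.emit("done")
            finally:
                self.set_state("idle")  # always return to idle, even on failure
        elif kind == "get_state":
            self.emit("state", state=self.state)
        elif kind == "get_messages":
            start = command.get("from", 0)
            self.emit("messages", messages=self.agent.messages[start:])  # hypothetical event
        else:
            self.emit("error", message=f"unknown command: {kind!r}")

    def serve(self) -> None:
        for line in self.stdin:
            if not line.strip():
                continue
            try:
                command = json.loads(line)
            except json.JSONDecodeError as exc:
                self.emit("error", message=f"invalid JSON: {exc.msg}")
                continue
            try:
                self.handle(command)
            except Exception as exc:  # keep the loop alive on bad commands
                self.emit("error", message=str(exc))


commands = io.StringIO('{"type": "prompt", "text": "hi"}\nnot json\n{"type": "get_state"}\n')
output = io.StringIO()
RpcRunner(EchoAgent(), commands, output).serve()
events = [json.loads(line) for line in output.getvalue().splitlines()]
```

This loop is single-threaded, so it cannot read an `abort` command while a prompt is running. Supporting `abort` would likely mean running the agent turn on a worker thread or task while the main loop keeps reading stdin.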

**Phase 2: Advanced Control**
- Add `steer` command -- queue a message to deliver after the current tool call finishes (relates to #345 message coalescing)
- Add `follow_up` command -- queue message for after agent completes
- Add `set_model` command -- mid-session model switching
  - Requires AIAgent changes: model/config stored mutably, provider client re-initialized
  - Add `/model` slash command for interactive mode too
- Add `compact` command -- trigger context compression
- Add `get_sessions` / `switch_session` commands for session management
- Clarify callback forwarding -- agent's clarify questions become RPC `ui_request` events
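
One possible shape for the `set_model` change is sketched below: the model configuration is held mutably and the provider client is rebuilt on demand, while conversation history is kept. `ModelConfig`, `SwappableAgent`, and the client factory are hypothetical and do not describe the current `AIAgent`. Refusing a switch while a turn is running is one design choice among several (the alternative is to defer the switch until the turn ends).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ModelConfig:
    provider: str
    model: str


class SwappableAgent:
    """Keeps history across model changes; only the provider client is rebuilt."""

    def __init__(self, config: ModelConfig,
                 client_factory: Callable[[ModelConfig], object]) -> None:
        self._client_factory = client_factory
        self.history: list[dict] = []
        self.config = config
        self.client = client_factory(config)
        self.busy = False  # would be set to True while a turn is running

    def set_model(self, provider: str, model: str) -> None:
        if self.busy:
            raise RuntimeError("cannot switch models while a turn is running")
        new_config = ModelConfig(provider, model)
        # Build the new client first so a failure leaves the old one intact.
        new_client = self._client_factory(new_config)
        self.config, self.client = new_config, new_client


def fake_client_factory(config: ModelConfig) -> str:
    """Placeholder for real provider client construction."""
    return f"client<{config.provider}/{config.model}>"


agent = SwappableAgent(ModelConfig("anthropic", "claude-sonnet-4-20250514"),
                       fake_client_factory)
agent.history.append({"role": "user", "content": "Fix the bug in main.py"})
agent.set_model("example-provider", "example-model")  # placeholder names
```

The history survives the swap in this sketch, but a real implementation would still have to deal with the inconsistencies listed under Cons / Risks, such as different tool calling formats between providers.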

**Phase 3: SDK & Ecosystem**
- Python SDK: `HermesClient` class wrapping subprocess + RPC protocol
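  ```python
  # Proposed API sketch: the `hermes` SDK package and `HermesClient` do not exist yet.
  from hermes import HermesClient

  agent = HermesClient(model="anthropic/claude-sonnet-4-20250514")
  # `prompt()` would yield protocol events as the agent streams them.
  for event in agent.prompt("Fix the bug in main.py"):
      if event.type == "message_delta":
          print(event.content, end="", flush=True)
  ```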
- VS Code extension skeleton (TypeScript, spawns hermes RPC)
- Neovim plugin skeleton (Lua, spawns hermes RPC)
- HTTP/WebSocket adapter -- wrap RPC protocol for web UIs
- Documentation: protocol spec, integration guide, example clients

---

## Pros & Cons

### Pros
- **IDE integration** -- VS Code, Neovim, Emacs, JetBrains can embed Hermes as a coding assistant
- **Custom UIs** -- Anyone can build a web UI, desktop app, or mobile app around Hermes
- **CI/CD automation** -- Programmatically run agent tasks in pipelines with structured output
- **Model hot-swapping** -- Switch models mid-session based on task complexity (a cheap model for simple tasks, a stronger one for hard ones)
- **Composability** -- Other agents/systems can use Hermes as a component via RPC
- **Language agnostic** -- JSON Lines works from any language, not just Python
- **Leverages existing code** -- AIAgent callbacks already provide the right abstraction layer

### Cons / Risks
- **Protocol maintenance** -- RPC protocol becomes a public API contract; breaking changes are costly
- **Complexity** -- Third runtime mode adds surface area for bugs and testing
- **Streaming complexity** -- Proper streaming with abort/steer requires careful state management
- **Security** -- RPC mode inherits the spawning process's permissions; trust boundaries need clear documentation
- **Model switching complexity** -- Hot-swapping models mid-session may cause inconsistencies (different tokenizers, different tool calling formats)

---

## Open Questions

- Should RPC mode support authentication, or assume the spawning process is trusted?
- Should the protocol support binary data (images, audio) or only text/JSON?
- Should RPC mode be a separate entry point (`hermes rpc`) or a flag (`hermes --mode rpc`)?
- How should model switching interact with prompt caching (cache invalidation on model change)?
- Should we support WebSocket transport in addition to stdin/stdout for web UI use cases?
- Should the Python SDK be a separate package or part of the main hermes-agent install?

---

## References

- [Pi coding agent](https://pi.dev/) -- [GitHub repo](https://github.com/badlogic/pi-mono)
- [Pi RPC implementation](https://github.com/badlogic/pi-mono/tree/main/packages/coding-agent/src/modes) -- `rpc.ts`
- [pi-coding-agent Emacs package](https://github.com/dnouri/pi-coding-agent) -- RPC client example
- [HN discussion](https://news.ycombinator.com/item?id=47143754) -- community using RPC for integrations
- Related: #345 (Message Coalescing) -- steer/follow_up pattern
- Related: #299 (Multi-agent support) -- RPC enables embedding agents as sub-processes
- Pi is MIT licensed -- protocol design can be freely adapted

