feat(gateway): single gateway, multiple agents by davidgut1982 · Pull Request #34741 · NousResearch/hermes-agent

davidgut1982 · 2026-05-29T17:12:27Z

What

Enables a single hermes gateway run process to host N isolated AI agents, routing inbound messages by platform/chat/thread/user metadata. Each agent has independent memory, skills, SOUL.md, and model config. Routing is driven by a routes: table and an optional select_agent hook. This PR rebases and extends #25660 (original work by @02356abc) onto v0.15.0 with conflict resolution, and adds two commits: multi-agent routing for the api_server platform adapter, and a profile-scope fix ensuring the use_profile() ContextVar stays bound across async chains.

Files modified: gateway/agent_routing.py, gateway/config.py, gateway/delivery.py, gateway/platforms/api_server.py, gateway/platforms/base.py, gateway/session.py, agent/conversation_loop.py, agent/profile.py, and others (16 files scanned, all Windows-footguns-clean).

Why

The single-gateway bottleneck (#23735, #7517, #9514, #12099) limits multi-user and multi-workflow deployments. Multi-agent routing lets operators run multiple isolated agents in one process while keeping their state, models, and configurations separate. Session keys are namespaced (agent:<id>:...) and the feature is backward compatible: with only the default main agent configured, routing is a no-op.

Tests

pytest tests/agent/test_profile_contextvar.py tests/gateway/test_agent_routing.py tests/gateway/test_session.py -v

All tests pass; existing single-agent behavior is unchanged.

Platforms tested

Linux (CT/LXC environment, Python 3.13)

alt-glitch · 2026-05-29T17:27:24Z

Rebase of #25660 onto v0.15.0 with two additional commits (API server routing + profile propagation). Original authorship by @02356abc preserved. Related follow-up issues: #25695, #25696, #25697, #25698.

romansoft · 2026-05-31T19:08:00Z

This PR looks like the right architectural direction for multi-profile gateway routing.

One related enhancement that would make it especially useful for OpenAI-compatible clients like Open WebUI, LobeChat, LibreChat, etc.:

Could the API server optionally expose Hermes profiles as selectable OpenAI “models” and route requests by the incoming model field?

Current direction in this PR appears to support API-server profile routing via headers like:

X-Hermes-Chat-Id
X-Hermes-User-Id
X-Hermes-Thread-Id

That is useful for custom clients, but most OpenAI-compatible web UIs do not provide a clean way to set per-request custom routing headers from the model dropdown. They do already send the selected model in the request body.

Desired behavior:
GET /v1/models
could return allowed Hermes profiles as model IDs:
{
  "data": [
    { "id": "default", "object": "model", "owned_by": "hermes" },
    { "id": "researcher", "object": "model", "owned_by": "hermes" },
    { "id": "coder", "object": "model", "owned_by": "hermes" }
  ]
}

Then:
{
  "model": "researcher",
  "messages": [...]
}

would route the request into the researcher Hermes profile, including that profile’s config, SOUL.md, memory, skills, toolsets, and session scope.

Important security/operational constraints:

Only expose profiles explicitly allowlisted for API-server use.
Keep current behavior as the default for backward compatibility.
Reject unknown or unauthorized model/profile names with a clear 404/403-style OpenAI-compatible error.
Preserve existing header-based routing for custom clients.
Ideally support both /v1/chat/completions and /v1/responses.

This would let one Hermes API server appear in Open WebUI as multiple selectable agents/profiles, instead of needing one API server process and port per profile.
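A minimal sketch of the model-field resolution described above, not code from this PR. The names `ALLOWED_PROFILES`, `list_models`, and `resolve_profile` are hypothetical, and the error body only approximates the general shape of an OpenAI-style error object.

```python
from typing import Optional

# Hypothetical allowlist of profiles exposed through the API server.
ALLOWED_PROFILES = ["default", "researcher", "coder"]


def list_models(allowed: list) -> dict:
    """Build a /v1/models-style payload from the allowlisted profiles."""
    return {
        "object": "list",
        "data": [
            {"id": name, "object": "model", "owned_by": "hermes"}
            for name in allowed
        ],
    }


def resolve_profile(model: Optional[str], allowed: list, default: str = "default"):
    """Return (http_status, body) for an incoming `model` field.

    Assumption: a missing model keeps the default profile, and any name
    outside the allowlist is rejected rather than silently remapped.
    """
    if not model:
        return 200, {"profile": default}
    if model not in allowed:
        return 404, {
            "error": {
                "message": f"The model '{model}' does not exist",
                "type": "invalid_request_error",
                "code": "model_not_found",
            }
        }
    return 200, {"profile": model}
```

Header-based routing would stay untouched in this design; the model lookup would only be consulted when the operator opts in.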

davidgut1982 · 2026-05-31T19:52:48Z

@romansoft strong +1 — and worth zooming out, because you and I are asking for the same thing a lot of people have been asking for.

The demand is already well-documented. This PR alone references #23735, #7517, #9514, #12099. Beyond those, the "one gateway, many agents/personas" ask shows up repeatedly and independently:

feat: single gateway, multiple agents (MVP) #25660 — the multi-agent routing core this PR rebases (open since 2026-05-14, unreviewed)
feat: per-channel model and system prompt overrides for gateway platforms #1955 / PR feat: per-channel model and system prompt overrides for gateway platforms (Fixes #1955) #1991 — per-channel model + system-prompt overrides
[Feature] Topic-to-Profile routing: dispatch messages to different profiles based on Telegram topic/thread #10143 / PR feat(telegram): route forum topics to Hermes profiles #18510 — Telegram topic → profile routing
[Feature]: Per-channel profile routing for Discord (single bot, single gateway) #19809 — Discord channel → profile routing
Feature Request: Per-platform model configuration #14327, API server ignores per-platform model config (no way to run api_server on a different model than the global default) #34612 / PRs feat: per-platform default model/provider overrides #11439, feat(gateway): opt-in per-platform model via platform_models (#34612) #34620 — per-platform model
[Feature]: one gateway serves multiple agents — switch via /profile <name> or @<name> #24913 — "one gateway serves multiple agents, switch via /profile or @name"

Multiple authors, multiple platforms, multiple competing implementations — and several explicitly reject the current "run a separate gateway process per agent" answer as too heavyweight (especially for single-credential platforms). Your /v1/models idea is the cleanest client-facing expression of the same need: it makes all of that usable from the model dropdown in Open WebUI / LobeChat / LibreChat with zero client-side config.

So what's actually stopping it? Not effort — the code largely exists (this PR, #25660, #1991, #18510, #11439…). The blocker is that none of it is merged and there's no ratified design yet for how per-target routing should work: agents:/routes: (this PR) vs channel_overrides (#1991) vs topic_profiles (#18510) vs per-platform model. They overlap and partly conflict, and there's the separate agent_profiles: (delegation) concept in #9459 that uses the word "agent" too. Building a /v1/models surface on top of any one of these means betting on which routing mechanism wins review — and if it lands a different shape, the dropdown layer churns.

That's the only reason I'd sequence it as a follow-up rather than fold it in here: get the routing core ratified first, then add model-field → profile resolution on top (happy to author it, with your constraints — allowlist, backward-compatible default, clean OpenAI-shaped 404/403 for unknown profiles, headers preserved, both /v1/chat/completions and /v1/responses).

To be clear, I'm not trying to drive the direction here — just helping where I can and happy to adapt to whatever shape the maintainers prefer. The thing that would unblock all of it is a steer on the routing primitive. @teknium1 / @alt-glitch — is there a preferred direction among these (this PR's routes: table vs the per-channel/topic override approaches)? Once there's a blessed primitive, the client-facing layer @romansoft is describing is a small, well-scoped follow-up, and I'm glad to do the legwork to match whatever's chosen.

Introduce AgentProfile dataclass and a ContextVar (_current_agent_profile) that lets path getters (get_hermes_home, get_skills_dir, get_memory_dir) resolve to the active agent's home directory under asyncio.

- agent/profile.py: AgentProfile, use_profile() context manager, load_agent_registry() from GatewayConfig
- hermes_constants.py: get_hermes_home() reads ContextVar before env fallback
- tests/agent/test_profile_contextvar.py: ContextVar isolation under asyncio.gather, nested contexts, registry loading

Single-agent installs see zero change; no profile bound means fallback to HERMES_HOME env var as before.

Add agent_id field to SessionSource and SessionEntry, prefix session keys with agent:<id>: in build_session_key. Default "main" preserves every historical key string for single-agent installs.

- gateway/session.py: SessionSource.agent_id, SessionEntry.agent_id, build_session_key prefixing
- hermes_state.py: sessions table migration (agent_id TEXT DEFAULT 'main'), new idx_sessions_agent index
- tests/gateway/test_session.py: build_session_key prefixing for all chat_type × agent_id combinations
- tests/*/test_session_boundary_hooks.py: hook payload agent_id kwarg
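A rough illustration of agent-prefixed session keys. Only the `agent:<id>:` prefix and the `agent_id` default of "main" come from the commit message; the remaining `SessionSource` fields and the layout of the key after the prefix are assumptions, and the real `build_session_key` may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionSource:
    # Fields other than agent_id are hypothetical stand-ins for this sketch.
    platform: str
    chat_type: str
    chat_id: str
    thread_id: Optional[str] = None
    agent_id: str = "main"


def build_session_key(source: SessionSource) -> str:
    """Namespace the key by agent so two agents never share a session."""
    parts = ["agent", source.agent_id, source.platform, source.chat_type, source.chat_id]
    if source.thread_id:
        parts.append(source.thread_id)
    return ":".join(parts)
```

Two sources that differ only in `agent_id` produce different keys, which is what keeps per-agent conversation state apart in a shared sessions table.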

… hook

Add declarative routing (routes: match → agent) and a select_agent plugin hook. _attach_agent_id injects the resolved agent_id into event.source before build_session_key. Seven platform adapters get pre-injection for batching paths; the rest inherit it from base.py.

- gateway/agent_routing.py: resolve_agent_id(), _route_matches()
- gateway/config.py: agents, routes, default_agent schema
- gateway/platforms/base.py: _attach_agent_id(), set_routing_context()
- gateway/platforms/{telegram,discord,slack,matrix,feishu,wecom,yuanbao}.py: pre-batch injection
- hermes_cli/plugins.py: select_agent hook registration
- tests/gateway/test_agent_routing.py: declared-order matching, hook chain, default fallback, profile isolation
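One way the declared-order matching with a hook chain and default fallback could look. This is a sketch under assumptions: the `match`/`agent` key names, the dict-shaped source, and the choice to consult hooks before the routes table are illustrative and may not match `gateway/agent_routing.py`.

```python
from typing import Callable, Iterable, Optional

# A hook receives the source metadata and may return an agent id or None.
Hook = Callable[[dict], Optional[str]]


def _route_matches(match: dict, source: dict) -> bool:
    """A route matches when every field it names equals the source's value."""
    return all(source.get(key) == value for key, value in match.items())


def resolve_agent_id(
    source: dict,
    routes: Iterable[dict],
    default_agent: str = "main",
    hooks: Iterable[Hook] = (),
) -> str:
    # Assumption: hooks get the first say; the first non-empty answer wins.
    for hook in hooks:
        chosen = hook(source)
        if chosen:
            return chosen
    # Routes are checked in declared order, so earlier entries take precedence.
    for route in routes:
        if _route_matches(route.get("match", {}), source):
            return route["agent"]
    return default_agent


ROUTES = [
    {"match": {"platform": "telegram", "chat_id": "100"}, "agent": "researcher"},
    {"match": {"platform": "telegram"}, "agent": "coder"},  # platform-only catch-all
]
```

Because matching stops at the first hit, a specific route must be declared before a broader catch-all for the same platform.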

…s agent_id to hooks

GatewayRunner loads the agent registry at init and wraps every inbound message in use_profile(). AIAgent accepts an optional profile= kwarg. All invoke_hook call sites gain agent_id= kwarg. _handle_message is split into _handle_message (ContextVar plumbing) + _handle_message_inner (legacy logic) so tests that grep the source body continue to work.

- gateway/run.py: registry loading, use_profile() wrapping, hook kwargs
- run_agent.py: AIAgent(profile=), profile-aware model/toolset resolution
- model_tools.py, tools/{approval,terminal,delegate}.py: hook agent_id
- cli.py, tui_gateway/server.py: session boundary hook agent_id
- tests/gateway/test_profile_overrides.py: per-agent model/toolset overrides
- tests/test_model_tools.py: hook payload verification
- tests/gateway/test_{update,title,reasoning}_command.py: adapt to _handle_message split

…veries

Cron tick and delivery routing now bind the correct profile before execution. jobs.py does NOT persist agent_id in JSON; the directory is the identity. Delivery uses nullcontext() for the unrouted case.

- cron/jobs.py: in-memory agent_id stamping at read time, directory-based identity (no JSON field)
- cron/scheduler.py: use_profile() wrapper in tick path
- gateway/delivery.py: use_profile() wrapper per delivery target
- tests/cron/test_scheduler.py: agent_id propagation in delivery targets

New hermes agent subcommand group: list, show, add, remove. Manages agent profiles and routing config in ~/.hermes/config.yaml.

- hermes_cli/agent.py: cmd_agent_list, cmd_agent_show, cmd_agent_add, cmd_agent_remove with profile cloning and route cleanup
- hermes_cli/main.py: parser registration
- tests/hermes_cli/test_agent_cli.py: list/show/add/remove coverage, route orphan warnings, SOUL summarization

The OpenAI-compatible HTTP adapter was the one inbound surface from PR NousResearch#25660 that never called ``_attach_agent_id``: every ``/v1/chat/completions``, ``/v1/responses``, and ``/v1/runs`` request fell through to ``default_agent`` regardless of the configured routes, silently undermining the multi-agent guarantee on any deployment that exposes the API server.

Add a single routing entry point, ``_resolve_agent_profile``, that:

* Reads ``X-Hermes-Chat-Id`` / ``X-Hermes-User-Id`` / ``X-Hermes-Thread-Id`` from the request (sanitised through the same length + control-char caps as the existing ``X-Hermes-Session-Id`` / ``X-Hermes-Session-Key``).
* Builds a synthetic ``SessionSource(platform=API_SERVER, …)`` and pipes it through the shared ``_attach_agent_id`` hook so declarative routes *and* the ``select_agent`` plugin hook fire identically to every other adapter.
* Looks up the resolved ``agent_id`` in ``self._gateway_ref._agent_registry`` and returns the matching ``AgentProfile`` (or ``None`` for legacy single-agent installs).

The three agent-invoking handlers (chat completions, responses, runs) now resolve the profile up front and bind it via ``use_profile`` for the duration of the run. Binding happens twice (once on the asyncio side and once inside the executor thread) because asyncio's default executor does not propagate ContextVars.

Behaviour is fully backward compatible: requests with no routing headers (the existing OpenAI-API contract) resolve to ``default_agent``, exactly the current behaviour.

New tests in ``tests/gateway/test_api_server_routing.py`` cover:

* Header sanitisation (CRLF rejection, length caps, whitespace).
* Route resolution: matching, no-header fall-through, unmatched header fall-through, ``platform``-only catch-all, ``user_id`` and ``thread_id`` routes, route-order precedence.
* Resilience: missing gateway reference, empty registry.
* ContextVar isolation under ``asyncio.gather`` so two concurrent HTTP requests with different chat_ids stay isolated.

Refs: PR NousResearch#25660 (single-gateway multi-agent).

…t_id

Two Code Critic WARN findings from multi-agent apply review:

1. api_server.py: move _active_run_agents registration inside the `with use_profile` block so any post-construction lazy attribute access on the asyncio thread sees the correct per-agent profile. The executor thread re-binds independently in _run_sync.
2. delegate_tool.py: document why _build_child_progress_callback uses `subagent_id=` (TUI spawn-tree identity) while invoke_hook uses `agent_id=` (multi-agent routing). Different consumers, same run.

davidgut1982 · 2026-06-02T16:28:20Z

Superseded by a chain of 6 smaller, focused PRs per CONTRIBUTING.md guidance (one logical change, reviewable in ~15 min). Each builds on the previous and should be merged in order:

Chain: #37495 → #37496 → #37497 → #37498 → #37500 → #37502

feat(agent): add AgentProfile + ContextVar for per-agent paths (1/6) #37495 feat(agent): AgentProfile + ContextVar for per-agent paths
feat(session): thread agent_id through session identity & DB schema (2/6) #37496 feat(session): thread agent_id through session identity & DB schema
feat(gateway): route inbound messages via routes table + GatewayRunner profile binding (3/6) #37497 feat(gateway): routes table + select_agent hook + GatewayRunner profile binding
feat(cli): add hermes agent subcommand for multi-agent management (4/6) #37498 feat(cli): hermes agent subcommand
feat(cron+api): propagate agent_id through jobs/deliveries + wire api_server routing (5/6) #37500 feat(cron+api): propagate agent_id through jobs/deliveries + api_server routing
docs+fix(gateway): multi-agent routing guide + use_profile scope fix (6/6) #37502 docs+fix: multi-agent routing guide + use_profile scope fix

The chain is rebased onto current main; the net diff of #37502 over main is the same 47-file feature set as this PR, with three minor cherry-pick conflicts resolved (documented in the respective PR bodies).

davidgut1982 force-pushed the feat/single-gateway-multi-agent branch from 9cd1a37 to f855a5f Compare May 29, 2026 17:23

davidgut1982 mentioned this pull request Jun 1, 2026

feat(dashboard): read-only Agents view for delegation agent_profiles #36147

Draft

02356abc and others added 9 commits June 2, 2026 16:00

docs: multi-agent routing guide + sample config

048bcff

davidgut1982 force-pushed the feat/single-gateway-multi-agent branch from f855a5f to 1828ac2 Compare June 2, 2026 16:08

davidgut1982 closed this Jun 2, 2026

davidgut1982 wants to merge 9 commits into NousResearch:main from davidgut1982:feat/single-gateway-multi-agent

4 participants
