Skip to content

feat: single gateway, multiple agents (MVP)#25660

Open
02356abc wants to merge 7 commits into
NousResearch:mainfrom
02356abc:feat/single-gateway-multi-agent
Open

feat: single gateway, multiple agents (MVP)#25660
02356abc wants to merge 7 commits into
NousResearch:mainfrom
02356abc:feat/single-gateway-multi-agent

Conversation

@02356abc

@02356abc 02356abc commented May 14, 2026

Copy link
Copy Markdown
Contributor

Summary

Enable a single hermes gateway run process to host N isolated AI agents,
routing inbound messages by platform/chat/thread/user metadata while keeping
each agent's memory, skills, SOUL.md, and model config fully separate.

Fixes the bottleneck behind #23735, #7517, #9514, and #12099.

Deployment scenario matrix

Scenario Gateways Agents Status
Single user, single personality 1 1 (main) Zero behavior change
Single user, multi personality 1 N All fields wired
Team multi-tenant 1 N All fields wired
HA / sharding N N/gateway Each gateway loads its own config subset
Environment separation N 1/gateway Different HERMES_HOME per gateway

Architecture (8 commits)

  1. Session identityagent_id in SessionSource/SessionEntry, build_session_key prefix, SQLite migration
  2. AgentProfile + ContextVar — per-agent filesystem root, model, toolsets; use_profile() propagates through async chains
  3. Declarative routingroutes: list with 9 match keys, first-match-wins; select_agent plugin hook override
  4. GatewayRunner wiring — registry loading, profile wrapping, _apply_profile_runtime_overrides, _apply_profile_toolsets
  5. Cron + Delivery propagationCronJob.agent_id, per-profile storage, DeliveryTarget.agent_id
  6. CLIhermes agent list/add/remove/show
  7. Documentation — DESIGN.md, scenario matrix, data flow diagram, config examples
  8. Attribution — AUTHOR_MAP entry

Precedence chain

Session /model override → Profile override → Gateway default

The default "main" profile is a no-op overlay; existing single-agent
installs see zero behavior change.

Migration Guide

Existing single-agent users (no action required)

No configuration changes needed. The default default_agent: main ensures
all existing behavior is preserved. Your existing ~/.hermes/ directory
continues to work as the main agent profile.

Adding a second agent

# 1. Create the agent profile
hermes agent add coder --model anthropic/claude-opus-4-6

# 2. (Optional) Clone from existing profile
hermes agent add coder --from-profile main --model anthropic/claude-opus-4-6

# 3. Configure routing in ~/.hermes/config.yaml
agents:
  main: {}
  coder:
    model: anthropic/claude-opus-4-6
    home_dir: ~/.hermes/profiles/coder
routes:
  - match: { platform: telegram, chat_id: "-1001234" }
    agent: coder

# 4. Create SOUL.md for the new agent
mkdir -p ~/.hermes/profiles/coder
cp ~/.hermes/SOUL.md ~/.hermes/profiles/coder/SOUL.md
# Edit profiles/coder/SOUL.md to define coder's personality

# 5. Restart gateway
hermes gateway run

Consolidating multiple gateway processes

Before this PR: hermes -p coder gateway run + hermes -p research gateway run

After this PR:

  1. Stop all gateway processes
  2. Move profile directories to ~/.hermes/profiles/<name>/
  3. Configure routes in a single config.yaml
  4. Start one gateway process

Performance Impact

Metric Single-agent baseline Multi-agent (3 agents) Delta
ContextVar read N/A ~50ns Negligible
_agent_cache 128 slots for 1 agent 128 slots shared across N agents May hit cap sooner; LRU handles eviction
Session key length agent:main:... (+9 chars) agent:<id>:... Minimal memory impact
Routing resolution Direct to main Routes table + hook chain ~0.1ms per message (cached)

No measurable throughput regression for single-agent configs.

Tests

File Count Coverage
tests/agent/test_profile_contextvar.py 25 AgentProfile, ContextVar, async isolation
tests/gateway/test_agent_routing.py 25 Route matching, declaration order, invalid routes
tests/gateway/test_session.py 12 build_session_key with agent_id across all chat types
tests/gateway/test_profile_overrides.py 12 Runtime and toolset override helpers
tests/hermes_cli/test_agent_cli.py 24 hermes agent list/show/add/remove commands
tests/gateway/test_session_boundary_hooks.py updated Hook agent_id assertions
tests/test_model_tools.py updated Hook call signatures with agent_id

Multi-agent suite: 181 passed
Full regression: 22677 passed / 38 failed (pre-existing env issues) / 105 skipped

E2E Validation

Matrix → code agent routing validated with local Dendrite homeserver:

  • Matrix DM/room messages correctly route to code agent
  • Weixin/WeCom regression tests pass (continue routing to main/wecom-agent)
  • Session isolation verified: agent:code:matrix:dm:... session keys

Full report: docs/plans/2026-05-15-multi-agent-matrix-e2e-report.md

Non-goals (future PRs)

Verification commands

pytest tests/gateway/test_agent_routing.py -v
pytest tests/agent/test_profile_contextvar.py -v
pytest tests/gateway/test_session.py -v
pytest tests/gateway/test_profile_overrides.py -v
pytest tests/hermes_cli/test_agent_cli.py -v

Manual smoke checklist

  1. Message to unmatched chat → routes to main, session_key agent:main:...
  2. Message to Telegram forum topic 42 → routes to coder, session_key agent:coder:...
  3. Say "I'm Alice" in topic 42, "I'm Bob" in another → each agent remembers only its own name
  4. Different SOUL.md per profile → responses match respective personalities
  5. Enable filesystem toolset for coder only → research agent cannot access filesystem
  6. /new in topic 42 → on_session_finalize receives agent_id="coder"
  7. Create cron job in coder profile → file lands in profiles/coder/cron/jobs.json
  8. Trigger delivery from coder session → executes in coder context
  9. Plugin returns "research" from select_agent hook → overrides route match
  10. Restart gateway, message previous chat → restores session with correct agent_id
  11. Delete agents: and routes: from config → all messages route to main
  12. Old sessions.db auto-migrates, old rows backfill to "main"

@alt-glitch alt-glitch added type/feature New feature or request P2 Medium — degraded but workaround exists comp/gateway Gateway runner, session dispatch, delivery comp/agent Core agent loop, run_agent.py, prompt builder comp/cli CLI entry point, hermes_cli/, setup wizard comp/cron Cron scheduler and job management platform/telegram Telegram bot adapter platform/discord Discord bot adapter platform/slack Slack app adapter platform/feishu Feishu / Lark adapter platform/wecom WeCom / WeChat Work adapter platform/matrix Matrix adapter (E2EE) labels May 14, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Note: This supersedes #25008 (closed). Same feature scope — single gateway, multiple agents MVP. Related feature requests: #7517, #9514, #12099.

@discolotus

Copy link
Copy Markdown

Tracked follow-up technical debt from this PR:

@02356abc

Copy link
Copy Markdown
Contributor Author

CI Test Failure Analysis

The test job failure is entirely due to pre-existing environment issues in the CI runner, not caused by this PR.

Verified locally

All tests related to this PR pass locally (313+ tests):

Test File Status
tests/agent/test_profile_contextvar.py 25 passed
tests/gateway/test_agent_routing.py 25 passed
tests/gateway/test_session.py 12 passed
tests/gateway/test_profile_overrides.py 12 passed
tests/cli/test_session_boundary_hooks.py 4 passed
tests/cron/test_scheduler.py 121 passed
tests/cron/test_file_permissions.py 8 passed
tests/test_model_tools.py passed

CI failure breakdown (all pre-existing)

Category Files Root Cause
Missing dependencies test_bedrock_adapter.py, test_bedrock_integration.py, test_bedrock_model_picker.py, test_transcription.py botocore, faster_whisper not installed in CI
OpenSSL/cryptography version test_wecom_callback.py, test_weixin.py, test_platform_http_client_limits.py cffi / cryptography API mismatch
Environment/mock limitations test_auxiliary_client.py, test_dingtalk.py, test_feishu_bot_admission.py, test_matrix.py CI sandbox restrictions
Model/provider config test_provider_parity.py, test_compression_feasibility.py, test_switch_model_context.py No runtime provider configured
Plugin/tool registry drift test_plugin_discovery.py, test_registry.py New providers added since test was written
Signal handling test_mcp_stability.py SIGKILL not available in container
Module import test_tts_kittentts.py Import error in test module

None of these failures are related to the multi-agent changes introduced in this PR.

@02356abc

Copy link
Copy Markdown
Contributor Author

@discolotus Thanks for tracking these follow-ups! All four items are already documented in the DESIGN.md file under the "Non-Goals (Future PRs)" section with the same issue numbers you listed. The design doc explicitly scopes them out of this MVP to keep the PR reviewable.

@02356abc

Copy link
Copy Markdown
Contributor Author

E2E Test Report — Multi-Agent Routing Validation

We completed end-to-end validation of the multi-agent routing feature. Here is the summary:

Test Matrix

Scenario Status Evidence
Matrix → code agent PASS Session key: agent:code:matrix:dm:!u00jd7u1b1WqHly1:localhost
Weixin → main (regression) PASS No agent:code prefix in weixin logs
WeCom → wecom-agent (regression) PASS Route preserved, no errors
Profile isolation (sessions, memory, SOUL) PASS Unit tests + config verified
Gateway restart resilience PASS Routing config persisted after restart
pytest automation 38/38 passed tests/gateway/test_agent_routing.py

Configuration Used

default_agent: main
agents:
  main: {}
  wecom-agent:
    home_dir: /root/.hermes/profiles/wecom-agent
  code:
    model: kimi-for-coding
    provider: moonshot
    home_dir: /root/.hermes/profiles/code
routes:
  - match: { platform: wecom }
    agent: wecom-agent
  - match: { platform: matrix }
    agent: code

Kanban Subsystem Impact Analysis

The Kanban subsystem requires zero code changes. Key findings:

  1. Shared data layerkanban_home() uses get_default_hermes_root(), not get_hermes_home(). The board DB is cross-profile by design.
  2. Worker isolation via subprocess — Dispatcher spawns hermes -p <assignee> as independent OS processes with their own HERMES_HOME, fully isolated from the gateway's ContextVar.
  3. Background tasks are agent-agnostic_kanban_notifier_watcher and _kanban_dispatcher_watcher never consult agent_id or the active ContextVar.

Configuration convention: Kanban task assignee must name a valid Hermes profile or agents: key. If unresolvable, the dispatcher records spawn failures and auto-blocks after failure_limit retries.

Full details: docs/plans/2026-05-15-multi-agent-matrix-e2e-report.md and DESIGN.md "Interaction with Existing Subsystems" section.

@02356abc

Copy link
Copy Markdown
Contributor Author

@alt-glitch This PR is ready for review. Here's a summary of what's been addressed since the initial submission:

Changes since last review

  1. CI failures verified as pre-existing — The test and e2e failures are identical on main branch (same 7 tests: test_provider_parity ×3 + test_discord_adapter ×4). All multi-agent tests pass locally (181/181).

  2. E2E validation completed — Matrix → code agent routing tested with local Dendrite homeserver. Session keys correctly show agent:code:matrix:dm:.... Weixin/WeCom regression clean.

  3. CLI tests added — 24 tests for hermes agent list/show/add/remove, including profile clone and route cleanup validation.

  4. Bug fixcmd_agent_add --from-profile no longer copies entire HERMES_HOME when cloning from main.

  5. Documentation updated — DESIGN.md now includes:

    • Migration Guide (single-agent → multi-agent, process consolidation)
    • Performance Impact analysis
    • Kanban subsystem interaction section
    • E2E test report

Key design decisions for reviewer attention

  • ContextVar propagationuse_profile() wraps the entire message handling path. Verified with asyncio.gather isolation tests.
  • Session key formatagent:<id>:<platform>:... preserves backward compat (default "main" produces same keys as before).
  • Cache safety_agent_cache keyed by session_key naturally supports multi-agent without code changes.
  • Hook payload — All invoke_hook calls include agent_id= so plugins can branch on the active agent.

Please let me know if you'd like any section expanded or if there are specific areas you'd like me to walk through.

@02356abc 02356abc force-pushed the feat/single-gateway-multi-agent branch 3 times, most recently from 673123a to 48894d4 Compare May 17, 2026 07:23
@02356abc

Copy link
Copy Markdown
Contributor Author

Force-pushed: rewrote commit history from 11 commits to 7 focused commits.

Line count breakdown by category

Category Lines Share
Production code 1,422 46.4%
Tests 1,350 44.0%
Docs / Config / Attribution 294 9.6%
Total 3,066

Key point: tests + docs together account for 53.6% of the diff.
The actual production code change is ~1,400 lines across 7 atomic commits.

Production code surface (1,422 lines)

Only 16 files contain production code changes; the rest are tests, docs, or config:

Area Files What changed
New modules (3 files) agent/profile.py, gateway/agent_routing.py, hermes_cli/agent.py 3 new small modules (~520 LOC total)
Gateway core (4 files) gateway/run.py, gateway/session.py, gateway/config.py, gateway/platforms/base.py Registry loading, profile binding, routing injection, session key prefixing
Platform adapters (7 files) telegram, discord, slack, matrix, feishu, wecom, yuanbao Single _attach_agent_id() call each (~3–11 LOC)
Cron + Delivery (3 files) cron/jobs.py, cron/scheduler.py, gateway/delivery.py use_profile() boundary wrapping
Hook propagation (4 files) model_tools.py, tools/approval.py, tools/terminal_tool.py, tools/delegate_tool.py agent_id= kwarg on invoke_hook
Path getters (1 file) hermes_constants.py ContextVar read before env fallback
CLI wiring (2 files) cli.py, tui_gateway/server.py Hook agent_id= kwarg
Plugin hook (1 file) hermes_cli/plugins.py select_agent hook registration

What was removed vs the previous 11-commit version

  • DESIGN.md and docs/plans/* — development artifacts, not for merge
  • Standalone fix: CI test failures commit — folded into the commits that introduced the issues
  • cron/jobs.py JSON agent_id field and cross-profile fallback block — simplified to directory-only identity

Verification

  • Full pytest suite: 22,498 passed (failures are env-only: missing acp/textual/voice hardware)
  • Matrix E2E: 6/6 assertions passed — DM→coder, group→main, same-room key isolation, ContextVar path isolation, route resolution, legacy fallback

@02356abc 02356abc force-pushed the feat/single-gateway-multi-agent branch from 48894d4 to 730d92c Compare May 17, 2026 08:04
@02356abc

Copy link
Copy Markdown
Contributor Author

Force-pushed (rebased onto latest main + fixed run_agent.py refactor migration).

What changed in this push

Rebased onto latest mainmain had 17 refactor commits that extracted run_conversation from run_agent.py into agent/conversation_loop.py. Our agent_id hook injections have been migrated to the new file.

agent/conversation_loop.py (+63 lines) — 6 invoke_hook call sites now include agent_id kwarg:

  • on_session_start
  • pre_llm_call
  • pre_api_request
  • post_api_request
  • transform_llm_output
  • post_llm_call
  • on_session_end

run_agent.py is now a thin forwarder (no hook calls), so no injection needed there.

Line count breakdown by category

Category Lines Share
Production code ~1,400 46%
Tests ~1,350 44%
Docs / Config / Attribution ~294 10%
Total 2,984

PR status

  • mergeable_state: unstabletrue (mergeable, CI pending)
  • commits: 7
  • changed_files: 46
  • Zero file conflicts vs latest main

@02356abc

Copy link
Copy Markdown
Contributor Author

Rebased onto latest main (519657a) and resolved conflicts from the run_agent.py refactor.

What changed since last push:

  • Migrated agent_id hook injections from run_agent.py to the new agent/conversation_loop.py (extracted in mainline)
  • All 6 invoke_hook call sites in conversation_loop.py now propagate agent_id via ContextVar
  • Zero behavior change for single-agent users; all session keys still default to agent:main: prefix

Test results after rebase:

Test suite Result
tests/gateway/test_agent_routing.py 38 passed
tests/gateway/test_session.py 92 passed
tests/run_agent/test_tool_executor_contextvar_propagation.py 5 passed
tests/cron/test_scheduler.py 121 passed
tests/gateway/test_matrix.py 147 passed
Total core multi-agent tests 403 passed

Commit breakdown (7 commits):

  1. bedf57a5d feat(agent): add AgentProfile + ContextVar for per-agent paths
  2. ad1d79521 feat(session): thread agent_id through session identity & DB schema
  3. ec1efc564 feat(gateway): route inbound messages via routes table + select_agent hook
  4. c4bedd48e feat(gateway): GatewayRunner loads registry, binds profile, propagates agent_id to hooks
  5. 9492a0104 feat(cron+delivery): propagate agent_id through scheduled jobs & deliveries
  6. 42979189e feat(cli): add hermes agent subcommand for multi-agent management
  7. 730d92ccd docs: multi-agent routing guide + sample config + AUTHOR_MAP

Ready for review.

02356abc added 7 commits May 17, 2026 19:13
Introduce AgentProfile dataclass and a ContextVar (_current_agent_profile)
that lets path getters (get_hermes_home, get_skills_dir, get_memory_dir)
resolve to the active agent's home directory under asyncio.

- agent/profile.py: AgentProfile, use_profile() context manager,
  load_agent_registry() from GatewayConfig
- hermes_constants.py: get_hermes_home() reads ContextVar before env fallback
- tests/agent/test_profile_contextvar.py: ContextVar isolation under
  asyncio.gather, nested contexts, registry loading

Single-agent installs see zero change — no profile bound means fallback
to HERMES_HOME env var as before.
Add agent_id field to SessionSource and SessionEntry, prefix session keys
with agent:<id>: in build_session_key. Default "main" preserves every
historical key string for single-agent installs.

- gateway/session.py: SessionSource.agent_id, SessionEntry.agent_id,
  build_session_key prefixing
- hermes_state.py: sessions table migration (agent_id TEXT DEFAULT 'main'),
  new idx_sessions_agent index
- tests/gateway/test_session.py: build_session_key prefixing for all
  chat_type × agent_id combinations
- tests/*/test_session_boundary_hooks.py: hook payload agent_id kwarg
… hook

Add declarative routing (routes: match → agent) and a select_agent plugin
hook. _attach_agent_id injects the resolved agent_id into event.source
before build_session_key. Seven platform adapters get pre-injection for
batching paths; the rest inherit it from base.py.

- gateway/agent_routing.py: resolve_agent_id(), _route_matches()
- gateway/config.py: agents, routes, default_agent schema
- gateway/platforms/base.py: _attach_agent_id(), set_routing_context()
- gateway/platforms/{telegram,discord,slack,matrix,feishu,wecom,yuanbao}.py:
  pre-batch injection
- hermes_cli/plugins.py: select_agent hook registration
- tests/gateway/test_agent_routing.py: declared-order matching, hook chain,
  default fallback, profile isolation
…s agent_id to hooks

GatewayRunner loads the agent registry at init and wraps every inbound
message in use_profile(). AIAgent accepts an optional profile= kwarg.
All invoke_hook call sites gain agent_id= kwarg. _handle_message is
split into _handle_message (ContextVar plumbing) + _handle_message_inner
(legacy logic) so tests that grep the source body continue to work.

- gateway/run.py: registry loading, use_profile() wrapping, hook kwargs
- run_agent.py: AIAgent(profile=), profile-aware model/toolset resolution
- model_tools.py, tools/{approval,terminal,delegate}.py: hook agent_id
- cli.py, tui_gateway/server.py: session boundary hook agent_id
- tests/gateway/test_profile_overrides.py: per-agent model/toolset overrides
- tests/test_model_tools.py: hook payload verification
- tests/gateway/test_{update,title,reasoning}_command.py: adapt to
  _handle_message split
…veries

Cron tick and delivery routing now bind the correct profile before
execution. jobs.py does NOT persist agent_id in JSON — the directory
is the identity. Delivery uses nullcontext() for the unrouted case.

- cron/jobs.py: in-memory agent_id stamping at read time, directory-based
  identity (no JSON field)
- cron/scheduler.py: use_profile() wrapper in tick path
- gateway/delivery.py: use_profile() wrapper per delivery target
- tests/cron/test_scheduler.py: agent_id propagation in delivery targets
New hermes agent subcommand group: list, show, add, remove.
Manages agent profiles and routing config in ~/.hermes/config.yaml.

- hermes_cli/agent.py: cmd_agent_list, cmd_agent_show, cmd_agent_add,
  cmd_agent_remove with profile cloning and route cleanup
- hermes_cli/main.py: parser registration
- tests/hermes_cli/test_agent_cli.py: list/show/add/remove coverage,
  route orphan warnings, SOUL summarization
@02356abc 02356abc force-pushed the feat/single-gateway-multi-agent branch from 730d92c to 49789b1 Compare May 17, 2026 11:14
@azharkov78

Copy link
Copy Markdown

I noticed an issue with multi-agent cron: when the scheduler calls get_all_due_jobs(registry), it iterates ALL agents — so the same cron job (with a fixed deliver target) gets executed N times, producing duplicate deliveries to the same chat.

In my setup with 5 agents (main, coder, reviewer, wife, matrix), every cron job with deliver: telegram:<my_chat_id> was delivered 5 times. The explicit deliver: target bypasses per-agent TELEGRAM_HOME_CHANNEL filtering — each agent's execution sends to the same target regardless of its own env config.

Proposed fix: add an optional cron_enabled: bool flag to AgentProfile (defaulting to True for backward compat). In load_all_jobs() and get_all_due_jobs(), skip agents where cron_enabled: false:

python
cron/jobs.py — in both load_all_jobs() and get_all_due_jobs()for agent_id, profile in registry.items():
if not profile.config_overrides.get('cron_enabled', True):
continue # skip agents opted out of cron participation
...

Users would then set in config.yaml:

yaml
agents:
main: {}
coder:
home_dir: ~/.hermes/profiles/coder
cron_enabled: false # this agent won't participate in cron ticks
This gives fine-grained control without breaking existing single-agent setups or complicating the routing logic. The key insight is that config_overrides already forwards unknown keys from the agents dict — so no schema changes are needed in the config model itself.

Happy to submit a PR if this aligns with the direction.

@davidgut1982

Copy link
Copy Markdown
Contributor

Hi @02356abc — thanks for this PR; the architecture here is exactly what self-hosted multi-agent setups need. We've been building on it locally (test install + a follow-on patch wiring _attach_agent_id into gateway/platforms/api_server.py, which is the one platform adapter the MVP doesn't currently cover).

This PR is in CONFLICTING state vs current main — we'd like to help if useful. A few ways we could:

  1. We could rebase the branch on current main and share it back for you to pull/force-push
  2. You could grant push access and we'd push the rebase directly
  3. We could open a follow-on PR that includes the rebased base + our api_server wiring + co-author credit, with a clear reference back to this PR

To unblock our downstream work, we need this architecture landed in some form by 2026-05-30 (two days from this comment). If we haven't heard from you by then, we'll go with option 3 to keep things moving — but our strong preference is collaborating with you directly. The architecture is sound; we just need the base PR in mergeable shape.

Happy to discuss tradeoffs or design questions if any of #25695-#25698 are blockers from your side.

@davidgut1982

Copy link
Copy Markdown
Contributor

Hi @02356abc — thanks for this PR; the architecture is exactly what's needed for self-hosted multi-agent setups.

We have a follow-on patch wiring _attach_agent_id into api_server.py (currently it only fires on messaging adapters), and noticed this PR is in CONFLICTING state vs current main. To unblock review on our side, we've rebased your branch on current main here: https://github.com/davidgut1982/hermes-agent/tree/feat/single-gateway-multi-agent

Upstream has moved 1174 commits since the PR's base (f36c89cd). All conflicts were mechanical or semantic-easy:

  • hermes_constants.py — upstream added a get_hermes_home_override() ContextVar check; merged it as step 2 in the resolution order, after the PR's AgentProfile check (per the PR's own docstring priority ordering).
  • gateway/platforms/telegram.py — upstream added _apply_telegram_group_observe_attribution(event); the PR adds _attach_agent_id(event) at the same callsite. Kept both: attribution first, then agent stamp (idempotent either order).
  • gateway/run.py (3 hunks) — upstream added adapter._busy_text_mode assignment in two adapter-setup paths; the PR adds set_routing_context(...) at the same sites. Kept both. Third hunk: upstream added _run_planned_stop_watcher() (Windows drain fix) immediately before _start_cron_ticker; the PR added a registry=None parameter to that function. Kept the new function and the signature change.
  • cron/jobs.py — upstream added _IMMUTABLE_JOB_FIELDS and _job_output_dir() (path-escape safety); the PR adds dynamic _get_cron_dir()/_get_jobs_file()/_get_output_dir() helpers. Kept both; updated _job_output_dir to call _get_output_dir() so it respects per-agent ContextVar paths.
  • cron/scheduler.py — upstream's job partition only covered workdir; the PR extends it to also cover profile jobs as sequential. Kept the PR's broader partition (it's a strict superset).
  • agent/conversation_loop.py — upstream refactored inline system-prompt restore logic into _restore_or_build_system_prompt(); the PR has the inline version plus agent_id in the on_session_start hook call. Kept the upstream refactored helper (better logging/state tracking) and added the agent_id propagation to the helper itself.
  • hermes_cli/main.py — upstream added "bundles" to _BUILTIN_SUBCOMMANDS; the PR adds "agent". Kept both.
  • README.md — upstream updated "Runs anywhere" row text; the PR adds a "Multi-agent routing" feature row. Kept both rows, used upstream's "Seven terminal backends" count.

All 38 routing tests still pass (tests/gateway/test_agent_routing.py), and all 38 profile/override tests pass (tests/agent/test_profile_contextvar.py, tests/gateway/test_profile_overrides.py). Imports clean.

If it'd help, we're happy to:

  1. Push the rebased commits into this PR's branch (need write access), OR
  2. You pull the rebased branch into your fork and force-push to this PR, OR
  3. We open a follow-on PR with the api_server wiring + this rebased base combined, crediting you as co-author.

Happy to do whichever works for you. We'd like this architecture to land so we can build on it.

@02356abc

Copy link
Copy Markdown
Contributor Author

Thanks @davidgut1982 — option 3 works great for me. Go ahead and open the follow-on PR with the rebased base + your api_server.py wiring; happy to be credited as co-author.

The more the community can build on top of this architecture, the better. Would love to see more contributors involved — if anyone else has patches or ideas around the multi-agent surface (#25695#25698), now is a good time to jump in.

@azharkov78 — the cron duplication issue you raised (#25695 area) is real and worth fixing. If you want to submit a PR targeting the follow-on, that would be a great way to contribute.

davidgut1982 added a commit to davidgut1982/hermes-agent that referenced this pull request Jun 2, 2026
The OpenAI-compatible HTTP adapter was the one inbound surface from
PR NousResearch#25660 that never called ``_attach_agent_id`` — every
``/v1/chat/completions``, ``/v1/responses``, and ``/v1/runs`` request
fell through to ``default_agent`` regardless of the configured routes,
silently undermining the multi-agent guarantee on any deployment that
exposes the API server.

Add a single routing entry point, ``_resolve_agent_profile``, that:

  * Reads ``X-Hermes-Chat-Id`` / ``X-Hermes-User-Id`` / ``X-Hermes-Thread-Id``
    from the request (sanitised through the same length + control-char
    caps as the existing ``X-Hermes-Session-Id`` / ``X-Hermes-Session-Key``).
  * Builds a synthetic ``SessionSource(platform=API_SERVER, …)`` and
    pipes it through the shared ``_attach_agent_id`` hook so declarative
    routes *and* the ``select_agent`` plugin hook fire identically to
    every other adapter.
  * Looks up the resolved ``agent_id`` in
    ``self._gateway_ref._agent_registry`` and returns the matching
    ``AgentProfile`` (or ``None`` for legacy single-agent installs).

The three agent-invoking handlers (chat completions, responses, runs)
now resolve the profile up front and bind it via ``use_profile`` for
the duration of the run.  Binding happens twice — once on the asyncio
side and once inside the executor thread — because asyncio's default
executor does not propagate ContextVars.

Behaviour is fully backward compatible: requests with no routing
headers (the existing OpenAI-API contract) resolve to
``default_agent``, exactly the current behaviour.

New tests in ``tests/gateway/test_api_server_routing.py`` cover:

  * Header sanitisation (CRLF rejection, length caps, whitespace).
  * Route resolution: matching, no-header fall-through, unmatched
    header fall-through, ``platform``-only catch-all, ``user_id`` and
    ``thread_id`` routes, route-order precedence.
  * Resilience: missing gateway reference, empty registry.
  * ContextVar isolation under ``asyncio.gather`` so two concurrent
    HTTP requests with different chat_ids stay isolated.

Refs: PR NousResearch#25660 (single-gateway multi-agent).
davidgut1982 added a commit to davidgut1982/hermes-agent that referenced this pull request Jun 2, 2026
The OpenAI-compatible HTTP adapter was the one inbound surface from
PR NousResearch#25660 that never called ``_attach_agent_id`` — every
``/v1/chat/completions``, ``/v1/responses``, and ``/v1/runs`` request
fell through to ``default_agent`` regardless of the configured routes,
silently undermining the multi-agent guarantee on any deployment that
exposes the API server.

Add a single routing entry point, ``_resolve_agent_profile``, that:

  * Reads ``X-Hermes-Chat-Id`` / ``X-Hermes-User-Id`` / ``X-Hermes-Thread-Id``
    from the request (sanitised through the same length + control-char
    caps as the existing ``X-Hermes-Session-Id`` / ``X-Hermes-Session-Key``).
  * Builds a synthetic ``SessionSource(platform=API_SERVER, …)`` and
    pipes it through the shared ``_attach_agent_id`` hook so declarative
    routes *and* the ``select_agent`` plugin hook fire identically to
    every other adapter.
  * Looks up the resolved ``agent_id`` in
    ``self._gateway_ref._agent_registry`` and returns the matching
    ``AgentProfile`` (or ``None`` for legacy single-agent installs).

The three agent-invoking handlers (chat completions, responses, runs)
now resolve the profile up front and bind it via ``use_profile`` for
the duration of the run.  Binding happens twice — once on the asyncio
side and once inside the executor thread — because asyncio's default
executor does not propagate ContextVars.

Behaviour is fully backward compatible: requests with no routing
headers (the existing OpenAI-API contract) resolve to
``default_agent``, exactly the current behaviour.

New tests in ``tests/gateway/test_api_server_routing.py`` cover:

  * Header sanitisation (CRLF rejection, length caps, whitespace).
  * Route resolution: matching, no-header fall-through, unmatched
    header fall-through, ``platform``-only catch-all, ``user_id`` and
    ``thread_id`` routes, route-order precedence.
  * Resilience: missing gateway reference, empty registry.
  * ContextVar isolation under ``asyncio.gather`` so two concurrent
    HTTP requests with different chat_ids stay isolated.

Refs: PR NousResearch#25660 (single-gateway multi-agent).
davidgut1982 added a commit to davidgut1982/hermes-agent that referenced this pull request Jun 2, 2026
The OpenAI-compatible HTTP adapter was the one inbound surface from
PR NousResearch#25660 that never called ``_attach_agent_id`` — every
``/v1/chat/completions``, ``/v1/responses``, and ``/v1/runs`` request
fell through to ``default_agent`` regardless of the configured routes,
silently undermining the multi-agent guarantee on any deployment that
exposes the API server.

Add a single routing entry point, ``_resolve_agent_profile``, that:

  * Reads ``X-Hermes-Chat-Id`` / ``X-Hermes-User-Id`` / ``X-Hermes-Thread-Id``
    from the request (sanitised through the same length + control-char
    caps as the existing ``X-Hermes-Session-Id`` / ``X-Hermes-Session-Key``).
  * Builds a synthetic ``SessionSource(platform=API_SERVER, …)`` and
    pipes it through the shared ``_attach_agent_id`` hook so declarative
    routes *and* the ``select_agent`` plugin hook fire identically to
    every other adapter.
  * Looks up the resolved ``agent_id`` in
    ``self._gateway_ref._agent_registry`` and returns the matching
    ``AgentProfile`` (or ``None`` for legacy single-agent installs).

The three agent-invoking handlers (chat completions, responses, runs)
now resolve the profile up front and bind it via ``use_profile`` for
the duration of the run.  Binding happens twice — once on the asyncio
side and once inside the executor thread — because asyncio's default
executor does not propagate ContextVars.

Behaviour is fully backward compatible: requests with no routing
headers (the existing OpenAI-API contract) resolve to
``default_agent``, exactly the current behaviour.

New tests in ``tests/gateway/test_api_server_routing.py`` cover:

  * Header sanitisation (CRLF rejection, length caps, whitespace).
  * Route resolution: matching, no-header fall-through, unmatched
    header fall-through, ``platform``-only catch-all, ``user_id`` and
    ``thread_id`` routes, route-order precedence.
  * Resilience: missing gateway reference, empty registry.
  * ContextVar isolation under ``asyncio.gather`` so two concurrent
    HTTP requests with different chat_ids stay isolated.

Refs: PR NousResearch#25660 (single-gateway multi-agent).
davidgut1982 added a commit to davidgut1982/hermes-agent that referenced this pull request Jun 3, 2026
The OpenAI-compatible HTTP adapter was the one inbound surface from
PR NousResearch#25660 that never called ``_attach_agent_id`` — every
``/v1/chat/completions``, ``/v1/responses``, and ``/v1/runs`` request
fell through to ``default_agent`` regardless of the configured routes,
silently undermining the multi-agent guarantee on any deployment that
exposes the API server.

Add a single routing entry point, ``_resolve_agent_profile``, that:

  * Reads ``X-Hermes-Chat-Id`` / ``X-Hermes-User-Id`` / ``X-Hermes-Thread-Id``
    from the request (sanitised through the same length + control-char
    caps as the existing ``X-Hermes-Session-Id`` / ``X-Hermes-Session-Key``).
  * Builds a synthetic ``SessionSource(platform=API_SERVER, …)`` and
    pipes it through the shared ``_attach_agent_id`` hook so declarative
    routes *and* the ``select_agent`` plugin hook fire identically to
    every other adapter.
  * Looks up the resolved ``agent_id`` in
    ``self._gateway_ref._agent_registry`` and returns the matching
    ``AgentProfile`` (or ``None`` for legacy single-agent installs).

The three agent-invoking handlers (chat completions, responses, runs)
now resolve the profile up front and bind it via ``use_profile`` for
the duration of the run.  Binding happens twice — once on the asyncio
side and once inside the executor thread — because asyncio's default
executor does not propagate ContextVars.

Behaviour is fully backward compatible: requests with no routing
headers (the existing OpenAI-API contract) resolve to
``default_agent``, exactly the current behaviour.

New tests in ``tests/gateway/test_api_server_routing.py`` cover:

  * Header sanitisation (CRLF rejection, length caps, whitespace).
  * Route resolution: matching, no-header fall-through, unmatched
    header fall-through, ``platform``-only catch-all, ``user_id`` and
    ``thread_id`` routes, route-order precedence.
  * Resilience: missing gateway reference, empty registry.
  * ContextVar isolation under ``asyncio.gather`` so two concurrent
    HTTP requests with different chat_ids stay isolated.

Refs: PR NousResearch#25660 (single-gateway multi-agent).
davidgut1982 added a commit to davidgut1982/hermes-agent that referenced this pull request Jun 3, 2026
The OpenAI-compatible HTTP adapter was the one inbound surface from
PR NousResearch#25660 that never called ``_attach_agent_id`` — every
``/v1/chat/completions``, ``/v1/responses``, and ``/v1/runs`` request
fell through to ``default_agent`` regardless of the configured routes,
silently undermining the multi-agent guarantee on any deployment that
exposes the API server.

Add a single routing entry point, ``_resolve_agent_profile``, that:

  * Reads ``X-Hermes-Chat-Id`` / ``X-Hermes-User-Id`` / ``X-Hermes-Thread-Id``
    from the request (sanitised through the same length + control-char
    caps as the existing ``X-Hermes-Session-Id`` / ``X-Hermes-Session-Key``).
  * Builds a synthetic ``SessionSource(platform=API_SERVER, …)`` and
    pipes it through the shared ``_attach_agent_id`` hook so declarative
    routes *and* the ``select_agent`` plugin hook fire identically to
    every other adapter.
  * Looks up the resolved ``agent_id`` in
    ``self._gateway_ref._agent_registry`` and returns the matching
    ``AgentProfile`` (or ``None`` for legacy single-agent installs).

The three agent-invoking handlers (chat completions, responses, runs)
now resolve the profile up front and bind it via ``use_profile`` for
the duration of the run.  Binding happens twice — once on the asyncio
side and once inside the executor thread — because asyncio's default
executor does not propagate ContextVars.

Behaviour is fully backward compatible: requests with no routing
headers (the existing OpenAI-API contract) resolve to
``default_agent``, exactly the current behaviour.

New tests in ``tests/gateway/test_api_server_routing.py`` cover:

  * Header sanitisation (CRLF rejection, length caps, whitespace).
  * Route resolution: matching, no-header fall-through, unmatched
    header fall-through, ``platform``-only catch-all, ``user_id`` and
    ``thread_id`` routes, route-order precedence.
  * Resilience: missing gateway reference, empty registry.
  * ContextVar isolation under ``asyncio.gather`` so two concurrent
    HTTP requests with different chat_ids stay isolated.

Refs: PR NousResearch#25660 (single-gateway multi-agent).
@vdruts

vdruts commented Jun 4, 2026

Copy link
Copy Markdown

+1 — strongly in favor of this landing. Adding a real-world data point:

I've been running exactly this architecture in OpenClaw for months: a single gateway process hosting 8 agents, each with its own Telegram bot token, personality, model config, and isolated memory. One process polls all 8 bots, routes inbound by bot/chat, and operationally it's one daemon to install, watch, and restart instead of eight.

I've started building agents in Hermes and want to migrate fully — but the one-gateway-per-profile model is the blocker. Recreating my setup today means 8 separate gateway services, 8 restart paths, and 8 chances for the PID/launchd races already reported elsewhere in the tracker. That's a hard sell when the single-gateway model demonstrably works at this scale day-to-day.

The design here (per-agent profile + declarative routes, zero behavior change for existing single-agent installs) maps 1:1 to how I'd consolidate. Happy to test this MVP against a real 8-bot Telegram fleet if useful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder comp/cli CLI entry point, hermes_cli/, setup wizard comp/cron Cron scheduler and job management comp/gateway Gateway runner, session dispatch, delivery P2 Medium — degraded but workaround exists platform/discord Discord bot adapter platform/feishu Feishu / Lark adapter platform/matrix Matrix adapter (E2EE) platform/slack Slack app adapter platform/telegram Telegram bot adapter platform/wecom WeCom / WeChat Work adapter type/feature New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants