Route messages to per-profile workers behind one bot token by banditburai · Pull Request #36872 · NousResearch/hermes-agent

banditburai · 2026-06-01T15:39:18Z

Summary

A single bot token exposes exactly one Hermes profile today. Serving several isolated identities (SOUL, memory, skills, sessions, model) requires one process and one credential per profile, for two reasons. First, the token is single-consumer: Telegram getUpdates allows one consumer, and Discord allows one gateway WS per token. Second, HERMES_HOME and several import-frozen module constants (auth paths, skills/hooks dirs) are process-global, so multiple profiles cannot share one process without cross-profile state bleed.

This routes each inbound message to a target profile by context (channel / topic / thread / guild / DM / @mention). Routing is platform-agnostic: it runs in the shared adapter seam, so it applies to every gateway platform (Telegram, Discord, Slack, Feishu, Matrix, …), not just one. A transport-owning front process holds the token and runs one tokenless worker (hermes -p <name> gateway run, api_server only) per routed profile; isolation comes from each worker being a separate process with its own HERMES_HOME. Unrouted messages run in-process on the host profile, unchanged.

Change

Route resolver — resolve_profile_route (gateway/routing.py:50) scores each route via _route_score (gateway/routing.py:13): thread_id(8) > chat_id/channel_id(4) > guild_id/user_id(2) > platform(1); channel_id is an alias for chat_id; a thread with no own route matches its parent_chat_id. No match → None (host).
Config — _validate_profile_routing (gateway/config.py:454) rejects an unknown/invalid/duplicate-signature route at load; an absent table leaves profile_routing=None.
Session keys — build_session_key takes a profile arg (gateway/session.py:600); prefix = f"agent:{profile}" (gateway/session.py:629). The default "main" reproduces the existing agent:main:… keys byte-for-byte.
Tier-1 overlays — resolve_channel_model (gateway/platforms/base.py:1599) resolves a per-channel model from any platform's channel_models config (a session /model override still wins). Telegram additionally reads per-topic model/system_prompt from its group_topics/dm_topics config, scoped by chat_id so they do not collide across groups reusing a thread_id; other platforms express the same intent through channel_models/channel_prompts keyed by channel id.
Routing seam (platform-agnostic) — resolution runs in the shared base.handle_message (gateway/platforms/base.py:3628), keyed on SessionSource fields (platform/chat_id/channel_id/thread_id/guild_id/user_id), so every adapter reaching that seam routes with no per-platform code — Telegram, Slack, Feishu, Matrix, WhatsApp, Signal, and the Discord plugin among them; a platform routes on whichever of those fields it populates. On resolution error it falls back to host (no isolation boundary crossed yet). _maybe_dispatch_routed (gateway/run.py:8829) returns True on every matched path — rate-limited, success, and error (which posts a visible message) — and False only when unrouted.
Worker process — _worker_run_args_for_profile (hermes_cli/gateway.py:626) sets HERMES_GATEWAY_ONLY_PLATFORMS=api_server and never --replace; _only_platforms_filter (gateway/run.py:1558) skips token adapters in a worker so the front keeps the single token.
Pool — WorkerPool.acquire (gateway/worker_pool.py:135) spawns/probes/reuses; _default_interlock (gateway/worker_pool.py:79) refuses to spawn while get_running_pid(<profile>/gateway.pid) is live; idle-evict, crash circuit-break with cooldown, _maintain_worker_pool reap/sweep (gateway/run.py:8692), and shutdown() teardown on stop.
Dispatch — WorkerClient.dispatch (gateway/worker_client.py:49) posts /v1/runs, relays message.delta/response.media, and resolves approval.request; continue_session loads the worker's own transcript via _continue_session_id (gateway/platforms/api_server.py:1334), off by default.
Rate limit — ProfileRateLimiter (gateway/rate_limit.py:32) is a per-profile token bucket checked before dispatch; /new and /reset bypass it.
Media — files cross the front↔worker boundary as media_ref tokens, never a path; confine_to_safe_root (gateway/media_spool.py:34) confines resolution to the spool root.
Manual control — parse_profile_mention (gateway/chat_bindings.py:21) splits a leading @<name> <body> and routes the turn without mutating the persisted /profile binding.
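
The specificity weights in the route resolver item can be sketched as follows. This is an illustration only, not the code in gateway/routing.py: the route dict shape, the names `score_route` and `resolve_route`, and the additive scoring are assumptions. Only the weights, the channel_id alias, the parent_chat_id fallback, and the "no match means None (host)" result come from the description above.

```python
from typing import Optional

# Weights as listed in the PR description; everything else is hypothetical.
WEIGHTS = {"thread_id": 8, "chat_id": 4, "guild_id": 2, "user_id": 2, "platform": 1}


def score_route(match: dict, source: dict) -> Optional[int]:
    """Return a specificity score, or None if any constraint fails."""
    match = dict(match)
    # channel_id is treated as an alias for chat_id.
    if "channel_id" in match:
        match.setdefault("chat_id", match.pop("channel_id"))
    score = 0
    for field, want in match.items():
        if field not in WEIGHTS:
            return None  # unknown field: non-matching in this sketch
        have = source.get(field)
        if field == "chat_id" and have != want:
            # A thread with no route of its own may match via its parent chat.
            have = source.get("parent_chat_id")
        if have != want:
            return None
        score += WEIGHTS[field]
    return score


def resolve_route(routes: list, source: dict) -> Optional[str]:
    """Pick the highest-scoring route's profile; None means the host profile."""
    best_profile, best_score = None, 0
    for route in routes:
        if not isinstance(route, dict):
            continue  # non-dict entries are ignored
        s = score_route(route.get("match") or {}, source)
        if s is not None and s > best_score:
            best_profile, best_score = route.get("profile"), s
    return best_profile
```

With a thread route, a channel route, and a guild route all matching one message, this sketch picks the thread route; a thread that has no route of its own falls back to the route of its parent channel.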

Unchanged: unrouted traffic, default-profile session keys, and existing /v1/runs clients (continue_session defaults off).
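
The byte-identical default can be illustrated with a small sketch. The function name `sketch_session_key` and the key layout after the prefix (platform, chat id, optional thread id) are assumptions made for illustration; only the `agent:{profile}` prefix and the `"main"` default come from the description above.

```python
from typing import Optional


def sketch_session_key(
    platform: str,
    chat_id: str,
    thread_id: Optional[str] = None,
    profile: str = "main",  # default keeps the pre-existing agent:main:... prefix
) -> str:
    """Build a profile-namespaced session key (hypothetical layout)."""
    prefix = f"agent:{profile}"
    parts = [prefix, platform, chat_id]
    if thread_id:
        parts.append(thread_id)
    return ":".join(parts)
```

Because the profile only changes the prefix, callers that never pass `profile` get the same keys as before, while two profiles serving the same chat get distinct keys.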

Behavior

| Input | Resolved by | Outcome |
| --- | --- | --- |
| `@<name>` + existing profile | `parse_profile_mention` → `profile_exists` | this turn routes to worker `<name>`; binding unchanged |
| persisted `/profile` binding | `ChatBindings` | routes to the bound worker |
| config route match | `resolve_profile_route` | routes to `route.profile` |
| no match, `default: null` | `resolve_profile_route` → `None` | in-process host profile |
| resolution error | `handle_message` fallback | host profile (pre-isolation, fail-soft) |
| dispatch error after a match | `_maybe_dispatch_routed` | visible error to the user; no host fallback |
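
The first four rows of the table can be read as a precedence chain. The sketch below is a rough illustration under assumptions: the regex, the names `parse_mention` and `pick_profile`, and the behavior for a mention of an unknown profile (fall through with the text untouched) are hypothetical, not taken from gateway/chat_bindings.py.

```python
import re
from typing import Callable, Optional, Tuple

# Hypothetical pattern: a leading @<name>, whitespace, then a non-empty body.
_MENTION = re.compile(r"^@([A-Za-z0-9_-]+)\s+(.+)$", re.DOTALL)


def parse_mention(text: str) -> Optional[Tuple[str, str]]:
    """Split '@name body' into (name, body); None if there is no leading mention."""
    m = _MENTION.match(text.strip())
    return (m.group(1), m.group(2)) if m else None


def pick_profile(
    text: str,
    bound_profile: Optional[str],
    routed_profile: Optional[str],
    profile_exists: Callable[[str], bool],
) -> Tuple[Optional[str], str]:
    """Return (profile, body). A profile of None means the in-process host."""
    mention = parse_mention(text)
    if mention and profile_exists(mention[0]):
        return mention[0], mention[1]  # per-turn override; binding untouched
    if bound_profile:
        return bound_profile, text  # persisted /profile binding
    return routed_profile, text  # config route match, or None for host
```

Note that the mention branch only reads its inputs, which is one way to keep a per-turn override from mutating the persisted binding.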

Testing

uv run --extra dev pytest tests/gateway/test_{profile_routing,profile_routing_config,session_key_profile,worker_arg_builder,only_platforms_env,worker_pool,worker_client,routed_dispatch,profile_rate_limit,routed_reset_scope,routed_dm_and_thread_rules,channel_model,channel_model_routing,config_channel_models_bridge,telegram_topic_overlay,telegram_dm_topic_prompt,media_refs_inbound,response_media_outbound,media_auth_and_cleanup,profile_command,at_mention_override,worker_session_continuity,routed_worker_live}.py -q → 117 passed.

Resolution: test_profile_routing.py::test_exact_thread_wins_over_channel, ::test_thread_inherits_parent_channel, ::test_channel_beats_guild, ::test_non_dict_routes_ignored.
Byte-identical default keys: test_session_key_profile.py::test_default_profile_keys_unchanged.
Routing seam: test_routed_dispatch.py::test_unrouted_returns_false (unrouted → host), ::test_dispatch_failure_is_fail_closed (a dispatch error returns True and posts a visible message).
Pool: test_worker_pool.py::test_standalone_pid_refuses_spawn (interlock), ::test_lazy_respawn_records_crash_without_reap, ::test_circuit_breaker_recovers_after_cooldown, ::test_acquire_after_shutdown_is_refused.
Rate limit: test_profile_rate_limit.py::test_throttled_profile_is_not_dispatched, ::test_control_commands_bypass_rate_limit.
Continuity opt-in stays off for existing clients: test_worker_session_continuity.py::test_off_by_default_so_existing_clients_stay_stateless.
Manual control: test_at_mention_override.py::test_email_like_text_is_not_a_mention, ::test_mention_does_not_mutate_binding.
DM/thread isolation: test_routed_dm_and_thread_rules.py::test_distinct_dm_peers_get_distinct_keys_under_same_profile, ::test_thread_without_own_route_inherits_parent_channel_profile.
Real-socket front↔worker round-trip — test_routed_worker_live.py drives the real WorkerClient against a real APIServerAdapter on a loopback TestServer: ::test_real_socket_dispatch_roundtrip (POST /v1/runs → SSE → run.completed), ::test_real_socket_continue_session_rehydrates (worker reloads its transcript from state.db), ::test_real_socket_outbound_media_emitted (MEDIA: → response.media, tag-free text), ::test_real_socket_reset_unknown_session_is_idempotent.

The model turn is stubbed (no live LLM call in CI) and the worker runs in-process rather than as a spawned subprocess; the spawn argv/env (test_worker_arg_builder.py) and route resolution (test_profile_routing.py) are covered separately.

Limitations

A routed reply is delivered once the turn completes; deltas are not streamed (_FrontConsumer.on_delta is a no-op).
A routed turn that requests tool approval is denied and the user is notified; interactive approval is not forwarded across the boundary.
Media resolution uses a shared spool directory, so the front and worker must share a filesystem; the wire payload is metadata-only.
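
Since both processes resolve media against a shared spool directory, the confinement check matters. The following is a minimal sketch of that kind of check, assuming a ref maps to a relative name under the spool root; the function name `confine_ref` and its signature are hypothetical and are not the confine_to_safe_root implementation.

```python
from pathlib import Path


def confine_ref(root: Path, ref: str) -> Path:
    """Resolve ref under root; raise ValueError if the result escapes root."""
    root = root.resolve()
    # resolve() collapses '..' segments and follows symlinks before the check.
    candidate = (root / ref).resolve()
    if not candidate.is_relative_to(root):  # Python 3.9+
        raise ValueError(f"media ref escapes spool root: {ref!r}")
    return candidate
```

Resolving before comparing is the important design choice here: a plain string-prefix check on the unresolved path would accept `a/../../b`.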

Related

#18510 (feat(telegram): route forum topics to Hermes profiles) routes Telegram forum topics to profiles by resolving the target profile's home/config in-process. This change routes through separate worker processes instead; the media-ref path confinement (confine_to_safe_root, gateway/media_spool.py:34) adapts the safe_root helper from that PR.
#24914 (feat: One gateway for multiple agent profiles) exposes multiple profiles from one gateway via a per-turn HERMES_HOME ContextVar swap, with /profile and @<name> controls. This change keeps those controls (gateway/chat_bindings.py:21) and isolates via worker processes.
#23735 (Design discussion: multi-profile deployments in a single gateway process) is the design discussion for one-gateway multi-profile deployments.
#13633 ([Feature]: Feishu: Support per-group Profile routing with shared Skills) asks for per-group routing on Feishu together with a skill library shared across profiles. This change provides the routing part through the shared base.handle_message seam (gateway/platforms/feishu.py:2876); the shared-library part is served by the existing skills.external_dirs config and is not part of this change.

Closes #4321
Closes #4622
Closes #5195
Closes #8339
Closes #9514
Closes #10143
Closes #18423
Closes #19809
Closes #24913
Addresses #13633
Addresses #7517
Supersedes #18510
Supersedes #24914
Refs #23735

…f-heal + closed-pool race guard

The pool's evict/circuit-break machinery only ran on the next inbound message. Record a crash on the on-demand respawn path in acquire() (a dead SERVING/PROBING worker is counted before respawn) so the breaker trips even when no periodic reap observed the death. _broken becomes a per-profile cooldown map (was a permanent set) that self-heals after broken_cooldown. shutdown() now latches _closed under the lock before snapshotting so an acquire() racing shutdown can't spawn an orphan child.
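
The move from a permanent set to a per-profile cooldown map can be sketched like this. It is a simplified illustration, not the WorkerPool code: the class name `CrashBreaker`, the `max_crashes` threshold, and the injected clock are assumptions; only the `_broken` map and the `broken_cooldown` name come from the commit message.

```python
import time
from typing import Callable, Dict


class CrashBreaker:
    """Per-profile circuit breaker that self-heals after a cooldown."""

    def __init__(
        self,
        max_crashes: int = 3,           # hypothetical threshold
        broken_cooldown: float = 60.0,  # seconds before the breaker resets
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_crashes
        self._cooldown = broken_cooldown
        self._clock = clock
        self._crashes: Dict[str, int] = {}
        self._broken: Dict[str, float] = {}  # profile -> time the breaker tripped

    def record_crash(self, profile: str) -> None:
        count = self._crashes.get(profile, 0) + 1
        self._crashes[profile] = count
        if count >= self._max:
            self._broken[profile] = self._clock()

    def is_broken(self, profile: str) -> bool:
        tripped_at = self._broken.get(profile)
        if tripped_at is None:
            return False
        if self._clock() - tripped_at >= self._cooldown:
            # Cooldown elapsed: forget the trip and the crash count.
            del self._broken[profile]
            self._crashes[profile] = 0
            return False
        return True
```

Storing the trip time rather than a bare membership flag is what lets the breaker recover without an operator restart.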

…set idempotency, control-cmd rate-limit, approval notice

- Wire pool upkeep: _maintain_worker_pool (reap+sweep) runs each _session_expiry_watcher tick; _worker_pool.shutdown() in _stop_impl so routed workers don't leak as orphans on gateway stop/restart.
- Session continuity: routed turns were amnesiac (conversation_loop never auto-loads by session_id). Add opt-in continue_session: worker_client sends the flag, /v1/runs rehydrates the worker's own transcript via _continue_session_id. OFF by default => existing /v1/runs clients byte-identical.
- Reset idempotency: /new or /reset before the first routed turn 404'd; reset_session + _aiohttp_delete now treat 404 as success.
- Rate limit: /new and /reset carry no model cost; exempt them from the per-profile token bucket so a reset can't be throttled.
- Approvals: a routed turn needing approval was silently auto-denied; _dispatch_to_worker now denies WITH a visible user notice.

…on, media, reset)

Wire the real WorkerClient to a real APIServerAdapter on an aiohttp TestServer (a real listening socket), stubbing only the model turn, to exercise the actual POST /v1/runs -> SSE relay -> run.completed path, continue_session transcript rehydration from state.db, outbound MEDIA: -> response.media, and idempotent reset over the wire.

The Tier-1 channel_models overlay runs on every _run_agent call, including lean runners (cron/codex paths) that build GatewayRunner without setting self.config. Read it via getattr so a missing config skips the overlay instead of raising AttributeError (regressed tests/cron/test_codex_execution_paths.py).
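
The getattr guard described here follows a common pattern. A minimal sketch, assuming a runner object that may or may not carry a `config` attribute; `LeanRunner` and `channel_models_for` are hypothetical names, not the GatewayRunner code:

```python
from types import SimpleNamespace


class LeanRunner:
    """Stand-in for a runner constructed without self.config (hypothetical)."""


def channel_models_for(runner: object) -> dict:
    """Return the channel_models mapping, or {} when config is missing."""
    config = getattr(runner, "config", None)  # no AttributeError on lean runners
    if config is None:
        return {}
    return getattr(config, "channel_models", None) or {}


# A fully configured runner yields its mapping; a lean one skips the overlay.
full = SimpleNamespace(config=SimpleNamespace(channel_models={"c1": "model-a"}))
print(channel_models_for(full))
print(channel_models_for(LeanRunner()))
```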

tonydwb

Code Review Summary

Verdict: Approved

✅ Looks Good

Clean, well-architected feature implementing per-profile worker routing behind one bot token
Strong isolation model: each profile runs in its own hermes gateway run worker with only api_server, sharing zero memory with the front
WorkerPool with proper lifecycle states (SPAWNING → PROBING → SERVING → DRAINING → REAPED/UNHEALTHY), idle eviction, and crash circuit breaker
WorkerClient uses HTTP+SSE over existing /v1/runs API — no new transport needed
MediaSpool with claim-check refs: front mints refs for inbound media, worker materializes them — clean cross-process boundary
ProfileRateLimiter (token bucket) prevents one chatty profile from starving others
ChatBindings persisted in JSON file under sessions dir — survives restarts
Route resolution precedence: @mention > /profile binding > config routing — clear and composable
session_key changes from agent:main:... to agent:<profile>:... — namespaced correctly per profile
Fail-closed: routed message errors are reported to the user and never silently fall back to host profile
Channel model routing via channel_models and resolve_channel_model — scoped by chat_id, collision-free
.gitignore covers the media spool and bindings files, but verify that these entries are actually included

💡 Suggestions

The approval handler in _dispatch_to_worker currently fails closed with auto-deny — consider wiring interactive approval relay as a follow-up so routed profiles can use approval-gated tools
ProfileRateLimiter defaults (capacity=20, refill=1/sec) should be documented as configurable in production
The confine_to_safe_root path escape guard in media_spool.py is good — consider adding a test for path traversal attempts
For a PR of this scale (2846 additions), consider splitting into smaller reviewable chunks in future (e.g., media spool separately, worker pool separately)
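
To make the rate-limiter suggestion concrete, here is a minimal per-profile token bucket sketch using the defaults mentioned above (capacity=20, refill=1/sec) as constructor parameters. The class names, the `allow` method, and the command parsing are assumptions for illustration, not the ProfileRateLimiter API; the /new and /reset bypass follows the PR description.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Classic token bucket: refills continuously up to a fixed capacity."""

    def __init__(self, capacity: float, refill_per_sec: float, clock: Callable[[], float]) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self._clock = clock
        self._tokens = capacity
        self._last = clock()

    def try_acquire(self, cost: float = 1.0) -> bool:
        now = self._clock()
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_sec)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


class PerProfileLimiter:
    """One bucket per profile; control commands bypass the limit."""

    BYPASS = ("/new", "/reset")

    def __init__(
        self,
        capacity: float = 20.0,
        refill_per_sec: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._capacity = capacity
        self._refill = refill_per_sec
        self._clock = clock
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, profile: str, text: str) -> bool:
        words = text.split()
        if words and words[0] in self.BYPASS:
            return True  # resets carry no model cost, so they are never throttled
        bucket = self._buckets.get(profile)
        if bucket is None:
            bucket = TokenBucket(self._capacity, self._refill, self._clock)
            self._buckets[profile] = bucket
        return bucket.try_acquire()
```

Exposing capacity and refill rate as constructor arguments, as here, is one simple way to make them configurable per deployment.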

Reviewed by Hermes Agent

banditburai added 24 commits June 1, 2026 11:36

40a3b9f C1: add resolve_channel_model per-route model resolver
659b393 C1: bridge channel_models for all platforms; channel_skill_bindings for Telegram
db394fd C1: apply per-route channel_models override in runtime resolution (session /model still wins)
77ec864 C1: Telegram group_topics per-topic model + unified skill parity
d8b6efe C1: per-topic system_prompt overlay applies to dm_topics and group_topics (NousResearch#5195)
e641987 C2: add gateway.routing.resolve_profile_route (platform-agnostic, exact-wins, thread→parent inheritance)
c321896 C2: load + fail-closed validate gateway.profile_routing
23751fc C2: parameterize build_session_key profile prefix (default 'main' = identical keys)
6d9a879 C2: worker arg builder (no --replace) + HERMES_GATEWAY_ONLY_PLATFORMS adapter filter
23220ea C2: front-owned worker pool (spawn/probe/idle-evict/respawn + one-loop interlock)
35ab4e9 C2: worker_client - POST /v1/runs, relay SSE deltas + approval round-trip
cb0651d C2: route inbound messages to per-profile workers (host path unchanged when unrouted; fail-closed dispatch guard)
aca5251 C2: per-profile token-bucket rate limit before dispatch (NousResearch#9514)
b184e70 C2: scope /new and /reset to the routed worker session (NousResearch#18423)
18a782c C2: regression guards for per-peer DM keys + thread→parent routing
943c8ee C3: inbound media_refs claim-check on /v1/runs
4f83947 C3: outbound response.media SSE + front upload via existing send_* (NousResearch#4321 NousResearch#10143 NousResearch#13633)
856ec31 C3: media-ref auth + cleanup + safe_root confinement (harvest NousResearch#18510; credit ayoahha, Donmeusi)
28cdd64 C4: /profile command - list + persisted per-chat binding
56b7318 C4: @mention per-turn profile override (binding untouched)
cf81c88 C5: media_spool - drop dead sub-clause in kind_for
    `is_voice or (ext in _AUDIO_EXTS and is_voice)` is just `is_voice`; behavior unchanged (audio still maps to voice on the next branch).

alt-glitch added the type/feature, P3, comp/gateway, and platform/telegram labels Jun 1, 2026

tonydwb approved these changes Jun 1, 2026

banditburai wants to merge 25 commits into NousResearch:main from banditburai:feat/profile-routing
