Design discussion: multi-profile deployments in a single gateway process
Summary
Today a hermes gateway run --profile <name> process is bound to exactly
one Hermes profile, which means one model, one system prompt, one MEMORY.md,
one set of skills/tools. For deployments where an operator wants to run
multiple agent personalities in the same Hermes installation — each backed
by a different model or prompt, talking to its own platform endpoint — the
canonical answer today is "run N gateway processes, one per profile."
This works (and we're doing it), but it scales linearly in operational cost
(N supervisors, N ports, N tunnels, N memory footprints). At small N it's
fine; past ~5–10 it starts to bite.
This issue isn't asking for a PR — it's asking whether multi-profile-per-gateway
is on the roadmap, what shape you'd want it to take if so, or whether
"N gateway processes" is the canonical answer indefinitely.
Use case
We're shipping a Hermes integration with Microsoft Agent 365 (Hermes-A365).
An A365 "blueprint" is Microsoft's term for an agent definition: an Entra
app + service principal + bot messaging endpoint + permissions, registered
in a tenant. Each blueprint has its own identity and its own purpose.
A realistic enterprise deployment looks like:
| Blueprint | Backing Hermes agent | Hermes profile config |
| --- | --- | --- |
| "Inbox Triage" | Sonnet 4.6, email-tool surface | profile: inbox-triage, port 3978 |
| "Calendar Concierge" | Sonnet 4.6, calendar-tool surface | profile: calendar-concierge, port 3979 |
| "Security Review Buddy" | Opus 4.7, file-system + code-search tools | profile: security-buddy, port 3980 |
| "Onboarding Helper" | Haiku 4.5, narrow read-only skills | profile: onboarding-helper, port 3981 |
Today this is 4 hermes gateway run processes. Each binds a port, each is
fronted by a tunnel (cloudflared) or reverse proxy, each is its own
process-supervisor entry.
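To make the linear cost concrete, here is a minimal sketch of what a supervisor has to launch today. Only the `hermes gateway run --profile <name>` command shape comes from this issue; the `PROFILES` list and the `gateway_commands` helper are hypothetical names used for illustration.

```python
# Profile names taken from the table above; everything else is illustrative.
PROFILES = [
    "inbox-triage",
    "calendar-concierge",
    "security-buddy",
    "onboarding-helper",
]


def gateway_commands(profiles):
    """Return one argv list per profile, i.e. one supervised process each."""
    return [["hermes", "gateway", "run", "--profile", name] for name in profiles]


# One line per process the operator has to supervise, expose, and tunnel.
for argv in gateway_commands(PROFILES):
    print(" ".join(argv))
```

Every new blueprint adds one more entry to this list, plus its own port and tunnel.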
Adapter-level multi-instance is already in Hermes
For reference: the framework already has adapters that handle multiple
instances of the same platform internally. gateway/platforms/slack.py:507
documents the multi-workspace pattern (comma-separated bot tokens, internal
routing); weixin.py keys state by account_id; matrix.py is similar.
That pattern works great for same personality, many endpoints — one
Slack adapter, one Hermes agent, N workspaces feeding into it.
But it doesn't help with different personalities, many endpoints — which
is the multi-blueprint A365 case above and (we suspect) a class of use cases
for other platforms too: agencies running differentiated bots for different
clients in the same Hermes install, MSPs running per-tenant bots, etc.
Three deployment shapes (today's space)
1. N gateway processes, one profile each: today's canonical path.
   Works at small N. Cost: N supervisor entries, N ports, N tunnels,
   N memory footprints.
2. One gateway, multi-adapter for one platform: adapter-internal
   multi-instance (Slack pattern). One Hermes agent backs all instances.
   Useful for "same personality, many endpoints" but not the multi-blueprint
   case above.
3. One gateway, multi-profile: what this issue is about. One process,
   N profiles co-resident, each with its own agent loop. Adapter selects the
   right profile based on incoming activity metadata (BF aaInstanceId,
   Slack workspace ID, etc.).
Path 3 is the one that isn't possible today as far as we can tell. If we've
missed an existing mechanism, please redirect.
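To make path 3 easier to discuss, here is a rough sketch of the routing we have in mind. None of these names exist in Hermes as far as we know: `InboundActivity`, `ProfileRouter`, and the `instance_id` field are hypothetical, and each agent loop is reduced to a plain callable. It is a starting point for the conversation, not a proposed implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class InboundActivity:
    """Hypothetical normalized inbound message."""
    platform: str     # e.g. "a365" or "slack"
    instance_id: str  # e.g. a BF aaInstanceId or a Slack workspace ID
    chat_id: str
    text: str


# Stand-in for a per-profile agent loop: takes an activity, returns a reply.
AgentLoop = Callable[[InboundActivity], str]


class ProfileRouter:
    """Maps (platform, instance_id) to one co-resident profile."""

    def __init__(self) -> None:
        self._routes: Dict[Tuple[str, str], str] = {}
        self._agents: Dict[str, AgentLoop] = {}

    def register(self, profile: str, agent: AgentLoop,
                 platform: str, instance_id: str) -> None:
        key = (platform, instance_id)
        if key in self._routes:
            raise ValueError(f"{key} is already routed to {self._routes[key]}")
        self._routes[key] = profile
        self._agents[profile] = agent

    def resolve(self, activity: InboundActivity) -> Optional[str]:
        """Return the profile name for this activity, or None if unrouted."""
        return self._routes.get((activity.platform, activity.instance_id))

    def dispatch(self, activity: InboundActivity) -> str:
        profile = self.resolve(activity)
        if profile is None:
            raise LookupError(f"no profile for {activity.platform}/{activity.instance_id}")
        return self._agents[profile](activity)


router = ProfileRouter()
router.register("inbox-triage", lambda a: "triage:" + a.text, "a365", "bp-inbox")
router.register("security-buddy", lambda a: "security:" + a.text, "a365", "bp-sec")

reply = router.dispatch(InboundActivity("a365", "bp-sec", "chat-1", "review this"))
print(reply)
```

The sketch rejects unrouted activities instead of falling back to a default profile. Whether a default should exist is one of the design questions below.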
What we're asking
- Is multi-profile-in-one-gateway on Hermes' roadmap, or is "N processes"
the canonical answer indefinitely?
- If on the roadmap: what shape would you want it to take? Some axes:
  - ~/.hermes/config.yaml representing multiple profiles inline, vs.
    one config file per profile (today's shape) with the gateway loading N
    of them.
  - Profile selection: keyed by inbound metadata, by adapter, by routing
    rule, or operator-pinned per platform?
  - Session-store / memory isolation: presumably already correct since
    SessionSource includes platform + chat_id, but worth confirming for
    cross-profile guarantees.
  - Plugin / hook scoping: do pre_tool_call hooks fire per-profile or
    globally? Backwards compatibility?
- If not on the roadmap: we'll lean further into path 1 + reverse proxy
fronting N bridges, and skip the design work. Useful to know.
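On the session-isolation axis, the sketch below shows the guarantee we would want confirmed. `SketchSessionSource` is a hypothetical stand-in, not Hermes' real SessionSource; the only detail taken from Hermes is that the key includes platform + chat_id. The `session_key` helper and its optional `profile` argument are our assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SketchSessionSource:
    """Hypothetical stand-in for a session source keyed by platform + chat_id."""
    platform: str
    chat_id: str


def session_key(source: SketchSessionSource,
                profile: Optional[str] = None) -> Tuple[str, ...]:
    """Build a session-store key, optionally namespaced by profile."""
    if profile is None:
        return (source.platform, source.chat_id)
    return (profile, source.platform, source.chat_id)


chat = SketchSessionSource(platform="slack", chat_id="C123")

# Without a profile component, two co-resident profiles that see the same
# chat would read and write the same session entry.
shared = session_key(chat) == session_key(chat)

# With a profile component, the same chat maps to distinct entries.
isolated = session_key(chat, "inbox-triage") != session_key(chat, "security-buddy")

print(shared, isolated)
```

If distinct blueprints can never share a platform + chat_id pair, the existing key may already be enough. The sketch only covers the case where they can.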
Why we're asking now (not later)
We just shipped v0.1.x of Hermes-A365 to PyPI. The wrapper's register → publish → cleanup loop works end-to-end against a live tenant from a fresh
pip install. The 1-blueprint case is fully supported.
Path 1's operational tax starts to land with the first operator who runs a
5+ blueprint deployment. We'd rather have the design conversation
asynchronously now than scramble when that operator surfaces. We're not
asking for a PR or commitment to ship — just a read on whether this is a
direction Hermes wants to grow in, and what shape would be acceptable if
we eventually contribute it.
Happy to spec further or open a discussion in whatever forum makes most
sense (issue, discussion, RFC doc).
— from the Hermes-A365 maintainers