Design discussion: multi-profile deployments in a single gateway process #23735

@satscryption

Summary

Today a hermes gateway run --profile <name> process is bound to exactly
one Hermes profile, which means one model, one system prompt, one MEMORY.md,
one set of skills/tools. For deployments where an operator wants to run
multiple agent personalities in the same Hermes installation — each backed
by a different model or prompt, talking to its own platform endpoint — the
canonical answer today is "run N gateway processes, one per profile."

This works (and we're doing it), but it scales linearly in operational cost
(N supervisors, N ports, N tunnels, N memory footprints). At small N it's
fine; past ~5–10 it starts to bite.

This issue isn't asking for a PR — it's asking whether multi-profile-per-gateway
is on the roadmap, what shape you'd want it to take if so, or whether
"N gateway processes" is the canonical answer indefinitely.

Use case

We're shipping a Hermes integration with Microsoft Agent 365 (Hermes-A365).
An A365 "blueprint" is Microsoft's term for an agent definition: an Entra
app + service principal + bot messaging endpoint + permissions, registered
in a tenant. Each blueprint has its own identity and its own purpose.

A realistic enterprise deployment looks like:

| Blueprint | Backing Hermes agent | Hermes profile config |
|---|---|---|
| "Inbox Triage" | Sonnet 4.6, email-tool surface | profile: inbox-triage, port 3978 |
| "Calendar Concierge" | Sonnet 4.6, calendar-tool surface | profile: calendar-concierge, port 3979 |
| "Security Review Buddy" | Opus 4.7, file-system + code-search tools | profile: security-buddy, port 3980 |
| "Onboarding Helper" | Haiku 4.5, narrow read-only skills | profile: onboarding-helper, port 3981 |

Today this is 4 hermes gateway run processes. Each binds a port, each is
fronted by a tunnel (cloudflared) or reverse proxy, each is its own
process-supervisor entry.
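
For concreteness, path 1 for the table above amounts to one supervised command per profile. The following is a minimal sketch that assumes only the `hermes gateway run --profile <name>` invocation quoted in the Summary. The `PROFILES` mapping and the `gateway_commands` helper are illustrative names, and the ports are assumed to live in each profile's own config rather than being passed as a CLI flag.

```python
import shlex

# Profile name -> port, copied from the table above. The port is assumed to be
# set in each profile's config; no port flag is passed on the command line.
PROFILES = {
    "inbox-triage": 3978,
    "calendar-concierge": 3979,
    "security-buddy": 3980,
    "onboarding-helper": 3981,
}


def gateway_commands(profiles):
    """Return one command line per profile, ready for a process supervisor."""
    return [
        shlex.join(["hermes", "gateway", "run", "--profile", name])
        for name in profiles
    ]


if __name__ == "__main__":
    for cmd in gateway_commands(PROFILES):
        print(cmd)
```

Every entry added to `PROFILES` adds one more supervisor entry, port, and tunnel, which is the linear cost described in the Summary.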

Adapter-level multi-instance is already in Hermes

For reference: the framework already has adapters that handle multiple
instances of the same platform internally. gateway/platforms/slack.py:507
documents the multi-workspace pattern (comma-separated bot tokens, internal
routing); weixin.py keys state by account_id; matrix.py does something similar.

That pattern works great for same personality, many endpoints — one
Slack adapter, one Hermes agent, N workspaces feeding into it.

But it doesn't help with different personalities, many endpoints — which
is the multi-blueprint A365 case above and (we suspect) a class of use cases
for other platforms too: agencies running differentiated bots for different
clients in the same Hermes install, MSPs running per-tenant bots, etc.

Three deployment shapes (today's space)

  1. N gateway processes, one profile each — today's canonical path.
    Works at small N. Cost: N supervisor entries, N ports, N tunnels,
    N memory footprints.

  2. One gateway, multi-adapter for one platform — adapter-internal
    multi-instance (Slack pattern). One Hermes agent backs all instances.
    Useful for "same personality, many endpoints" but not the multi-blueprint
    case above.

  3. One gateway, multi-profile — what this issue is about. One process,
    N profiles co-resident, each with its own agent loop. Adapter selects the
    right profile based on incoming activity metadata (BF aaInstanceId,
    Slack workspace ID, etc.).
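
To make path 3 concrete, here is a rough sketch of what profile selection keyed by inbound metadata could look like. Nothing here is existing Hermes API: `InboundActivity`, `ProfileRouter`, the `"a365"` platform name, and the instance IDs are all hypothetical, chosen only to illustrate the routing step.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class InboundActivity:
    """Hypothetical normalized inbound message handed over by an adapter."""
    platform: str      # e.g. "a365" or "slack" (illustrative names)
    instance_id: str   # e.g. a BF aaInstanceId or a Slack workspace ID
    chat_id: str
    text: str


@dataclass
class ProfileRouter:
    """Maps (platform, instance_id) to the profile that should handle it."""
    routes: Dict[Tuple[str, str], str] = field(default_factory=dict)
    default_profile: Optional[str] = None

    def register(self, platform: str, instance_id: str, profile: str) -> None:
        key = (platform, instance_id)
        existing = self.routes.get(key)
        if existing is not None and existing != profile:
            # Refuse silent re-routing: one endpoint maps to one profile.
            raise ValueError(f"{key} is already routed to {existing!r}")
        self.routes[key] = profile

    def resolve(self, activity: InboundActivity) -> str:
        key = (activity.platform, activity.instance_id)
        profile = self.routes.get(key, self.default_profile)
        if profile is None:
            raise LookupError(f"no profile registered for {key}")
        return profile


router = ProfileRouter()
router.register("a365", "bp-inbox", "inbox-triage")
router.register("a365", "bp-security", "security-buddy")

activity = InboundActivity("a365", "bp-security", "chat-1", "review this diff")
print(router.resolve(activity))
```

Failing loudly on an unknown endpoint, instead of falling back to some arbitrary profile, is a deliberate choice in this sketch: a misrouted message would otherwise reach an agent with the wrong model and tool surface.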

Path 3 is the one that isn't possible today as far as we can tell. If we've
missed an existing mechanism, please redirect.

What we're asking

  • Is multi-profile-in-one-gateway on Hermes' roadmap, or is "N processes"
    the canonical answer indefinitely?
  • If on the roadmap: what shape would you want it to take? Some axes:
    • ~/.hermes/config.yaml representing multiple profiles inline, vs.
      one config file per profile (today's shape) with the gateway loading N
      of them.
    • Profile selection: keyed by inbound metadata, by adapter, by routing
      rule, or operator-pinned per platform?
    • Session-store / memory isolation: presumably already correct since
      SessionSource includes platform + chat_id, but worth confirming for
      cross-profile guarantees.
    • Plugin / hook scoping: do pre_tool_call hooks fire per-profile or
      globally? Backwards compatibility?
  • If not on the roadmap: we'll lean further into path 1 + reverse proxy
    fronting N bridges, and skip the design work. Useful to know.
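
One way to probe the session-isolation question from the list above is the sketch below. It assumes only what that bullet states, namely that `SessionSource` carries `platform` and `chat_id`; the dataclass is a stand-in rather than the real Hermes class, and `session_key` and `scoped_session_key` are hypothetical helpers. Whether two profiles can actually see the same `platform` + `chat_id` pair in practice is an open question, not something this sketch establishes.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class SessionSource:
    """Stand-in for Hermes' SessionSource; only the two fields named above."""
    platform: str
    chat_id: str


def session_key(source: SessionSource) -> Tuple[str, str]:
    """Key without any profile component (assumed current shape)."""
    return (source.platform, source.chat_id)


def scoped_session_key(profile: str, source: SessionSource) -> Tuple[str, str, str]:
    """Key that additionally namespaces the session by profile."""
    return (profile, source.platform, source.chat_id)


src = SessionSource(platform="a365", chat_id="chat-42")

# Unscoped: two co-resident profiles writing for the same source share a slot.
unscoped: Dict[Tuple[str, str], str] = {}
unscoped[session_key(src)] = "inbox-triage history"
unscoped[session_key(src)] = "calendar-concierge history"  # overwrites

# Scoped: each profile keeps its own slot for the same source.
scoped: Dict[Tuple[str, str, str], str] = {}
scoped[scoped_session_key("inbox-triage", src)] = "inbox-triage history"
scoped[scoped_session_key("calendar-concierge", src)] = "calendar-concierge history"

print(len(unscoped), len(scoped))
```

If `chat_id` values turn out to be unique per endpoint, the unscoped key may already be sufficient; the profile component only matters when two profiles can observe the same pair.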

Why we're asking now (not later)

We just shipped v0.1.x of Hermes-A365 to PyPI. The wrapper's
register → publish → cleanup loop works end-to-end against a live tenant
from a fresh pip install. The 1-blueprint case is fully supported.

Path 1's operational tax will start to land with the first operator who
runs a 5+ blueprint deployment. We'd rather have the design conversation
asynchronously now than scramble when that operator surfaces. We're not
asking for a PR or commitment to ship — just a read on whether this is a
direction Hermes wants to grow in, and what shape would be acceptable if
we eventually contribute it.

Happy to spec further or open a discussion in whatever forum makes most
sense (issue, discussion, RFC doc).

— from the Hermes-A365 maintainers

Labels: P3 (Low: cosmetic, nice to have); area/config (Config system, migrations, profiles); comp/gateway (Gateway runner, session dispatch, delivery); type/feature (New feature or request)
