Skip to content

feat(api-server): X-Hermes-User-* / Chat-* headers for multi-user identity#24423

Open
gsskk wants to merge 3 commits into
NousResearch:mainfrom
gsskk:feat/api-server-user-identity-headers
Open

feat(api-server): X-Hermes-User-* / Chat-* headers for multi-user identity#24423
gsskk wants to merge 3 commits into
NousResearch:mainfrom
gsskk:feat/api-server-user-identity-headers

Conversation

@gsskk

@gsskk gsskk commented May 12, 2026

Copy link
Copy Markdown

Summary

Adds 6 optional headers to api_server mirroring AIAgent.__init__'s existing identity kwargs, so a single Hermes process can serve multiple end-users distinguished per-request:

Header AIAgent kwarg
X-Hermes-User-Id user_id
X-Hermes-User-Name user_name
X-Hermes-Chat-Id chat_id
X-Hermes-Chat-Name chat_name
X-Hermes-Chat-Type chat_type
X-Hermes-Thread-Id thread_id

Also supports OpenAI's standard top-level user body field as a fallback for X-Hermes-User-Id (header wins if both present), so vanilla OpenAI SDK clients work without custom header config.

Headers apply to all three agent endpoints: /v1/chat/completions, /v1/responses, /v1/runs. Headers are echoed back on the response just like X-Hermes-Session-Key already is. Capabilities are advertised at /v1/capabilities so clients can feature-detect.

100% backward-compatible: requests without the new headers behave identically (same fallthrough to single-peer-per-session that native adapters bypass via the same kwargs).

Why

AIAgent already accepts user_id / user_name / chat_id / chat_name / chat_type / thread_id (run_agent.py:1097-1102). All native gateway adapters (Telegram, Discord, Slack, Matrix, …) thread these via GatewayRunner._run_agent_task (gateway/run.py:14881-14888). They drive:

api_server is the only adapter not threading these. After #20199 added X-Hermes-Session-Key, this is the next natural addition.

The impact today

The current api_server docs (website/docs/user-guide/features/api-server.md § Multi-User Setup with Profiles) recommend one Hermes process per user for multi-user deployments. That works but doesn't scale and blocks shared-session-multi-peer scenarios. With these headers, a single process can serve Open WebUI multi-user, LibreChat multi-tenant, chat-bot bridges (nonebot, custom Discord/Slack bridges via api_server), proxy-mode delegations, etc.

Hermes-internal beneficiary: proxy mode

gateway/run.py:_run_agent_via_proxy currently forwards only X-Hermes-Session-Id (lines 13899-13903) — user identity is dropped at the proxy boundary. This PR also patches proxy mode to forward the 6 new headers when present in the source, fixing a latent gap in Hermes's own dogfood.

Scope

Change LOC
gateway/platforms/api_server.py_parse_user_headers() + thread kwargs through _create_agent/_run_agent for chat_completions/responses/runs + capabilities advertise + echo on response ~80
gateway/run.py — proxy mode header forwarding ~25
tests/gateway/test_api_server.py — mirror existing test_session_key_* template ~120
website/docs/user-guide/features/api-server.md — document new headers + update Multi-User Setup section ~30

Total ~255 LOC, no new dependencies, no breaking changes, no migrations.

Prior art / related work

Field-naming alignment

chat_id (not room_id) matches Hermes's SessionSource.chat_id (gateway/session.py:84) and build_session_key output (agent:main:{platform}:group:{chat_id}:...). #22714 also moves to this naming for consistency.

What this PR does NOT include

Test plan

  • Unit: 6 new headers parsed, validated (control chars rejected, length capped at 256, auth required), threaded to AIAgent kwargs
  • Unit: OpenAI user body field used as X-Hermes-User-Id fallback; header wins when both present
  • Unit: absent headers → kwargs unset → existing single-peer fallthrough preserved
  • Unit: echo-back in response headers
  • Unit: /v1/capabilities advertises all 6 new *_header capabilities
  • Unit: proxy mode forwards headers when source has them
  • Integration: existing Honcho integration test confirmed per-user peer card formation (manual, needs Honcho instance)

Compatibility

  • All existing tests pass
  • No API surface changes for unmodified callers
  • New headers are opt-in
  • Unknown header values are silently ignored (validation only rejects control-char injection and oversized values)
  • Old api_server clients sending no new headers → unchanged behavior
  • New clients targeting old api_server → headers ignored gracefully

Use cases this unlocks

  1. Open WebUI / LobeChat / LibreChat multi-user: one Hermes serves many web-UI users with per-user Honcho peer cards (currently requires one Hermes process per user)
  2. Chat-bot bridges via api_server (nonebot, custom Discord/Slack/Telegram non-native bridges, WhatsApp via Twilio): single Hermes serves many users in many groups with proper memory scoping
  3. Proxy mode (#90c98345c): forwarded calls no longer lose user identity at the api_server boundary
  4. Multi-tenant SaaS Hermes deployments: one process scales to many end-users, identity-aware per request

Refs: #20199 (direct precedent), #22714 (outgoing-side counterpart), #12054, #17989, #15598, #14984

@gsskk

gsskk commented Jun 4, 2026

Copy link
Copy Markdown
Author

Rebased onto current main (16613d1), mergeable again, gateway tests pass locally.

Context: this is the inbound half — outbound is #27525, runtime peer mapping is #30077,
all sharing the chat_id/user_id/chat_type vocabulary. See #32863 for the cleanup map.

CI still gated on external-contributor workflow approval.

gsskk added 3 commits June 13, 2026 23:33
…ntity

Add six optional headers on /v1/chat/completions, /v1/responses, /v1/runs
mirroring AIAgent.__init__'s existing identity kwargs:

  X-Hermes-User-Id    → user_id
  X-Hermes-User-Name  → user_name
  X-Hermes-Chat-Id    → chat_id
  X-Hermes-Chat-Name  → chat_name
  X-Hermes-Chat-Type  → chat_type
  X-Hermes-Thread-Id  → thread_id

These kwargs already flow through AIAgent (run_agent.py:1098-1103) and
are threaded by native adapters via GatewayRunner._run_agent_task
(gateway/run.py:14881-14888); they drive Honcho's runtime_user_peer_name
resolution, per-user memory directories, session DB attribution, etc.
api_server was the odd adapter out — this PR makes it symmetrical so a
single Hermes process can serve multiple identified end-users (Open
WebUI / LobeChat / LibreChat multi-user, chat-bot bridges via the
OpenAI protocol, multi-tenant SaaS deployments).

Also accepts OpenAI's standard top-level `user` body field as a fallback
for X-Hermes-User-Id (header takes precedence) so vanilla OpenAI SDK
clients work without custom-header config.

Mirrors fe8560f (NousResearch#20199, X-Hermes-Session-Key) — additive, opt-in,
no breaking changes; unpatched clients see no difference. Same auth gate
as the session-key header (silently ignored without API_SERVER_KEY,
matching X-Hermes-Session-Key posture).

All 6 headers + the body-fallback flag are advertised at
/v1/capabilities.features so clients can feature-detect, and echoed in
the response so clients can confirm what the server saw.

Refs: NousResearch#22714 (the outgoing-side counterpart, in active discussion),
gateway/run.py:_run_agent_via_proxy previously forwarded only
X-Hermes-Session-Id when a native adapter delegates to remote api_server
(introduced in 90c9834).  User identity from SessionSource (user_id,
user_name, chat_id, chat_name, chat_type, thread_id) was dropped at the
proxy boundary, even though the source has all of it.

After this commit, those fields are forwarded as the new X-Hermes-User-*
/ Chat-* / Thread-Id headers when set on the source, so the remote
api_server can reconstruct the SessionSource subset and pass it to its
own AIAgent.

Depends on the parser added in the previous commit.
…r setup

- features/api-server.md
  - Document the new X-Hermes-User-* / Chat-* / Thread-Id headers under
    POST /v1/chat/completions (mirrors the existing X-Hermes-Session-Key
    documentation style)
  - Update /v1/capabilities example to show the new *_header advertise
    fields and the user_body_fallback flag
  - Restructure "Multi-User Setup" section into two clearly-labeled
    strategies: per-request identity headers (Strategy A — new) and
    profile-per-user (Strategy B — existing, formerly the only option)
  - Note that strategies compose for multi-tenant SaaS scenarios

- messaging/open-webui.md
  - Restructure "Multi-User Setup" to present both strategies; profile
    isolation becomes Strategy B (kept verbatim)
  - Add Strategy A for shared-instance multi-user via reverse proxy /
    Open WebUI Pipeline / the OpenAI `user` body fallback
@gsskk gsskk force-pushed the feat/api-server-user-identity-headers branch from 16613d1 to d4215f7 Compare June 13, 2026 13:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/gateway Gateway runner, session dispatch, delivery P3 Low — cosmetic, nice to have type/feature New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants