feat(api-server): X-Hermes-User-* / Chat-* headers for multi-user identity#24423
Open
gsskk wants to merge 3 commits into
ce31ed7 to 9898ffd
This was referenced May 27, 2026
9898ffd to 16613d1
Author
Rebased onto current main (16613d1), mergeable again, gateway tests pass locally. Context: this is the inbound half; the outbound half is #27525 and runtime peer mapping is #30077. CI is still gated on external-contributor workflow approval.
…ntity

Add six optional headers on /v1/chat/completions, /v1/responses, /v1/runs mirroring AIAgent.__init__'s existing identity kwargs:

  X-Hermes-User-Id   → user_id
  X-Hermes-User-Name → user_name
  X-Hermes-Chat-Id   → chat_id
  X-Hermes-Chat-Name → chat_name
  X-Hermes-Chat-Type → chat_type
  X-Hermes-Thread-Id → thread_id

These kwargs already flow through AIAgent (run_agent.py:1098-1103) and are threaded by native adapters via GatewayRunner._run_agent_task (gateway/run.py:14881-14888); they drive Honcho's runtime_user_peer_name resolution, per-user memory directories, session DB attribution, etc. api_server was the odd adapter out; this PR makes it symmetrical so a single Hermes process can serve multiple identified end-users (Open WebUI / LobeChat / LibreChat multi-user, chat-bot bridges via the OpenAI protocol, multi-tenant SaaS deployments).

Also accepts OpenAI's standard top-level `user` body field as a fallback for X-Hermes-User-Id (header takes precedence) so vanilla OpenAI SDK clients work without custom-header config.

Mirrors fe8560f (NousResearch#20199, X-Hermes-Session-Key): additive, opt-in, no breaking changes; unpatched clients see no difference. Same auth gate as the session-key header (silently ignored without API_SERVER_KEY, matching X-Hermes-Session-Key posture).

All 6 headers + the body-fallback flag are advertised at /v1/capabilities.features so clients can feature-detect, and echoed in the response so clients can confirm what the server saw.

Refs: NousResearch#22714 (the outgoing-side counterpart, in active discussion),
gateway/run.py:_run_agent_via_proxy previously forwarded only X-Hermes-Session-Id when a native adapter delegates to remote api_server (introduced in 90c9834). User identity from SessionSource (user_id, user_name, chat_id, chat_name, chat_type, thread_id) was dropped at the proxy boundary, even though the source has all of it. After this commit, those fields are forwarded as the new X-Hermes-User-* / Chat-* / Thread-Id headers when set on the source, so the remote api_server can reconstruct the SessionSource subset and pass it to its own AIAgent. Depends on the parser added in the previous commit.
…r setup
- features/api-server.md
- Document the new X-Hermes-User-* / Chat-* / Thread-Id headers under
POST /v1/chat/completions (mirrors the existing X-Hermes-Session-Key
documentation style)
- Update /v1/capabilities example to show the new *_header advertise
fields and the user_body_fallback flag
- Restructure "Multi-User Setup" section into two clearly-labeled
strategies: per-request identity headers (Strategy A — new) and
profile-per-user (Strategy B — existing, formerly the only option)
- Note that strategies compose for multi-tenant SaaS scenarios
- messaging/open-webui.md
- Restructure "Multi-User Setup" to present both strategies; profile
isolation becomes Strategy B (kept verbatim)
- Add Strategy A for shared-instance multi-user via reverse proxy /
Open WebUI Pipeline / the OpenAI `user` body fallback
16613d1 to d4215f7
Summary
Adds 6 optional headers to `api_server` mirroring `AIAgent.__init__`'s existing identity kwargs, so a single Hermes process can serve multiple end-users distinguished per-request:

| Header | kwarg |
|---|---|
| `X-Hermes-User-Id` | `user_id` |
| `X-Hermes-User-Name` | `user_name` |
| `X-Hermes-Chat-Id` | `chat_id` |
| `X-Hermes-Chat-Name` | `chat_name` |
| `X-Hermes-Chat-Type` | `chat_type` |
| `X-Hermes-Thread-Id` | `thread_id` |

Also supports OpenAI's standard top-level `user` body field as a fallback for `X-Hermes-User-Id` (header wins if both present), so vanilla OpenAI SDK clients work without custom header config.

Headers apply to all three agent endpoints: `/v1/chat/completions`, `/v1/responses`, `/v1/runs`. Headers are echoed back on the response just like `X-Hermes-Session-Key` already is. Capabilities are advertised at `/v1/capabilities` so clients can feature-detect.

100% backward-compatible: requests without the new headers behave identically (same fallthrough to single-peer-per-session that native adapters bypass via the same kwargs).
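The header-to-kwarg mapping and the `user` body fallback described above can be sketched roughly as follows. This is an illustration only, not the PR's `_parse_user_headers()`: the function name `parse_identity`, its signature, and the `authenticated` flag (standing in for the `API_SERVER_KEY` gate) are assumptions made for the example.

```python
from typing import Any, Dict, Mapping

# Header -> AIAgent kwarg, as listed in the PR summary.
HEADER_TO_KWARG = {
    "X-Hermes-User-Id": "user_id",
    "X-Hermes-User-Name": "user_name",
    "X-Hermes-Chat-Id": "chat_id",
    "X-Hermes-Chat-Name": "chat_name",
    "X-Hermes-Chat-Type": "chat_type",
    "X-Hermes-Thread-Id": "thread_id",
}


def parse_identity(
    headers: Mapping[str, str],
    body: Mapping[str, Any],
    authenticated: bool,
) -> Dict[str, str]:
    """Hypothetical helper: collect identity kwargs from one request.

    `authenticated` is an assumed stand-in for "the request passed the
    API_SERVER_KEY gate"; without it the headers are silently ignored.
    """
    if not authenticated:
        return {}

    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    identity: Dict[str, str] = {}
    for header, kwarg in HEADER_TO_KWARG.items():
        value = (lowered.get(header.lower()) or "").strip()
        if value:
            identity[kwarg] = value

    # OpenAI's top-level `user` field is only a fallback: the header wins.
    if "user_id" not in identity:
        fallback = body.get("user")
        if isinstance(fallback, str) and fallback.strip():
            identity["user_id"] = fallback.strip()

    return identity
```

A request that sends both `X-Hermes-User-Id: alice` and `"user": "bob"` resolves to `user_id="alice"` in this sketch; a request with neither yields an empty dict, which corresponds to the unchanged behaviour for unpatched clients.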
Why
`AIAgent` already accepts `user_id` / `user_name` / `chat_id` / `chat_name` / `chat_type` / `thread_id` (`run_agent.py:1097-1102`). All native gateway adapters (Telegram, Discord, Slack, Matrix, …) thread these via `GatewayRunner._run_agent_task` (`gateway/run.py:14881-14888`). They drive:

- Honcho's `runtime_user_peer_name` resolution (`run_agent.py:1967-1978` → `plugins/memory/honcho/session.py:304-305`), enabling per-user peer cards in multi-peer-shared-session deployments
- the session DB `user_id` filter (fix: enforce per-user memory/session isolation for multi-user gateway #17989, in flight)
- the per-user memory `~/.hermes/memories/{chat_id}/` directory layout (also fix: enforce per-user memory/session isolation for multi-user gateway #17989)

`api_server` is the only adapter not threading these. After #20199 added `X-Hermes-Session-Key`, this is the next natural addition.

The impact today
The current api_server docs (`website/docs/user-guide/features/api-server.md` § Multi-User Setup with Profiles) recommend one Hermes process per user for multi-user deployments. That works but doesn't scale and blocks shared-session-multi-peer scenarios. With these headers, a single process can serve Open WebUI multi-user, LibreChat multi-tenant, chat-bot bridges (nonebot, custom Discord/Slack bridges via api_server), proxy-mode delegations, etc.

Hermes-internal beneficiary: proxy mode
`gateway/run.py:_run_agent_via_proxy` currently forwards only `X-Hermes-Session-Id` (lines 13899-13903), so user identity is dropped at the proxy boundary. This PR also patches proxy mode to forward the 6 new headers when present in the source, fixing a latent gap in Hermes's own dogfood.

Scope
- `gateway/platforms/api_server.py`: `_parse_user_headers()` + thread kwargs through `_create_agent` / `_run_agent` for chat_completions/responses/runs + capabilities advertise + echo on response
- `gateway/run.py`: proxy mode header forwarding
- `tests/gateway/test_api_server.py`: mirror existing `test_session_key_*` template
- `website/docs/user-guide/features/api-server.md`: document new headers + update Multi-User Setup section

Total ~255 LOC, no new dependencies, no breaking changes, no migrations.
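The proxy-mode forwarding listed in the scope can be pictured along these lines. This is a minimal sketch, not the actual patch to `_run_agent_via_proxy`: the `SourceIdentity` dataclass is a hypothetical stand-in for the identity subset of `SessionSource`, and `build_proxy_headers` is an invented name for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SourceIdentity:
    """Hypothetical stand-in for the identity fields of SessionSource."""

    user_id: Optional[str] = None
    user_name: Optional[str] = None
    chat_id: Optional[str] = None
    chat_name: Optional[str] = None
    chat_type: Optional[str] = None
    thread_id: Optional[str] = None


# Source field -> outgoing header (the inverse of the inbound mapping).
FIELD_TO_HEADER = {
    "user_id": "X-Hermes-User-Id",
    "user_name": "X-Hermes-User-Name",
    "chat_id": "X-Hermes-Chat-Id",
    "chat_name": "X-Hermes-Chat-Name",
    "chat_type": "X-Hermes-Chat-Type",
    "thread_id": "X-Hermes-Thread-Id",
}


def build_proxy_headers(session_id: str, source: SourceIdentity) -> Dict[str, str]:
    """Build headers for a proxied call, forwarding only the fields that are set."""
    # The session id was already forwarded before this PR.
    headers = {"X-Hermes-Session-Id": session_id}
    for field_name, header in FIELD_TO_HEADER.items():
        value = getattr(source, field_name)
        if value:  # skip None and empty strings so unset fields send nothing
            headers[header] = str(value)
    return headers
```

Omitting unset fields, rather than sending empty header values, keeps a proxied request from a source with no identity information identical to what was sent before, which is in line with the additive, opt-in posture stated in the PR.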
Prior art / related work
- `X-Hermes-Session-Key` header for memory scoping: direct precedent, same shape, same `_parse_*_header` style
- `session_id` plumbed through `build_api_kwargs_extras` for Grok cache affinity: established the "identity flows through agent transport" pattern on the outgoing side
- `extra_body.hermes` for downstream dispatcher orchestration: this PR is the incoming complement to that
- `X-Hermes-Merchant-Id` for tenant isolation: adjacent and compatible (tenant boundary + per-user identity within tenant)

Field-naming alignment
`chat_id` (not `room_id`) matches Hermes's `SessionSource.chat_id` (`gateway/session.py:84`) and `build_session_key` output (`agent:main:{platform}:group:{chat_id}:...`). #22714 also moves to this naming for consistency.

What this PR does NOT include
- `command_origin` field from the proposal in #22714 (Matrix gateway: no in-band channel to drive per-message LLM orchestration in a downstream dispatcher): that is outgoing-side and doesn't belong here
- `X-Hermes-Merchant-Id` from #12054 (feat(api_server): per-merchant identity headers for multi-tenant deployments): different concern (tenant), separate PR
- `API_SERVER_KEY` gate, same as `X-Hermes-Session-Key`

Test plan
- `user` body field used as `X-Hermes-User-Id` fallback; header wins when both present
- `/v1/capabilities` advertises all 6 new `*_header` capabilities

Compatibility
Use cases this unlocks
#90c98345c): forwarded calls no longer lose user identity at the api_server boundary

Refs: #20199 (direct precedent), #22714 (outgoing-side counterpart), #12054, #17989, #15598, #14984