Skip to content

fix(desktop): scope in-session /model switch per-session, stop process-env leak#41120

Merged
teknium1 merged 2 commits into
mainfrom
hermes/hermes-f04ad83a
Jun 7, 2026
Merged

fix(desktop): scope in-session /model switch per-session, stop process-env leak#41120
teknium1 merged 2 commits into
mainfrom
hermes/hermes-f04ad83a

Conversation

@teknium1

@teknium1 teknium1 commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

Summary

Switching the model in one desktop/dashboard session no longer changes the model in other live sessions. Two same-profile sessions can now run different models concurrently (e.g. MiniMax M3 in one, GLM 5.1 in another) without interfering.

Root cause: the tui_gateway backend hosts every same-profile session in one process. An in-session /model switch wrote process-global env vars (HERMES_MODEL / HERMES_INFERENCE_MODEL / HERMES_TUI_PROVIDER / HERMES_INFERENCE_PROVIDER). _resolve_startup_runtime() reads those when building a fresh agent, so any sibling session's next rebuild (/new, resume) silently inherited the switched model/provider. The CLI never hit this because CLI = one session per process; the desktop broke that assumption by multiplexing sessions.

Changes

  • tui_gateway/server.py _apply_model_switch: record the switch as a per-session model_override on the session dict instead of mutating os.environ. agent.switch_model() already mutates the correct session's agent in place. --global persistence (_persist_model_switch) is unchanged.
  • tui_gateway/server.py _make_agent: accept model_override and prefer it over global/env resolution on rebuild — carrying the concrete base_url/api_key/api_mode the switch resolved. Falls back to _resolve_startup_runtime() (global config) when absent.
  • _reset_session_agent (/new) threads the session's model_override into the rebuild; _init_session seeds the key.
  • Cleaner fix for TUI: /new after /model switch to custom provider model fails with 'no API key' when model name exists in native provider catalog #16857 (/new after switching to a custom-provider model): the override carries resolved credentials, so the rebuild keeps the right endpoint without the leaky env vars.

Validation

Before After
/model switch in session B touches process env yes (4 vars) no
Sibling session A model after B switch changed to B's model unchanged
/new in a session reverts to global / sibling's leaked model keeps that session's chosen model
#16857 (/new after custom-provider switch) fixed via leaky env fixed via per-session override

E2E verified two in-process sessions: switching B leaves the process env and session A fully untouched; _make_agent honors the per-session override on rebuild and falls back to global config without one. tests/tui_gateway/ 105 passed (incl. 2 new regression tests).

Known limitation (out of scope): resuming a session in a fresh process still resolves from global config — per-session model is not yet persisted to state.db. This PR fixes the concurrent-live-session contamination only.

Reported via Twitter (@Da7_Tech).

Infographic

per-session-model-isolation

…s-env leak

The desktop/dashboard tui_gateway backend hosts every same-profile session
in ONE process. An in-session /model switch wrote process-global env vars
(HERMES_MODEL / HERMES_INFERENCE_MODEL / HERMES_TUI_PROVIDER /
HERMES_INFERENCE_PROVIDER), which _resolve_startup_runtime() reads when
building a fresh agent. So switching the model in one session leaked into
every other live session's next agent rebuild (/new, resume) — changing the
model in session B silently changed it in session A.

Fix: record the switch as a per-session model_override on the session dict
instead of mutating os.environ. _make_agent honors that override on rebuild
(carrying the concrete base_url/api_key/api_mode the switch resolved), and
falls back to global config when absent. Global persistence on the --global
flag is unchanged.

Also a cleaner fix for #16857 (/new after switching to a custom-provider
model): the override carries the resolved credentials, so the rebuild keeps
the right endpoint without relying on the leaky env vars.

Reported via Twitter (@Da7_Tech): MiniMax M3 in one session + GLM 5.1 in
another interfere when switching between them.
@github-actions

github-actions Bot commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

🔎 Lint report: hermes/hermes-f04ad83a vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 10001 on HEAD, 9999 on base (🆕 +2)

🆕 New issues (1):

Rule Count
not-subscriptable 1
First entries
tests/tui_gateway/test_make_agent_provider.py:367: [not-subscriptable] not-subscriptable: Cannot subscript object of type `None` with no `__getitem__` method

✅ Fixed issues: none

Unchanged: 5186 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

…e contract

The three test_config_set_model_syncs_* tests asserted the old leaky contract
(switch writes HERMES_MODEL / HERMES_TUI_PROVIDER / HERMES_INFERENCE_PROVIDER to
process env). That env-sync IS the cross-session contamination bug this PR
removes. Updated to assert the new contract: shared process env untouched, the
switch recorded as a per-session model_override carrying provider/model/base_url/
api_key/api_mode. #16857's intent (a custom-provider switch survives /new) is
still covered — now via the override _make_agent honors on rebuild.

@tonydwb tonydwb left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Summary

Verdict: Approved

Clean cross-session model isolation fix. The per-session override dict replaces process-global env mutation, preventing sibling sessions from being contaminated by an in-session /model switch.

What this review covers

  • Removed os.environ writes in _apply_model_switch
  • Persisted model_override on session dict
  • Propagated override through _reset_session_agent / _make_agent
  • Regression tests verify no env leak and override survives rebuild

Looks Good

  • Surgical code change with clear rationale (cross-session contamination)
  • Tests cover both halves of the bug: no env mutation AND override survives /new
  • _reset_session_agent already had the right hook; this just threads the data through
  • No secrets or debug artifacts
  • No blocking concerns

Reviewed by Hermes Agent

@alt-glitch alt-glitch added type/bug Something isn't working comp/tui Terminal UI (ui-tui/ + tui_gateway/) P3 Low — cosmetic, nice to have labels Jun 7, 2026
@teknium1 teknium1 merged commit 5a3092b into main Jun 7, 2026
23 checks passed
@teknium1 teknium1 deleted the hermes/hermes-f04ad83a branch June 7, 2026 09:33
agogo233 added a commit to agogo233/hermes-agent that referenced this pull request Jun 8, 2026
* upstream/main: (430 commits)
  fix(yuanbao): bound ws.close() so an idle server can't stall shutdown ~5s (NousResearch#40607)
  docs: add Urdu translation of README (NousResearch#40578)
  fix(hindsight): send only new-turn delta on append retains instead of whole session (NousResearch#40605)
  feat(gateway): render terminal tool calls as native bash code blocks on markdown platforms (NousResearch#41215)
  feat(desktop): stop the chat viewport from following streaming output (NousResearch#41414)
  chore(release): map AlchemistChaos co-author email for NousResearch#40135 salvage
  fix(desktop): recover chat after sleep/wake by revalidating a stale remote backend
  fix(web): make _has_env config-aware so SEARXNG_URL auto-detect honors Hermes config
  fix(web): honor Hermes config-aware SEARXNG_URL lookup
  install.sh: hint at root-owned npm cache when desktop npm install fails (NousResearch#39688)
  fix(tools): percent-encode non-ascii URL components
  fix(skills): browse shows full catalog, not first 5000 (NousResearch#41413)
  feat(desktop+gateway): remote media relay — attach images/PDFs and display gateway images over the network
  feat(desktop): full tool-backend config (pickers + per-backend settings) in Settings (NousResearch#41232)
  hardening(api-server): scan cron prompts on REST create/update for parity with the agent tool
  fix: skip MCP preflight content-type probe on reconnect when already ready (NousResearch#40604)
  fix(kanban): sweep deferred scratch parent on non-scratch child completion + tests
  fix: defer scratch workspace cleanup when task has active children (NousResearch#33774)
  feat(onboarding): opt-in structured profile-build path on first contact (NousResearch#41114)
  feat(compression): temporal anchoring in compaction summaries (NousResearch#41102)
  test(discord): align clarify/model-picker tests with fail-closed component auth (NousResearch#41338)
  chore(release): map Dusk1e and LaPhilosophie for approval fail-closed salvage (NousResearch#33844, NousResearch#33866, NousResearch#30964)
  fix(discord): fail closed for component button auth when no allowlist set
  fix(feishu): fail closed for update prompt card actions
  fix(slack): re-check gateway auth on approval and slash-confirm buttons
  fix: guard int(os.getenv()) casts against malformed env vars (NousResearch#40598)
  fix: respect Honcho env var fallback in doctor and honcho status
  chore(release): add synapsesx to AUTHOR_MAP for NousResearch#40495 salvage
  fix(research): keep tool_call/tool_response pairs intact when compressing trajectories
  fix(simplex): accept display name in SIMPLEX_ALLOWED_USERS
  fix(desktop): make the running-turn timer per-session (NousResearch#41182)
  test(approval): regression for shell-escape denylist bypass (NousResearch#36846, NousResearch#36847)
  fix(security): strip shell escapes in denylist normalizer; fail-closed on missing approval module
  fix(stream+output-cap): guard empty streams and parse OpenRouter output-cap errors (NousResearch#40589)
  fix(desktop): bootstrap falls back to installed agent install.sh on GitHub 404
  feat(dashboard): change UI font from the theme picker, independent of theme (NousResearch#41145)
  fix(cli): return bool (not None) when a destructive-slash confirmation is cancelled (NousResearch#40583)
  fix(desktop): preserve configured base_url on same-provider model switch (NousResearch#41121)
  fix(desktop): stop bare-URL autolinker swallowing trailing emphasis asterisks (NousResearch#41093)
  fix(cron): bound the desktop run-history query to one job (NousResearch#41088)
  fix(desktop): scope in-session /model switch per-session, stop process-env leak (NousResearch#41120)
  chore: map bmoore210 author email for PR NousResearch#40550 salvage
  fix(desktop): scope session list to active profile + longer timeout
  fix: harden gateway startup and turn persistence
  fix(computer_use): honor custom vision routing
  fix(aux): honor model.default_headers on auxiliary client too (NousResearch#40033)
  fix(agent): honor model.default_headers for custom OpenAI-compatible providers (NousResearch#40033)
  docs(i18n): port deep-audit corrections to zh-Hans mirror (NousResearch#41104)
  fix(compression): don't overwrite the -1 post-compression sentinel in preflight seed (NousResearch#36718)
  chore(release): map singhsanidhya741@gmail.com to sanidhyasin (NousResearch#41094)
  ...
changman pushed a commit to changman/hermes-agent that referenced this pull request Jun 10, 2026
…s-env leak (NousResearch#41120)

* fix(desktop): scope in-session /model switch per-session, stop process-env leak

The desktop/dashboard tui_gateway backend hosts every same-profile session
in ONE process. An in-session /model switch wrote process-global env vars
(HERMES_MODEL / HERMES_INFERENCE_MODEL / HERMES_TUI_PROVIDER /
HERMES_INFERENCE_PROVIDER), which _resolve_startup_runtime() reads when
building a fresh agent. So switching the model in one session leaked into
every other live session's next agent rebuild (/new, resume) — changing the
model in session B silently changed it in session A.

Fix: record the switch as a per-session model_override on the session dict
instead of mutating os.environ. _make_agent honors that override on rebuild
(carrying the concrete base_url/api_key/api_mode the switch resolved), and
falls back to global config when absent. Global persistence on the --global
flag is unchanged.

Also a cleaner fix for NousResearch#16857 (/new after switching to a custom-provider
model): the override carries the resolved credentials, so the rebuild keeps
the right endpoint without relying on the leaky env vars.

Reported via Twitter (@Da7_Tech): MiniMax M3 in one session + GLM 5.1 in
another interfere when switching between them.

* test(tui_gateway): align /model switch tests with per-session override contract

The three test_config_set_model_syncs_* tests asserted the old leaky contract
(switch writes HERMES_MODEL / HERMES_TUI_PROVIDER / HERMES_INFERENCE_PROVIDER to
process env). That env-sync IS the cross-session contamination bug this PR
removes. Updated to assert the new contract: shared process env untouched, the
switch recorded as a per-session model_override carrying provider/model/base_url/
api_key/api_mode. NousResearch#16857's intent (a custom-provider switch survives /new) is
still covered — now via the override _make_agent honors on rebuild.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/tui Terminal UI (ui-tui/ + tui_gateway/) P3 Low — cosmetic, nice to have type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants