fix: preserve credential pool through /model session override #19064

Open
Cyrene963 wants to merge 2 commits into NousResearch:main from Cyrene963:fix/credential-pool-model-switch

Cyrene963 commented May 3, 2026

Summary

Fix credential pool not being passed through after a /model session override, causing key rotation to silently fail on 429/402 errors.

Root cause: ModelSwitchResult had no credential_pool field. When /model resolved a pool-backed provider (e.g. custom:mimo-sgp-friend) via resolve_runtime_provider(), the pool object was returned in the runtime dict but silently dropped when building the result. The gateway's session override and the CLI's in-place swap therefore both received a result without the pool. The agent was created with credential_pool=None, making _recover_with_credential_pool() a no-op: it retried the same exhausted key 3 times instead of rotating.

Impact: Any user with a custom:<name> credential pool who switches providers via /model (CLI or gateway) loses pool rotation until the next full restart.
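
To make the failure mode concrete, here is a minimal sketch of why a missing pool turns recovery into a no-op. The `CredentialPool` and `Agent` classes below are hypothetical stand-ins written for illustration, not the project's real classes; the retry count of 3 follows the description above, and the 429 is simulated by a set of exhausted keys.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class CredentialPool:
    """Hypothetical stand-in: an ordered list of API keys."""
    keys: List[str]
    index: int = 0

    def rotate(self) -> Optional[str]:
        """Advance to the next key; return None when no keys remain."""
        if self.index + 1 >= len(self.keys):
            return None
        self.index += 1
        return self.keys[self.index]


@dataclass
class Agent:
    """Hypothetical stand-in for the agent's key-recovery behaviour."""
    api_key: str
    credential_pool: Optional[CredentialPool] = None
    attempts: List[str] = field(default_factory=list)

    def recover_with_credential_pool(self) -> bool:
        # Without a pool there is nothing to rotate to, so this does nothing.
        if self.credential_pool is None:
            return False
        next_key = self.credential_pool.rotate()
        if next_key is None:
            return False
        self.api_key = next_key
        return True

    def call_with_retries(self, exhausted: Set[str], max_retries: int = 3) -> bool:
        """Simulate a request that fails with 429 for any key in `exhausted`."""
        for _ in range(max_retries):
            self.attempts.append(self.api_key)
            if self.api_key not in exhausted:
                return True
            self.recover_with_credential_pool()
        return False


pool = CredentialPool(keys=["key-a", "key-b"])
broken = Agent(api_key="key-a", credential_pool=None)  # pool was dropped
fixed = Agent(api_key="key-a", credential_pool=pool)   # pool was propagated

broken_ok = broken.call_with_retries(exhausted={"key-a"})
fixed_ok = fixed.call_with_retries(exhausted={"key-a"})
```

In this sketch `broken` tries `key-a` three times and gives up, while `fixed` moves on to `key-b` after the first failure.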

Changes

hermes_cli/model_switch.py (ModelSwitchResult + switch_model()):

  • Add credential_pool field to ModelSwitchResult
  • Capture pool from resolve_runtime_provider() on both explicit-provider and same-provider re-resolve paths
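
The shape of this change can be sketched roughly as follows. This is a simplified illustration, not the real module: the actual `ModelSwitchResult` presumably carries more fields, `resolve_runtime_provider()` is replaced by a hypothetical stub, and the `example-provider` name and the function signatures are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ModelSwitchResult:
    """Simplified: only the fields needed for this illustration."""
    model: str
    provider: str
    api_key: str
    credential_pool: Optional[Any] = None  # the field this PR adds


def resolve_runtime_provider(provider: str) -> Dict[str, Any]:
    """Hypothetical stub standing in for the real resolver."""
    pool = ["key-a", "key-b"] if provider.startswith("custom:") else None
    return {"provider": provider, "api_key": "key-a", "credential_pool": pool}


def switch_model(model: str, provider: str) -> ModelSwitchResult:
    runtime = resolve_runtime_provider(provider)
    return ModelSwitchResult(
        model=model,
        provider=runtime["provider"],
        api_key=runtime["api_key"],
        # Previously dropped at this point; .get() leaves non-pool
        # providers at None, so their behaviour does not change.
        credential_pool=runtime.get("credential_pool"),
    )


pooled = switch_model("some-model", "custom:mimo-sgp-friend")
plain = switch_model("some-model", "example-provider")
```

Defaulting the new field to `None` is what keeps existing callers that construct the result without a pool working unchanged.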

gateway/run.py (_handle_model_command()):

  • Propagate result.credential_pool into session overrides and in-place agent.switch_model()
  • Update _session_model_overrides type hint to Dict[str, Any] for the CredentialPool instance
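
A rough sketch of the gateway side, under assumptions: the override is modelled as a per-session dict, and `store_override()` and `agent_kwargs_for()` are hypothetical helpers invented for this example. The real layout of `_session_model_overrides` may differ; the point is only that its values can no longer be typed as plain strings once a pool object is stored alongside them.

```python
from typing import Any, Dict, Optional

# Assumed shape: session key -> override values. Any is needed because
# the pool is an object, not a string.
_session_model_overrides: Dict[str, Dict[str, Any]] = {}


def store_override(
    session_key: str,
    model: str,
    provider: str,
    credential_pool: Optional[Any] = None,
) -> Dict[str, Any]:
    """Record a /model switch for one session, keeping the pool if present."""
    override: Dict[str, Any] = {"model": model, "provider": provider}
    if credential_pool is not None:
        override["credential_pool"] = credential_pool
    _session_model_overrides[session_key] = override
    return override


def agent_kwargs_for(session_key: str) -> Dict[str, Any]:
    """Build constructor kwargs for the next agent created in this session."""
    override = _session_model_overrides.get(session_key, {})
    return {
        "model": override.get("model"),
        "provider": override.get("provider"),
        "credential_pool": override.get("credential_pool"),
    }


example_pool = object()  # placeholder for a real pool instance
store_override("session-1", "some-model", "custom:mimo-sgp-friend", example_pool)
kwargs = agent_kwargs_for("session-1")
```

Here `kwargs["credential_pool"]` is the same object that was stored, so an agent built from these kwargs after the switch still has a pool to rotate through.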

cli.py (_apply_model_switch_result() and _handle_model_switch()):

  • Pass result.credential_pool to agent.switch_model() and update self._credential_pool

run_agent.py (switch_model()):

  • Accept credential_pool=None parameter, update self._credential_pool when provided
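
As an illustration of the "update when provided" behaviour, the sketch below reduces the agent to three attributes. Apart from `credential_pool`, the parameter names and the constructor are assumptions; the real `switch_model()` signature is not shown in this PR description.

```python
from typing import Any, Optional


class AIAgent:
    """Hypothetical reduction of the agent to the attributes that matter here."""

    def __init__(self, model: str, provider: str,
                 credential_pool: Optional[Any] = None) -> None:
        self.model = model
        self.provider = provider
        self._credential_pool = credential_pool

    def switch_model(self, model: str, provider: str,
                     credential_pool: Optional[Any] = None) -> None:
        self.model = model
        self.provider = provider
        # Only overwrite when a pool is supplied, so callers that do not
        # pass one leave the existing pool untouched.
        if credential_pool is not None:
            self._credential_pool = credential_pool


agent = AIAgent("model-1", "provider-1")
agent.switch_model("model-2", "custom:mimo-sgp-friend", credential_pool=["key-a", "key-b"])
pool_after_first_switch = agent._credential_pool

agent.switch_model("model-3", "provider-3")  # no pool passed
pool_after_second_switch = agent._credential_pool
```

One consequence of this reading is that a later switch to a provider without a pool leaves the previous pool attached, as the second call shows. Whether the real code clears the pool in that case is not stated in the description.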

Relation to #16701

This PR adopts the approach from #16701 (briandevans) — adding credential_pool to ModelSwitchResult as the proper source-level fix — and extends it with CLI path coverage that #16701 missed:

  • cli.py:5476 (_apply_model_switch_result) — interactive picker path
  • cli.py:5700 (_handle_model_switch) — main /model command handler

Both call agent.switch_model() without passing the pool, so a CLI /model switch to a custom:<name> provider also loses rotation.

Test Plan

  • Verified load_pool('custom:mimo-sgp-friend') returns a valid pool
  • Verified resolve_runtime_provider() returns credential_pool in runtime dict
  • Gateway restart + Telegram /model switch now rotates on 429
  • All 4 modified files compile cleanly
  • CI passes (no behavioral change for non-pool providers)

The /model command (both CLI and gateway) resolves credential pools via
load_pool() but the result was silently dropped in the ModelSwitchResult →
session-override chain.  ModelSwitchResult has no credential_pool field,
so the pool object was lost.  After a /model switch to a pool-backed
provider (e.g. 'custom:mimo-sgp-friend'), the agent was created with
credential_pool=None, making _recover_with_credential_pool() a no-op.
On 429/402 errors the agent retried the same exhausted key 3 times
instead of rotating to the next available credential.

Fix:
- gateway/run.py: In the _resolve_model_and_runtime fast path, resolve
  the pool from the provider name when missing.  In _handle_model_command,
  store the pool in the session override and pass it to in-place
  agent.switch_model().
- cli.py: In both _apply_model_switch_result and _handle_model_switch,
  resolve the pool and pass it to agent.switch_model().
- run_agent.py: Add credential_pool parameter to switch_model() and
  update self._credential_pool when provided.

alt-glitch added labels May 3, 2026: type/bug (Something isn't working); comp/agent (Core agent loop, run_agent.py, prompt builder); comp/gateway (Gateway runner, session dispatch, delivery); comp/cli (CLI entry point, hermes_cli/, setup wizard); area/auth (Authentication, OAuth, credential pools); P2 (Medium: degraded but workaround exists)

@alt-glitch
Collaborator

Likely duplicate of #16701 — same root cause: ModelSwitchResult missing credential_pool field, gateway/CLI do not propagate pool through /model switch.

When a gateway or CLI session uses /model to switch providers, the new
provider's credential_pool was silently dropped — ModelSwitchResult had
no such field, so switch_model() discarded what resolve_runtime_provider
returned.  The agent was created with credential_pool=None, making
_recover_with_credential_pool() a no-op on 429/402 errors.

Fix (combining the approach from NousResearch#16701 with CLI coverage):

- hermes_cli/model_switch.py: Add credential_pool field to
  ModelSwitchResult, capture it from resolve_runtime_provider() in
  both the explicit-provider and same-provider re-resolve paths.

- gateway/run.py: Propagate credential_pool from the result into
  session overrides and in-place agent.switch_model().  Update
  _session_model_overrides type hint from Dict[str, str] to
  Dict[str, Any] to accommodate the CredentialPool instance.

- cli.py: Both _apply_model_switch_result and _handle_model_switch
  pass result.credential_pool to agent.switch_model() and update
  self._credential_pool.

- run_agent.py: Accept credential_pool parameter in switch_model()
  and update self._credential_pool when provided.

Based on the analysis in NousResearch#16701 (briandevans).  Closes NousResearch#16678.
@Cyrene963
Author

Thanks for flagging! Yes, same root cause as #16701 (briandevans).

I've coordinated with the #16701 author — we adopted their approach (adding credential_pool to ModelSwitchResult at the source) and extended it with CLI coverage that #16701 missed.

Key differences from #16701:

  • CLI path: #16701 ("fix(gateway): propagate credential_pool through /model session overrides (#16678)") only fixed the gateway (gateway/run.py). The CLI's _apply_model_switch_result and _handle_model_switch in cli.py had the same bug: both call agent.switch_model() without passing the pool. This PR fixes both paths.
  • run_agent.py: Added credential_pool parameter to AIAgent.switch_model() so in-place swaps update the agent's _credential_pool.
  • Cleaner: 4 files, +23/-48 net lines. All pool resolution happens once in model_switch.py.

I commented on #16701 to offer collaboration — happy to rebase if #16701 merges first, or they can pick up the CLI changes from here. Either way the goal is one clean fix for #16678.

@Cyrene963
Author

@alt-glitch Updated the PR -- adopted #16701's approach (added credential_pool to ModelSwitchResult at the source) and extended it with CLI path coverage.

What #19064 adds beyond #16701:

  • cli.py: Both _apply_model_switch_result and _handle_model_switch now propagate the pool
  • run_agent.py: switch_model() accepts and applies credential_pool
  • 4 files, +23/-48 lines net

Commented on #16701 to coordinate with briandevans. Happy to rebase or close if #16701 merges first.

@Cyrene963
Author

Closing as duplicate of #16701 (briandevans). Their approach of adding credential_pool to ModelSwitchResult at the source is more comprehensive. We've adopted their approach locally with additional CLI path coverage.

The local patch remains active in Cyrene963/hermes-patches until #16701 is merged upstream.

Cyrene963 closed this May 4, 2026
Cyrene963 reopened this May 25, 2026

@Cyrene963
Author

Re-evaluating closure status for #19064

I closed this as a duplicate earlier, but I rechecked the referenced upstream PR(s) and none of them are merged yet:

Because the underlying fix does not appear to have landed upstream, closing this solely as a duplicate may have been premature. I am reopening this PR so it can remain trackable unless maintainers prefer a different canonical PR.
