fix: preserve credential pool through /model session override #19064
Cyrene963 wants to merge 2 commits into
The /model command (both CLI and gateway) resolves credential pools via load_pool(), but the result was silently dropped in the ModelSwitchResult → session-override chain. ModelSwitchResult has no credential_pool field, so the pool object was lost.

After a /model switch to a pool-backed provider (e.g. 'custom:mimo-sgp-friend'), the agent was created with credential_pool=None, making _recover_with_credential_pool() a no-op. On 429/402 errors the agent retried the same exhausted key 3 times instead of rotating to the next available credential.

Fix:
- gateway/run.py: In the _resolve_model_and_runtime fast path, resolve the pool from the provider name when missing. In _handle_model_command, store the pool in the session override and pass it to in-place agent.switch_model().
- cli.py: In both _apply_model_switch_result and _handle_model_switch, resolve the pool and pass it to agent.switch_model().
- run_agent.py: Add credential_pool parameter to switch_model() and update self._credential_pool when provided.
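The failure mode described in this commit message can be reproduced in miniature. The sketch below is not the Hermes code: CredentialPool, Agent and the call() helper are reduced, hypothetical stand-ins, and only the method name _recover_with_credential_pool and the 3-attempt retry count are taken from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class CredentialPool:
    """Hypothetical stand-in for the real pool: an ordered list of API keys."""
    keys: List[str]
    index: int = 0

    def current(self) -> str:
        return self.keys[self.index]

    def rotate(self) -> bool:
        """Advance to the next key; return False when none are left."""
        if self.index + 1 >= len(self.keys):
            return False
        self.index += 1
        return True


class Agent:
    """Toy agent; only the retry/rotation behaviour is modelled."""

    def __init__(self, api_key: str, credential_pool: Optional[CredentialPool] = None):
        self.api_key = api_key
        self._credential_pool = credential_pool

    def _recover_with_credential_pool(self) -> bool:
        # With no pool attached there is nothing to rotate to: a no-op.
        if self._credential_pool is None:
            return False
        if not self._credential_pool.rotate():
            return False
        self.api_key = self._credential_pool.current()
        return True

    def call(self, exhausted: Set[str], max_attempts: int = 3) -> List[str]:
        """Return the keys tried, in order, stopping at the first usable key."""
        tried = []
        for _ in range(max_attempts):
            tried.append(self.api_key)
            if self.api_key not in exhausted:
                break  # request succeeded
            self._recover_with_credential_pool()  # simulated 429/402
        return tried


pool = CredentialPool(keys=["key-a", "key-b"])
without_pool = Agent("key-a", credential_pool=None).call({"key-a"})
with_pool = Agent("key-a", credential_pool=pool).call({"key-a"})
print(without_pool)  # ['key-a', 'key-a', 'key-a']
print(with_pool)     # ['key-a', 'key-b']
```

With credential_pool=None the exhausted key is retried three times; with the pool attached the second attempt already uses the next key.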
|
Likely duplicate of #16701 — same root cause: ModelSwitchResult missing credential_pool field, gateway/CLI do not propagate pool through /model switch. |
adcfcf1 to 64d80fd
When a gateway or CLI session uses /model to switch providers, the new provider's credential_pool was silently dropped: ModelSwitchResult had no such field, so switch_model() discarded what resolve_runtime_provider returned. The agent was created with credential_pool=None, making _recover_with_credential_pool() a no-op on 429/402 errors.

Fix (combining the approach from NousResearch#16701 with CLI coverage):
- hermes_cli/model_switch.py: Add credential_pool field to ModelSwitchResult, capture it from resolve_runtime_provider() in both the explicit-provider and same-provider re-resolve paths.
- gateway/run.py: Propagate credential_pool from the result into session overrides and in-place agent.switch_model(). Update _session_model_overrides type hint from Dict[str, str] to Dict[str, Any] to accommodate the CredentialPool instance.
- cli.py: Both _apply_model_switch_result and _handle_model_switch pass result.credential_pool to agent.switch_model() and update self._credential_pool.
- run_agent.py: Accept credential_pool parameter in switch_model() and update self._credential_pool when provided.

Based on the analysis in NousResearch#16701 (briandevans). Closes NousResearch#16678.
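For readers skimming the thread, here is a minimal sketch of the propagation chain this commit message describes. It is an illustration under assumptions, not the actual diff: the field list of ModelSwitchResult, the body of resolve_runtime_provider(), the switch_model() signatures and the "some-model" name are all hypothetical; only the identifier names and the credential_pool field come from the message above.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ModelSwitchResult:
    """Simplified stand-in; the real class has more fields."""
    provider: str
    model: str
    api_key: str
    credential_pool: Optional[Any] = None  # the field the fix adds


def resolve_runtime_provider(provider: str) -> Dict[str, Any]:
    """Hypothetical resolver returning a runtime dict that includes the pool."""
    pool = ["key-a", "key-b"] if provider.startswith("custom:") else None
    return {"provider": provider, "api_key": "key-a", "credential_pool": pool}


def switch_model(provider: str, model: str) -> ModelSwitchResult:
    runtime = resolve_runtime_provider(provider)
    return ModelSwitchResult(
        provider=runtime["provider"],
        model=model,
        api_key=runtime["api_key"],
        # Before the fix this value had nowhere to go and was dropped here.
        credential_pool=runtime.get("credential_pool"),
    )


class Agent:
    def __init__(self) -> None:
        self.model: Optional[str] = None
        self._credential_pool: Optional[Any] = None

    def switch_model(self, model: str, credential_pool: Optional[Any] = None) -> None:
        self.model = model
        # Only overwrite when a pool is provided, as the commit message describes.
        if credential_pool is not None:
            self._credential_pool = credential_pool


# Gateway-style session override: values are no longer only strings,
# hence Dict[str, Any] rather than Dict[str, str].
session_override: Dict[str, Any] = {}
result = switch_model("custom:mimo-sgp-friend", "some-model")
session_override.update(model=result.model, credential_pool=result.credential_pool)

agent = Agent()
agent.switch_model(result.model, credential_pool=result.credential_pool)
```

Carrying the pool on the result object means every consumer (session override, in-place swap, CLI handlers) reads it from one place instead of re-resolving it from the provider name.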
|
Thanks for flagging! Yes, same root cause as #16701 (briandevans). I've coordinated with the #16701 author — we adopted their approach (adding Key differences from #16701:
I commented on #16701 to offer collaboration — happy to rebase if #16701 merges first, or they can pick up the CLI changes from here. Either way the goal is one clean fix for #16678. |
|
@alt-glitch Updated the PR -- adopted #16701's approach (added credential_pool to ModelSwitchResult at the source) and extended it with CLI path coverage. What #19064 adds beyond #16701:
Commented on #16701 to coordinate with briandevans. Happy to rebase or close if #16701 merges first. |
|
Closing as duplicate of #16701 (briandevans). Their approach of adding The local patch remains active in Cyrene963/hermes-patches until #16701 is merged upstream. |
|
Re-evaluating closure status for #19064
I closed this as a duplicate earlier, but I rechecked the referenced upstream PR(s) and none of them are merged yet:
Because the underlying fix does not appear to have landed upstream, closing this solely as a duplicate may have been premature. I am reopening this PR so it can remain trackable unless maintainers prefer a different canonical PR. |
Summary
Fix credential pool not being passed through after a /model session override, causing key rotation to silently fail on 429/402 errors.

Root cause: ModelSwitchResult had no credential_pool field. When /model resolved a pool-backed provider (e.g. custom:mimo-sgp-friend) via resolve_runtime_provider(), the pool object was returned in the runtime dict but silently dropped when building the result. The gateway's session override and CLI's in-place swap both received the incomplete set. The agent was created with credential_pool=None, making _recover_with_credential_pool() a no-op: it retried the same exhausted key 3 times instead of rotating.

Impact: Any user with a custom:<name> credential pool who switches providers via /model (CLI or gateway) loses pool rotation until the next full restart.

Changes
- hermes_cli/model_switch.py (ModelSwitchResult + switch_model()):
  - Add credential_pool field to ModelSwitchResult
  - Capture it from resolve_runtime_provider() on both explicit-provider and same-provider re-resolve paths
- gateway/run.py (_handle_model_command()):
  - Propagate result.credential_pool into session overrides and in-place agent.switch_model()
  - Update _session_model_overrides type hint to Dict[str, Any] for the CredentialPool instance
- cli.py (_apply_model_switch_result() and _handle_model_switch()):
  - Pass result.credential_pool to agent.switch_model() and update self._credential_pool
- run_agent.py (switch_model()):
  - Accept credential_pool=None parameter, update self._credential_pool when provided

Relation to #16701
This PR adopts the approach from #16701 (briandevans), adding credential_pool to ModelSwitchResult as the proper source-level fix, and extends it with CLI path coverage that #16701 missed:
- cli.py:5476 (_apply_model_switch_result): interactive picker path
- cli.py:5700 (_handle_model_switch): main /model command handler

Both call agent.switch_model() without passing the pool, so a CLI /model switch to a custom:<name> provider also loses rotation.

Test Plan
- load_pool('custom:mimo-sgp-friend') returns a valid pool
- resolve_runtime_provider() returns credential_pool in runtime dict
- /model switch now rotates on 429
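One way the CLI side of the test plan could be automated is a mock-based check that the handler forwards the pool. This is only a sketch: apply_model_switch_result() below is a hypothetical, reduced handler (the real _apply_model_switch_result is a method in cli.py with a different signature), and "some-model" is a placeholder.

```python
from types import SimpleNamespace
from unittest.mock import MagicMock


def apply_model_switch_result(agent, result):
    """Hypothetical, reduced handler: forward the pool along with the model."""
    agent.switch_model(result.model, credential_pool=result.credential_pool)


def test_pool_is_forwarded():
    pool = object()  # identity matters here, contents do not
    result = SimpleNamespace(model="some-model", credential_pool=pool)
    agent = MagicMock()

    apply_model_switch_result(agent, result)

    # Fails if the handler calls switch_model() without the pool.
    agent.switch_model.assert_called_once_with("some-model", credential_pool=pool)


test_pool_is_forwarded()
```

A check like this would catch a handler that regresses to calling agent.switch_model() without the pool, without needing a live provider or a real 429.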