Summary
When the primary model fails and Hermes activates fallback_model, the runtime correctly switches the live agent to the fallback model/provider, but the CLI/session indicator continues to show the originally configured primary model.
This is misleading during quota exhaustion or provider failures, because the user is no longer talking to the model shown in the UI.
Repro
- Configure a primary model and a fallback model.
- Exhaust or rate-limit the primary provider.
- Start a Hermes CLI session.
- Trigger a request that causes fallback activation.
Example runtime output:
Rate limited — switching to fallback provider...
Primary model failed — switching to fallback: qwen/qwen3.6-plus via openrouter
Actual
The CLI/session indicator still shows the configured primary model (for example GLM-5.1), even though the response is being generated by the fallback model.
Expected
Once fallback is activated for the turn/session, the visible "current model" indicator should reflect the actually active runtime model/provider.
At minimum the UI should distinguish:
- configured primary model
- currently active runtime model
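One way the two values could be shown side by side is sketched below. This is only an illustration: `format_model_indicator` and its parameters are hypothetical names and do not come from the Hermes codebase.

```python
from typing import Optional


def format_model_indicator(
    configured_model: str,
    active_model: str,
    active_provider: Optional[str] = None,
) -> str:
    """Build a status label that separates the configured and active model.

    Hypothetical helper: the name and signature are assumptions, not Hermes APIs.
    """
    # No fallback in effect: show the configured model unchanged.
    if active_model == configured_model:
        return configured_model
    # Fallback in effect: lead with what is actually answering.
    suffix = f" via {active_provider}" if active_provider else ""
    return f"{active_model}{suffix} (fallback; configured: {configured_model})"


print(format_model_indicator("GLM-5.1", "GLM-5.1"))
print(format_model_indicator("GLM-5.1", "qwen/qwen3.6-plus", "openrouter"))
```

Leading with the active model keeps the most important fact (who is answering) visible even if the status bar truncates the label.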
Evidence
The runtime fallback path mutates the live agent state in place:
run_agent.py _try_activate_fallback() at lines around 4804+
- sets self.model = fb_model
- sets self.provider = fb_provider
- emits fallback status at lines around 4913+
The CLI still renders from its own self.model:
- cli.py status bar uses CLI-owned self.model around 3040+
- cli.py Current: output uses CLI-owned self.model around 3961+
- after run_conversation() returns, the CLI updates history and response but does not sync self.model / self.provider back from self.agent around 6694+
Likely Fix
After run_conversation() returns, sync CLI-visible model/provider from self.agent.model and self.agent.provider, or expose an explicit active_model / active_provider field from the agent result and render that in the status/session UI.
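A minimal sketch of the first option follows, using stand-in classes rather than the real code. `StubAgent`, `StubCLI`, `chat`, `_sync_active_model`, `configured_model`, and the `"primary-provider"` value are all hypothetical; only `self.model`, `self.provider`, `self.agent`, and `run_conversation()` are taken from the report above.

```python
class StubAgent:
    """Stand-in for the real agent; models only the attributes named in this issue."""

    def __init__(self, model, provider, fb_model, fb_provider):
        self.model = model
        self.provider = provider
        self._fallback = (fb_model, fb_provider)

    def run_conversation(self, message):
        # Simulate a primary failure: mutate the live agent state in place,
        # as the report says _try_activate_fallback() does.
        self.model, self.provider = self._fallback
        return {"response": f"echo: {message}"}


class StubCLI:
    """Stand-in for the CLI, which keeps its own copy of model/provider."""

    def __init__(self, agent):
        self.agent = agent
        self.configured_model = agent.model  # assumption: remember the primary
        self.model = agent.model
        self.provider = agent.provider

    def _sync_active_model(self):
        # Proposed fix: copy the runtime values back after each turn.
        self.model = self.agent.model
        self.provider = self.agent.provider

    def chat(self, message):
        result = self.agent.run_conversation(message)
        self._sync_active_model()
        return result


agent = StubAgent("GLM-5.1", "primary-provider", "qwen/qwen3.6-plus", "openrouter")
cli = StubCLI(agent)
cli.chat("hello")
print(cli.model, cli.provider, cli.configured_model)
```

Without the `_sync_active_model()` call, `cli.model` would stay at the configured primary after the turn, which is the behavior described under Actual.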