Bug Description
When the primary provider's OAuth token expires (e.g. openai-codex with HTTP 401), fallback_model configured in config.yaml is never used. The agent returns a fixed ~78-char error message to Discord instead of falling back to the configured backup model.
Steps to Reproduce
- Configure config.yaml with openai-codex as primary and a local model as fallback:
  model:
    provider: openai-codex
  fallback_model:
    provider: custom
    model: mlx-community/gemma-4-e4b-it-4bit
    base_url: http://localhost:8080/v1
    api_key: no-key-required
- Let the Codex OAuth token expire (or revoke it)
- Send a message via Discord gateway
- Agent responds with a fixed error string instead of using the fallback model
Expected Behavior
When primary provider auth fails, Hermes should automatically switch to fallback_model (or fallback_providers chain) and respond normally, the same way it handles rate limits mid-conversation.
Actual Behavior
The agent returns a fixed ~78-char error response and logs:
INFO agent.credential_pool: credential pool: no available entries (all exhausted or empty)
ERROR cron.scheduler: Job failed: RuntimeError: Codex token refresh failed with status 401.
Traceback:
...
hermes_cli.auth.AuthError: Codex token refresh failed with status 401.
Affected Component
- Gateway (Telegram/Discord/Slack/WhatsApp)
Messaging Platform
Root Cause Analysis
The fallback mechanism has two separate layers:
Layer 1 — run_agent.py:_try_activate_fallback() (lines 4987–5125)
Handles fallback during API call errors (rate limits, HTTP errors mid-conversation). Works correctly.
Layer 2 — credential resolution (gateway/run.py:_resolve_runtime_agent_kwargs(), cron/scheduler.py:run_job())
This runs before run_agent.py is instantiated. When resolve_runtime_provider() raises AuthError, it propagates as a RuntimeError and Layer 1 is never reached.
The root cause is in hermes_cli/runtime_provider.py lines 707–709:
# openai-codex (same pattern for nous, qwen-oauth)
except AuthError:
    if requested_provider != "auto":
        raise  # ← raises immediately when provider is explicitly set
    # only falls through when provider == "auto"
When the provider is explicitly set in config.yaml (e.g. provider: openai-codex), requested_provider is "openai-codex" — not "auto" — so the AuthError is always re-raised before fallback_model is ever consulted.
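The control flow described above can be modelled in isolation. The following is a minimal sketch, not the actual Hermes code: `AuthError` is a local stand-in for `hermes_cli.auth.AuthError`, `resolve_codex_credentials()` and the `"next-auto-candidate"` return value are hypothetical, and only the `requested_provider != "auto"` check mirrors the snippet quoted from runtime_provider.py.

```python
class AuthError(Exception):
    """Local stand-in for hermes_cli.auth.AuthError (assumption)."""


def resolve_codex_credentials():
    # Hypothetical helper: simulates an expired or revoked OAuth token.
    raise AuthError("Codex token refresh failed with status 401.")


def resolve_runtime_provider(requested_provider):
    """Simplified model of the except branch quoted above."""
    try:
        creds = resolve_codex_credentials()
        return {"provider": "openai-codex", "credentials": creds}
    except AuthError:
        if requested_provider != "auto":
            raise  # explicit provider: the error escapes immediately
    # Only reached when requested_provider == "auto".
    return {"provider": "next-auto-candidate"}  # hypothetical placeholder


def outcome(requested_provider):
    """Report which provider was resolved, or the error that escaped."""
    try:
        return resolve_runtime_provider(requested_provider)["provider"]
    except AuthError as exc:
        return f"AuthError: {exc}"
```

With an explicit provider, `outcome("openai-codex")` yields the error string, while `outcome("auto")` falls through to the next candidate. Nothing in this path looks at fallback_model, which matches the behaviour reported here.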
Where the fix should go:
gateway/run.py:292 — _resolve_runtime_agent_kwargs()
# current: AuthError propagates as RuntimeError, no fallback
try:
    runtime = resolve_runtime_provider(...)
except Exception as exc:
    raise RuntimeError(format_runtime_provider_error(exc)) from exc
cron/scheduler.py:661 — run_job()
# current: same — fallback_model is read on line 687 but never used on auth failure
try:
    runtime = resolve_runtime_provider(**runtime_kwargs)
except Exception as exc:
    raise RuntimeError(message) from exc
fallback_model = _cfg.get("fallback_providers") or _cfg.get("fallback_model")  # already here, unused on auth error
Proposed Fix
Catch AuthError specifically at both call sites and retry resolve_runtime_provider() with the fallback config before giving up. The fix is isolated to gateway/run.py and cron/scheduler.py — no changes needed in runtime_provider.py.
Key consideration: only catch AuthError (not all exceptions) to avoid silently swallowing real config errors (e.g. missing Anthropic API key should still surface clearly).
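One possible shape for the retry is sketched below, under stated assumptions. `resolve_with_fallback()` and `as_fallback_list()` are hypothetical helpers, `AuthError` is a local stand-in for `hermes_cli.auth.AuthError`, and `fake_resolve()` replaces the real `resolve_runtime_provider()`, whose keyword arguments are not shown in this report. The sketch also assumes fallback_model is a single mapping and fallback_providers is a list of such mappings.

```python
class AuthError(Exception):
    """Local stand-in for hermes_cli.auth.AuthError (assumption)."""


def as_fallback_list(cfg_value):
    """Normalize a fallback config value (None, one dict, or a list) to a list."""
    if not cfg_value:
        return []
    if isinstance(cfg_value, dict):
        return [cfg_value]
    return list(cfg_value)


def resolve_with_fallback(resolve, primary_kwargs, fallbacks):
    """Try the primary provider; on AuthError only, try each fallback in order."""
    try:
        return resolve(**primary_kwargs)
    except AuthError:
        for fallback_kwargs in fallbacks:
            try:
                return resolve(**fallback_kwargs)
            except AuthError:
                continue  # this fallback also failed auth, try the next one
        raise  # no fallback worked: surface the primary AuthError


def fake_resolve(provider, **extra):
    """Demo stub standing in for resolve_runtime_provider()."""
    if provider == "openai-codex":
        raise AuthError("Codex token refresh failed with status 401.")
    if provider == "anthropic":
        raise ValueError("missing API key")  # a real config error, not auth
    return {"provider": provider, **extra}


fallbacks = as_fallback_list(
    {"provider": "custom", "model": "mlx-community/gemma-4-e4b-it-4bit"}
)
runtime = resolve_with_fallback(fake_resolve, {"provider": "openai-codex"}, fallbacks)
print(runtime["provider"])  # prints: custom
```

Because only `AuthError` is caught, the `ValueError` raised for the misconfigured provider still propagates unchanged, which keeps real config errors visible. When every fallback also fails authentication, the original primary error is re-raised so the existing error formatting at both call sites would still apply.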
Operating System
macOS 25.4.0 (Darwin)
Python Version
3.11.15
Hermes Version
0.8.0 (2026.4.8)
Relevant Logs / Traceback
INFO gateway.run: inbound message: platform=discord user=taek chat=...
INFO agent.credential_pool: credential pool: no available entries (all exhausted or empty)
ERROR cron.scheduler: Job 'resource-monitor-home-discord' failed: RuntimeError: Codex token refresh failed with status 401.
File "hermes_cli/runtime_provider.py", line 697, in resolve_runtime_provider
creds = resolve_codex_runtime_credentials()
File "hermes_cli/auth.py", line 1471, in resolve_codex_runtime_credentials
tokens = _refresh_codex_auth_tokens(tokens, refresh_timeout_seconds)
File "hermes_cli/auth.py", line 1334, in refresh_codex_oauth_pure
raise AuthError: Codex token refresh failed with status 401.
INFO gateway.run: response ready: platform=discord time=5.3s api_calls=0 response=78 chars