Summary
Users who explicitly configure only OpenRouter :free models in
fallback_providers still get billed (or blocked by key monthly limits)
because auxiliary tasks (title_generation, compression, vision,
etc.) silently fall back to a hardcoded paid model
(google/gemini-3-flash-preview) when their primary provider fails.
Reproduce
~/.hermes/config.yaml:
model:
  default: moonshotai/kimi-k2.6
  provider: nvidia
fallback_providers:
  - provider: openrouter
    model: inclusionai/ring-2.6-1t:free  # explicitly :free
  # ...all :free models
auxiliary:
  title_generation:
    provider: auto  # default
OPENROUTER_API_KEY is set (used by fallback_providers).
What Happens
- NVIDIA primary times out
- _resolve_auto() enters Step 2 (_try_payment_fallback)
- Reaches OpenRouter, uses hardcoded _OPENROUTER_MODEL = "google/gemini-3-flash-preview" (agent/auxiliary_client.py:391), a PAID model
- Hits user's per-key monthly limit → HTTP 403 Key limit exceeded (monthly limit)
Logs:
INFO agent.auxiliary_client: Auxiliary title_generation: connection error on auto
INFO agent.auxiliary_client: Auxiliary title_generation: ... falling back to openrouter (google/gemini-3-flash-preview)
WARNING agent.title_generator: Title generation failed: Error code: 403 - {'error': {'message': 'Key limit exceeded (monthly limit)...'}}
Root Cause
The auxiliary fallback chain is independent of the user's
fallback_providers config and ignores :free model variant constraints.
The default model per provider is a hardcoded constant
(_OPENROUTER_MODEL, _NOUS_MODEL), not derived from the user's
configured models.
Expected Behavior
When the user's fallback_providers list contains only :free models
for OpenRouter (or the user has otherwise documented a free-only
intent), auxiliary tasks should:
- Use those same :free models, OR
- Refuse to fall back to OpenRouter at all (current behavior of explicit provider: nvidia is the correct shape), OR
- At minimum, log a clear warning that a paid model is being used
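The "at minimum, log a warning" case could look roughly like the sketch below. The function name warn_if_paid_fallback, the message wording, and the rule that a model is free only if its ID ends in ":free" are all assumptions made for illustration; the logger name is the one that appears in the logs above.

```python
import logging

logger = logging.getLogger("agent.auxiliary_client")


def warn_if_paid_fallback(task: str, provider: str, model: str, config: dict) -> bool:
    """Hypothetical helper: warn when a paid model is picked despite a free-only config.

    Returns True if a warning was emitted.
    """
    entries = config.get("fallback_providers") or []
    user_models = [
        e.get("model", "")
        for e in entries
        if isinstance(e, dict) and e.get("provider") == provider
    ]
    # Assumption: "free-only intent" means every configured model for this
    # provider carries the ":free" suffix.
    free_only_intent = bool(user_models) and all(m.endswith(":free") for m in user_models)
    if free_only_intent and not model.endswith(":free"):
        logger.warning(
            "Auxiliary %s: falling back to paid model %s on %s, "
            "but all configured %s fallbacks are :free",
            task, model, provider, provider,
        )
        return True
    return False
```

With the config from the Reproduce section, calling this for title_generation with google/gemini-3-flash-preview would emit the warning; calling it with a :free model would not.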
Suggested Fix
Option A (least invasive): when picking the aux fallback model for a
provider, prefer the user's fallback_providers[provider].model if any
entry exists for that provider, before falling back to the hardcoded
constant.
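A minimal sketch of Option A, assuming fallback_providers is loaded as a list of dicts with provider and model keys (as in the YAML above). The function pick_aux_fallback_model and the HARDCODED_DEFAULTS table are hypothetical names, not existing Hermes APIs; only _OPENROUTER_MODEL comes from the report.

```python
from typing import Optional

_OPENROUTER_MODEL = "google/gemini-3-flash-preview"  # constant named in the report
# Hypothetical lookup table standing in for the per-provider constants.
HARDCODED_DEFAULTS = {"openrouter": _OPENROUTER_MODEL}


def pick_aux_fallback_model(provider: str, config: dict) -> Optional[str]:
    """Prefer the user's configured model for `provider`, else the hardcoded default."""
    for entry in config.get("fallback_providers") or []:
        if (
            isinstance(entry, dict)
            and entry.get("provider") == provider
            and entry.get("model")
        ):
            # First matching entry wins, mirroring the order the user wrote.
            return entry["model"]
    return HARDCODED_DEFAULTS.get(provider)
```

Users with no fallback_providers entry for a provider would see no change, since the hardcoded constant is still the last resort.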
Option B: add a top-level auxiliary.free_only: true flag that filters
out paid defaults across all aux tasks.
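Option B could be sketched as a filter over candidate (provider, model) pairs. The auxiliary.free_only key is the flag proposed here and does not exist yet; filter_aux_candidates and is_free_model are hypothetical helpers, and treating the ":free" suffix as the only signal of a free model is an assumption that fits OpenRouter model IDs but not necessarily other providers.

```python
def is_free_model(model: str) -> bool:
    """Assumption: a model is free only if its ID ends with ':free'."""
    return model.endswith(":free")


def filter_aux_candidates(
    candidates: list[tuple[str, str]], config: dict
) -> list[tuple[str, str]]:
    """Drop paid (provider, model) candidates when auxiliary.free_only is true."""
    aux = config.get("auxiliary") or {}
    if not aux.get("free_only", False):
        return list(candidates)
    return [(provider, model) for provider, model in candidates if is_free_model(model)]
```

An empty result would mean the auxiliary task has no permitted fallback and should fail or be skipped rather than silently use a paid model.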
Option C: document this trap prominently in
auxiliary.title_generation.provider and recommend explicit
provider: <main_provider> for free-tier users.
Environment
- Hermes Agent v0.13.0 (2026.5.7)
- macOS / Python 3.11
- Primary: NVIDIA NIM (free)
- Fallback: OpenRouter (`:free` only) with $10 credit but $0/low per-key
monthly limit