Auxiliary tasks silently fall back to paid OpenRouter models, bypassing user's free-only configuration #24029

@sherman-yang

Summary

Users who explicitly configure only OpenRouter :free models in
fallback_providers still get billed (or blocked by key monthly limits)
because auxiliary tasks (title_generation, compression, vision,
etc.) silently fall back to a hardcoded paid model
(google/gemini-3-flash-preview) when their primary provider fails.

Reproduce

~/.hermes/config.yaml:

model:
  default: moonshotai/kimi-k2.6
  provider: nvidia

fallback_providers:
  - provider: openrouter
    model: inclusionai/ring-2.6-1t:free   # explicitly :free
  # ...all :free models

auxiliary:
  title_generation:
    provider: auto                          # default

OPENROUTER_API_KEY is set (used by fallback_providers).

What Happens

  1. NVIDIA primary times out
  2. _resolve_auto() enters Step 2 (_try_payment_fallback)
  3. Reaches OpenRouter and uses the hardcoded
    _OPENROUTER_MODEL = "google/gemini-3-flash-preview"
    (agent/auxiliary_client.py:391), which is a PAID model
  4. Hits user's per-key monthly limit → HTTP 403
    Key limit exceeded (monthly limit)

Logs:

INFO agent.auxiliary_client: Auxiliary title_generation: connection error on auto
INFO agent.auxiliary_client: Auxiliary title_generation: ... falling back to openrouter (google/gemini-3-flash-preview)
WARNING agent.title_generator: Title generation failed: Error code: 403 - {'error': {'message': 'Key limit exceeded (monthly limit)...'}}

Root Cause

The auxiliary fallback chain is independent of the user's
fallback_providers config and ignores :free model variant constraints.
The default model per provider is a hardcoded constant
(_OPENROUTER_MODEL, _NOUS_MODEL), not derived from the user's
configured models.

Expected Behavior

When the user's fallback_providers list contains only :free models
for OpenRouter (or the user has otherwise declared a free-only intent),
auxiliary tasks should:

  • Use those same :free models, OR
  • Refuse to fall back to OpenRouter at all (the current behavior with an
    explicit provider: nvidia setting is the correct shape), OR
  • At minimum, log a clear warning that a paid model is being used
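
The third bullet (warn when a paid model is about to be used) could look roughly like the sketch below. It assumes that the :free suffix is the only signal of a free OpenRouter model; is_free_model and warn_if_paid_fallback are hypothetical helper names, not existing functions in agent/auxiliary_client.py.

```python
import logging

# Logger name taken from the log lines quoted in this issue.
logger = logging.getLogger("agent.auxiliary_client")


def is_free_model(model: str) -> bool:
    """Return True if the model ID carries the ':free' variant suffix."""
    return model.endswith(":free")


def warn_if_paid_fallback(task: str, provider: str, model: str) -> bool:
    """Warn when an auxiliary task is about to use a non-free OpenRouter model.

    Hypothetical helper: returns True if a warning was emitted.
    """
    if provider == "openrouter" and not is_free_model(model):
        logger.warning(
            "Auxiliary %s: falling back to %s (%s), which is not a ':free' "
            "model and may incur charges",
            task, provider, model,
        )
        return True
    return False
```

Calling warn_if_paid_fallback("title_generation", "openrouter", "google/gemini-3-flash-preview") would emit the warning, while the same call with inclusionai/ring-2.6-1t:free would stay silent.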

Suggested Fix

Option A (least invasive): when picking the aux fallback model for a
provider, prefer the user's fallback_providers[provider].model if any
entry exists for that provider, before falling back to the hardcoded
constant.
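
A minimal sketch of Option A, assuming fallback_providers is loaded from config.yaml as a list of dicts with provider and model keys (as in the Reproduce section). The function name pick_aux_fallback_model and the _DEFAULT_AUX_MODELS table are hypothetical; only the _OPENROUTER_MODEL value comes from this issue.

```python
from typing import Optional

# Hardcoded default quoted in this issue (agent/auxiliary_client.py).
_OPENROUTER_MODEL = "google/gemini-3-flash-preview"

# Hypothetical per-provider default table, for illustration only.
_DEFAULT_AUX_MODELS = {"openrouter": _OPENROUTER_MODEL}


def pick_aux_fallback_model(provider: str, fallback_providers: list) -> Optional[str]:
    """Prefer the user's configured model for this provider, if any.

    Falls back to the hardcoded default only when the user has no
    fallback_providers entry for the provider.
    """
    for entry in fallback_providers:
        if entry.get("provider") == provider and entry.get("model"):
            return entry["model"]
    return _DEFAULT_AUX_MODELS.get(provider)


# Usage with the configuration from the Reproduce section:
user_fallbacks = [
    {"provider": "openrouter", "model": "inclusionai/ring-2.6-1t:free"},
]
chosen = pick_aux_fallback_model("openrouter", user_fallbacks)
```

With the configuration above, chosen is the user's :free model rather than the paid default. The first matching entry wins here; whether later entries should also be tried is a design decision this sketch leaves open.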

Option B: add a top-level auxiliary.free_only: true flag that filters
out paid defaults across all aux tasks.
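
Option B might be sketched as a filter over the candidate (provider, model) pairs that the auto resolver would otherwise try. Everything here is an assumption: the auxiliary.free_only key is the flag proposed above and does not exist today, filter_aux_candidates is a hypothetical name, and the :free suffix check is applied only to OpenRouter because other providers may mark free tiers differently.

```python
def filter_aux_candidates(candidates, config):
    """Drop paid OpenRouter candidates when auxiliary.free_only is true.

    candidates: iterable of (provider, model) tuples in fallback order.
    config: parsed config.yaml as a dict (the free_only key is hypothetical).
    """
    aux = config.get("auxiliary") or {}
    if not aux.get("free_only", False):
        # Flag absent or false: keep the existing behavior unchanged.
        return list(candidates)
    return [
        (provider, model)
        for provider, model in candidates
        if provider != "openrouter" or model.endswith(":free")
    ]


candidates = [
    ("nvidia", "moonshotai/kimi-k2.6"),
    ("openrouter", "google/gemini-3-flash-preview"),
    ("openrouter", "inclusionai/ring-2.6-1t:free"),
]
free_only_result = filter_aux_candidates(
    candidates, {"auxiliary": {"free_only": True}}
)
```

With the flag set, the paid google/gemini-3-flash-preview entry is removed and the remaining order is preserved; without it, the list is returned as-is.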

Option C: document this trap prominently in the documentation for
auxiliary.title_generation.provider and recommend an explicit
provider: <main_provider> for free-tier users.

Environment

  • Hermes Agent v0.13.0 (2026.5.7)
  • macOS / Python 3.11
  • Primary: NVIDIA NIM (free)
  • Fallback: OpenRouter (:free only) with $10 of credit but a $0 (or very
    low) per-key monthly limit

Labels

  • P2: Medium (degraded but workaround exists)
  • comp/agent: Core agent loop, run_agent.py, prompt builder
  • provider/openrouter: OpenRouter aggregator
  • type/bug: Something isn't working
