Auxiliary fallback chain should reuse fallback_providers, not maintain a separate hardcoded list #24039

@sherman-yang

Description

The Design Issue

Hermes maintains two parallel fallback systems that don't know about each other:

User's fallback_providers (config.yaml):       Hardcoded aux fallback chain:
┌────────────────────────────────┐             ┌───────────────────────────────┐
│ 1. nvidia/ring-2.6-1t:free     │             │ 1. openrouter →               │
│ 2. nvidia/deepseek-v4-pro      │  ←ignored→  │    google/gemini-3-flash-      │
│ 3. nvidia/glm-5.1              │  each other │    preview (PAID)             │
│ 4. nvidia/minimax-m2.7         │             │ 2. nous → same paid model     │
│ ...                            │             │ 3. custom                     │
└────────────────────────────────┘             │ 4. api-key providers          │
         ↑                                     └───────────────────────────────┘
   Used by main agent                                    ↑
                                                Used by every aux task
                                                (compression, vision, title,
                                                 web_extract, curator, etc.)

When the main provider fails, the main agent walks the user's
fallback_providers. But every auxiliary task (compression,
title_generation, vision, web_extract, session_search,
skills_hub, approval, mcp, triage_specifier, curator) walks a
separate, hardcoded fallback chain with hardcoded default models,
mostly paid models like google/gemini-3-flash-preview,
claude-haiku-4-5, glm-4.5-flash, etc.

This is a violation of the single-source-of-truth principle: the user
configured their fallback chain in one place, and most users assume
that's the only place the agent will look.
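The divergence described above can be modeled with a toy sketch. This is not Hermes code: the model lists are copied from the diagram, while `first_available`, `pick_models`, and the `nvidia/primary-model` name are hypothetical and exist only for illustration.

```python
from typing import Dict, List, Optional, Set

# Lists copied from the diagram above; everything else here is hypothetical.
USER_FALLBACK_MODELS: List[str] = [
    "nvidia/ring-2.6-1t:free",
    "nvidia/deepseek-v4-pro",
    "nvidia/glm-5.1",
    "nvidia/minimax-m2.7",
]
HARDCODED_AUX_MODELS: List[str] = [
    "google/gemini-3-flash-preview",  # openrouter default
    "google/gemini-3-flash-preview",  # nous default
]


def first_available(chain: List[str], unavailable: Set[str]) -> Optional[str]:
    """Return the first model in `chain` that is not marked unavailable."""
    for model in chain:
        if model not in unavailable:
            return model
    return None


def pick_models(main_model: str, unavailable: Set[str]) -> Dict[str, Optional[str]]:
    """Model the current behavior: main and aux consult different lists."""
    main_pick = first_available([main_model] + USER_FALLBACK_MODELS, unavailable)
    aux_pick = first_available([main_model] + HARDCODED_AUX_MODELS, unavailable)
    return {"main": main_pick, "aux": aux_pick}


# Hypothetical main model that is currently failing.
result = pick_models("nvidia/primary-model", unavailable={"nvidia/primary-model"})
```

With the main model down, `result["main"]` comes from the user's list while `result["aux"]` comes from the hardcoded list, which is the mismatch this issue is about.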

Why This Matters

Free-tier users who explicitly configure only :free models in
fallback_providers still get charged (or hit per-key spend limits)
because aux tasks invisibly use a paid default. The user has no way to
discover this short of reading the source code.

Even paid users are affected: a user who carefully picked a
budget-friendly fallback chain will see aux tasks silently use a more
expensive model they didn't choose.

Code Evidence

agent/auxiliary_client.py:1823-1841:

def _get_provider_chain() -> List[tuple]:
    return [
        ("openrouter", _try_openrouter),
        ("nous", _try_nous),
        ("local/custom", _try_custom_endpoint),
        ("api-key", _resolve_api_key_provider),
    ]

agent/auxiliary_client.py:391-392:

_OPENROUTER_MODEL = "google/gemini-3-flash-preview"   # paid
_NOUS_MODEL = "google/gemini-3-flash-preview"          # paid

plugins/model-providers/*/__init__.py: every provider's
default_aux_model is a paid model.

Nowhere in this fallback path does the code read the user's
fallback_providers from config.
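For illustration, reading the list could look roughly like the sketch below. The real config schema is not shown in this issue, so this assumes entries are plain `provider/model` strings as drawn in the diagram; `read_fallback_providers` is a hypothetical helper, not an existing Hermes function.

```python
from typing import Any, Dict, List, Tuple


def read_fallback_providers(config: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Parse fallback_providers into (provider, model) pairs.

    Assumption: each entry is a "provider/model" string. Entries that do
    not match that shape are skipped rather than guessed at.
    """
    pairs: List[Tuple[str, str]] = []
    for raw in config.get("fallback_providers") or []:
        # Split on the first "/" only, so model names containing "/" survive.
        provider, sep, model = str(raw).partition("/")
        if sep and provider and model:
            pairs.append((provider, model))
    return pairs


# Example input shaped like the diagram's left-hand box (assumed format).
example_config = {"fallback_providers": ["nvidia/ring-2.6-1t:free", "nvidia/glm-5.1"]}
parsed = read_fallback_providers(example_config)
```

A missing or empty `fallback_providers` key yields an empty list, which is the case where falling through to the hardcoded chain would still make sense.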

Existing Comment Acknowledges Step 1 But Not Step 2

agent/auxiliary_client.py:2451-2457:

"auto" means "use my main chat model for side tasks as well" — no
surprise switches to a cheap fallback model for side tasks.

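The comment frames Step 1 (reusing the main chat model for aux tasks)
as preventing surprise model switches. But when Step 1 fails, Step 2
(the hardcoded fallback chain) can still switch silently to a paid
model, and the comment does not address that.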

Proposed Fix

When Step 1 (main provider) fails for an aux task, walk the user's
fallback_providers list — same order, same models the user picked —
before consulting the hardcoded aux chain. The hardcoded chain
remains as a last-resort default for users with no fallback_providers
configured.

def _resolve_fallback(failed_provider, task):
    # 1. Honor the user's fallback_providers first, in configured order
    for entry in user_fallback_providers:
        if entry.provider == failed_provider:
            continue  # skip the provider that just failed
        client = try_build_client(entry.provider, entry.model)
        if client:
            return client, entry.model

    # 2. Hardcoded chain only if the user didn't configure anything
    if not user_fallback_providers:
        return _try_payment_fallback(failed_provider, task)

    # User configured fallbacks but none worked: no silent switch
    return None

This makes fallback_providers the single source of truth for the
entire agent (main + aux), and respects users who deliberately picked
free-only models.
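A self-contained version of the same idea, runnable with stubs, might look like this. It is a sketch under assumptions, not the Hermes implementation: `FallbackEntry`, `resolve_aux_fallback`, and both stub functions are hypothetical, and the client is represented as a plain `object` because the real client type is not shown here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class FallbackEntry:
    """One user-configured fallback (hypothetical shape)."""
    provider: str
    model: str


Resolved = Optional[Tuple[object, str]]


def resolve_aux_fallback(
    failed_provider: str,
    user_fallback_providers: List[FallbackEntry],
    try_build_client: Callable[[str, str], Optional[object]],
    hardcoded_fallback: Callable[[str], Resolved],
) -> Resolved:
    """Prefer the user's chain; use the hardcoded chain only if it is empty."""
    for entry in user_fallback_providers:
        if entry.provider == failed_provider:
            continue  # do not retry the provider that just failed
        client = try_build_client(entry.provider, entry.model)
        if client is not None:
            return client, entry.model
    if not user_fallback_providers:
        return hardcoded_fallback(failed_provider)
    return None  # user chain exhausted: no silent switch to another model


def fake_builder(provider: str, model: str) -> Optional[object]:
    # Stub: pretend only the "nvidia" provider is reachable.
    return object() if provider == "nvidia" else None


def fake_hardcoded(failed_provider: str) -> Resolved:
    # Stub for the existing hardcoded chain.
    return object(), "google/gemini-3-flash-preview"


chain = [
    FallbackEntry("openrouter", "some-model:free"),  # skipped: same as failed provider
    FallbackEntry("nvidia", "glm-5.1"),
]
picked = resolve_aux_fallback("openrouter", chain, fake_builder, fake_hardcoded)
default = resolve_aux_fallback("openrouter", [], fake_builder, fake_hardcoded)
```

Passing the builder and the hardcoded fallback as parameters keeps the sketch testable without network access. Whether an exhausted user chain should return `None` or surface an explicit error to the user is a design choice this issue leaves open.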

Related

See #24029 for the specific symptom (free-only users getting billed via
aux fallback). This issue addresses the underlying design.

Environment

  • Hermes Agent v0.13.0 (2026.5.7)
  • Affects all users with fallback_providers set + auxiliary.*.provider: auto

Metadata

    Labels

      • P2: Medium (degraded but workaround exists)
      • area/config: Config system, migrations, profiles
      • comp/agent: Core agent loop, run_agent.py, prompt builder
      • type/feature: New feature or request
