Feature Description
Each auxiliary task (vision, web_extract, compression, session_search, skills_hub, mcp, flush_memories) should accept an optional fallback_providers list, mirroring the existing model.fallback_providers pattern at the top level. When the primary provider:model fails, the auxiliary client would try each entry in the fallback list, in order, before giving up.
Motivation
The auxiliary client currently offers two resolution modes, neither of which combines determinism with resilience:
- Explicit provider:model: deterministic model selection, zero fallback. If the chosen provider is down or rate-limited, the task fails silently. Users who need specific models for specific tasks (e.g., vision needs a multimodal model, compression wants a fast cheap model) have no resilience path.
- auto: delegates to a hardcoded resolution chain (_get_provider_chain() in agent/auxiliary_client.py): main provider -> OpenRouter -> Nous -> custom endpoint -> API-key providers. No user control over order, model selection, or which providers participate. Codex is deliberately excluded.
Users who want both deterministic model selection per-task AND cross-provider fallback have no way to achieve it today.
Proposed Solution
Extend the auxiliary task config blocks with an optional fallback_providers key:
auxiliary:
  vision:
    provider: opencode-go
    model: kimi-k2.6
    fallback_providers:
      - provider: openrouter
        model: google/gemini-3-flash
      - provider: opencode-zen
        model: minimax-m2.7
    timeout: 120
  web_extract:
    provider: opencode-zen
    model: minimax-m2.7
    fallback_providers:
      - provider: main
        model: ""
    timeout: 360
Each entry in fallback_providers mirrors the top-level model.fallback_providers format — provider (required), model (optional, defaults to provider default), and optionally base_url / api_key for custom endpoints.
Fallback semantics:
- Each auxiliary invocation tries its configured provider:model first
- On failure (rate limits, server errors, auth failures; the same triggers as main fallback), walk the fallback_providers list in order
- If all entries are exhausted, fall through to the existing auto chain as a last resort
- Per-call scope: each auxiliary invocation resets to the primary provider
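The semantics above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: ProviderSpec, ProviderError, call_with_fallback, and the invoke / auto_chain callables are hypothetical names invented for this sketch, not functions from agent/auxiliary_client.py.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class ProviderSpec:
    """One provider/model pair (hypothetical type, not part of the codebase)."""
    provider: str
    model: str = ""  # empty string means "use the provider's default model"


class ProviderError(Exception):
    """Stand-in for rate-limit, server, and auth failures (hypothetical)."""


def call_with_fallback(
    primary: ProviderSpec,
    fallbacks: Sequence[ProviderSpec],
    invoke: Callable[[ProviderSpec], str],
    auto_chain: Optional[Callable[[], str]] = None,
) -> str:
    """Try the primary, then each fallback in order, then the auto chain."""
    last_error: Optional[ProviderError] = None
    # The candidate sequence is rebuilt on every call, so each invocation
    # starts from the primary again (per-call scope).
    for spec in (primary, *fallbacks):
        try:
            return invoke(spec)
        except ProviderError as exc:
            last_error = exc  # remember the failure and move to the next entry
    if auto_chain is not None:
        return auto_chain()  # last resort: the existing auto resolution
    raise ProviderError("all providers failed") from last_error
```

Because no state is carried between calls, a provider that failed once is tried again on the next invocation, which matches the per-call scope described above.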
Implementation Notes
The change surface is relatively contained:
- agent/auxiliary_client.py: resolve_auxiliary_client() reads per-task fallback_providers from config and passes them to a new resolver path. The existing _resolve_auto and _get_provider_chain remain untouched.
- hermes_cli/config.py: add fallback_providers as a recognized key under auxiliary.<task> with schema validation (list of provider:model objects).
- hermes_cli/commands.py: optionally extend the hermes fallback interactive manager to cover auxiliary tasks.
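The schema validation mentioned for hermes_cli/config.py might look roughly like the following. It is a sketch, not the project's actual validation code: validate_fallback_providers and ALLOWED_KEYS are hypothetical names, and the accepted keys are taken from the entry format described under Proposed Solution.

```python
from typing import Any, Dict, List

# Keys named in the proposal; anything else is treated as a typo.
ALLOWED_KEYS = {"provider", "model", "base_url", "api_key"}


def validate_fallback_providers(value: Any) -> List[Dict[str, str]]:
    """Return a normalized copy of a fallback_providers list, or raise ValueError."""
    if value is None:
        return []  # key absent: nothing to validate, behaviour is unchanged
    if not isinstance(value, list):
        raise ValueError("fallback_providers must be a list")
    normalized: List[Dict[str, str]] = []
    for index, entry in enumerate(value):
        where = f"fallback_providers[{index}]"
        if not isinstance(entry, dict):
            raise ValueError(f"{where} must be a mapping")
        unknown = sorted(str(key) for key in set(entry) - ALLOWED_KEYS)
        if unknown:
            raise ValueError(f"{where} has unknown keys: {unknown}")
        provider = entry.get("provider")
        if not isinstance(provider, str) or not provider.strip():
            raise ValueError(f"{where}.provider is required")
        model = entry.get("model")
        if model is None:
            model = ""  # optional: empty means the provider's default model
        if not isinstance(model, str):
            raise ValueError(f"{where}.model must be a string")
        item = {"provider": provider.strip(), "model": model}
        for key in ("base_url", "api_key"):
            if key in entry:
                if not isinstance(entry[key], str):
                    raise ValueError(f"{where}.{key} must be a string")
                item[key] = entry[key]
        normalized.append(item)
    return normalized
```

Rejecting unknown keys is a design choice of this sketch; a more lenient validator could warn instead of failing.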
Alternatives Considered
- Making the auto chain configurable: lets users reorder/redefine the auto providers globally. Rejected because it couples all auxiliaries to a single chain and can't express per-task differences (vision needs multimodal, compression doesn't).
- Simple fallback_model per auxiliary (single provider:model, no list): simpler but less flexible. The main model already evolved from fallback_model to fallback_providers for good reason (multiple ordered fallbacks). Repeating that evolution on auxiliaries would be wasteful.
- Status quo: explicit pinning with zero fallback. Viable for users who accept the single-provider risk but suboptimal for production use.
Backward Compatibility
- Absent fallback_providers -> current behaviour (single explicit provider, or auto chain)
- Present fallback_providers + explicit provider -> try explicit first, then fallback list
- Present fallback_providers + provider: auto -> try Step 1 (main provider), then fallback list, then Step 2 (hardcoded chain)
No existing configs break. It's a purely additive change.
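The three compatibility cases can be expressed as a small function that returns the order of stages for one auxiliary call. This is a sketch that follows the bullets in this section literally: resolution_order and the stage labels ("main", "auto-chain", "auto-chain-remainder") are hypothetical, and treating a missing provider as auto is an assumption.

```python
from typing import Any, Dict, List


def resolution_order(task_cfg: Dict[str, Any]) -> List[str]:
    """Return the ordered stages one auxiliary call would try (labels only)."""
    # Assumption: a missing or empty provider means "auto".
    provider = task_cfg.get("provider") or "auto"
    model = task_cfg.get("model") or ""
    fallbacks = [
        f"{entry['provider']}:{entry.get('model') or ''}"
        for entry in task_cfg.get("fallback_providers") or []
    ]
    if provider == "auto":
        if not fallbacks:
            return ["auto-chain"]  # key absent: unchanged behaviour
        # Step 1 (main provider), then the list, then Step 2 (hardcoded chain).
        return ["main", *fallbacks, "auto-chain-remainder"]
    # Explicit provider: tried first, followed by the list (empty if absent).
    return [f"{provider}:{model}", *fallbacks]
```

With no fallback_providers key, both branches return exactly what the current behaviour implies, which is the sense in which the change is additive.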
Filed by Aldous (AI agent on behalf of Magnus Hedemark)