feat: Vision fallback chain for auxiliary models (Gemini → configured fallback → local) by saved-j · Pull Request #25878 · NousResearch/hermes-agent

saved-j · 2026-05-14T18:25:43Z

Summary

When the primary vision provider fails (payment errors, quota exhaustion, server errors), automatically fall back through a configurable chain instead of raising immediately.

Problem

The Gemini free tier (250 req/day) is exhausted quickly under Hermes agent usage. Without a fallback chain, all vision requests fail with HTTP 429 until the quota resets. The existing `fallback_provider` and `local_fallback_provider` config keys in `auxiliary.vision` were dead keys: the code never read them.

Changes

1. `_resolve_vision_fallback_client()` (new function)

Resolves the configured fallback chain from `auxiliary.vision.{fallback,local_fallback}_{provider,model}`. Tries `fallback_provider` first, then `local_fallback_provider`.
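The resolution order described above can be sketched as follows. This is a minimal illustration, not the PR's code: the helper name `resolve_fallback_candidates` and the plain-dict config are assumptions, and the sketch only returns the ordered `(provider, model)` pairs rather than building real clients.

```python
from typing import Optional


def resolve_fallback_candidates(vision_cfg: dict) -> list[tuple[str, Optional[str]]]:
    """Return (provider, model) pairs in the order they should be tried.

    Hypothetical helper: the hosted fallback comes first, the local one second.
    """
    candidates = []
    for prefix in ("fallback", "local_fallback"):
        provider = vision_cfg.get(f"{prefix}_provider")
        if provider:  # skip keys that are missing or empty
            candidates.append((provider, vision_cfg.get(f"{prefix}_model")))
    return candidates


# Values taken from the Configuration section of this PR.
vision_cfg = {
    "provider": "gemini",
    "model": "gemini-2.5-flash",
    "fallback_provider": "Xiaomi-TP",
    "fallback_model": "mimo-v2-omni",
    "local_fallback_provider": "ollama-launch",
    "local_fallback_model": "llama3.2-vision:11b",
}

for provider, model in resolve_fallback_candidates(vision_cfg):
    print(f"would try {provider} ({model})")
```

A config with neither key set yields an empty list, so the caller would simply raise the original error.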

2. `_is_server_error()` (new function)

Detects HTTP 5xx errors (500, 502, 503, 504), which indicate provider-side failures and should trigger fallback rather than a retry.
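A check of this kind might look like the sketch below. It assumes the raised exception exposes an integer `status_code` attribute; the PR does not show how the status is actually extracted, and `FakeAPIError` is a stand-in defined only for the demonstration.

```python
# Status codes listed in the PR description.
SERVER_ERROR_CODES = frozenset({500, 502, 503, 504})


def is_server_error(exc: Exception) -> bool:
    """Return True when the exception carries one of the listed 5xx codes.

    Assumption: the status lives on a `status_code` attribute.
    """
    return getattr(exc, "status_code", None) in SERVER_ERROR_CODES


class FakeAPIError(Exception):
    """Hypothetical error type used only to exercise the check."""

    def __init__(self, status_code: int):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


print(is_server_error(FakeAPIError(503)))  # provider-side failure
print(is_server_error(FakeAPIError(429)))  # quota error, handled by a different check
```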

3. `_is_payment_error()` (extended)

Now catches `"quota exceeded"` and `"exceeded your current quota"` messages from Gemini API.
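The two new message patterns could be matched with a case-insensitive substring test, as in this sketch. The function name `is_quota_error` is hypothetical, and the sketch covers only the two strings named above, not whatever the existing `_is_payment_error()` already checks.

```python
# The two message fragments this PR adds.
QUOTA_MARKERS = ("quota exceeded", "exceeded your current quota")


def is_quota_error(exc: Exception) -> bool:
    """Return True if the error text contains one of the quota markers."""
    message = str(exc).lower()
    return any(marker in message for marker in QUOTA_MARKERS)


err = RuntimeError("429 RESOURCE_EXHAUSTED: You exceeded your current quota")
print(is_quota_error(err))
```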

4. `call_llm()` and `async_call_llm()` (modified)

Before raising, check whether `task == "vision"` and either:

- the client could not be created (`client is None`), or
- the client was created but the API call failed (`first_err is not None`).

If so, resolve and invoke the fallback chain immediately.
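The control flow above can be sketched in isolation. This is an assumption-laden illustration rather than the PR's `call_llm()`: `call_with_fallback` is a hypothetical name, clients are modeled as zero-argument callables, and the sketch falls back on any exception instead of filtering for payment, quota, or server errors.

```python
from typing import Callable, Optional, Sequence

Call = Callable[[], str]


def call_with_fallback(task: str, primary: Optional[Call], fallbacks: Sequence[Call]) -> str:
    """Try the primary call, then the fallbacks in order (vision task only)."""
    first_err: Optional[Exception] = None
    if primary is not None:
        try:
            return primary()
        except Exception as err:
            first_err = err  # remember the original failure
    # Reached when the client is missing or the primary call failed.
    if task == "vision":
        for fallback in fallbacks:
            try:
                return fallback()
            except Exception as err:
                if first_err is None:
                    first_err = err
    if first_err is not None:
        raise first_err
    raise RuntimeError(f"no client available for task {task!r}")


def exhausted() -> str:
    raise RuntimeError("429 RESOURCE_EXHAUSTED: quota exceeded")


def hosted() -> str:
    return "description from hosted fallback"


result = call_with_fallback("vision", exhausted, [hosted])
print(result)
```

If every fallback also fails, the sketch re-raises the primary error, so the caller still sees the original cause.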

Configuration

```yaml
auxiliary:
  vision:
    provider: gemini
    model: gemini-2.5-flash
    fallback_provider: Xiaomi-TP
    fallback_model: mimo-v2-omni
    local_fallback_provider: ollama-launch
    local_fallback_model: llama3.2-vision:11b
```

Testing

Verified on production:

- Gemini returned HTTP 429 (`RESOURCE_EXHAUSTED`, quota exceeded)
- Fallback resolved Xiaomi-TP (mimo-v2-omni) successfully
- Image analysis completed in ~10s via fallback
- Tested with multiple images (bicycle cargo trailer with child, bird in hand with EXIF data)

Logs

```
INFO agent.auxiliary_client: Vision fallback: using Xiaomi-TP (mimo-v2-omni)
INFO agent.auxiliary_client: Vision fallback chain: using Xiaomi-TP (mimo-v2-omni)
INFO tools.vision_tools: Image analysis completed (2158 characters)
```

When the primary vision provider (e.g. Gemini) fails with payment errors, server errors (5xx), or quota exhaustion, automatically fall back through the configured `fallback_provider` → `local_fallback_provider` chain.

Changes:

- `_resolve_vision_fallback_client()`: resolves configured fallback chain from `auxiliary.vision.{fallback,local_fallback}_{provider,model}`
- `_is_server_error()`: detects 5xx HTTP errors that should trigger fallback
- `_is_payment_error()`: extended to catch 'quota exceeded' messages
- `call_llm()` and `async_call_llm()`: invoke fallback chain before raising

Config example:

    auxiliary:
      vision:
        provider: gemini
        model: gemini-2.5-flash
        fallback_provider: Xiaomi-TP
        fallback_model: mimo-v2-omni
        local_fallback_provider: ollama-launch
        local_fallback_model: llama3.2-vision:11b

Verified: Gemini 429 → Xiaomi-TP (mimo-v2-omni) fallback chain working. Tested with image analysis: bird in hand (vivo X200 Ultra, 35mm f/1.69).

teknium1 · 2026-06-12T06:54:10Z

This has been implemented on current main by the generalized auxiliary fallback ladder. Thanks for pushing this direction — the merged version uses a slightly different documented config shape (auxiliary.<task>.fallback_chain) but covers the vision fallback behavior and extends it to other auxiliary tasks too.

Automated hermes-sweeper review evidence:

- `agent/auxiliary_client.py:2352` detects quota/payment exhaustion, including `quota exceeded`, `quota_exceeded`, and `resource exhausted`.
- `agent/auxiliary_client.py:5434` and `agent/auxiliary_client.py:5880` route sync and async auxiliary failures through fallback on payment/quota, connection, and rate-limit capacity errors.
- `agent/auxiliary_client.py:3049` implements `_try_configured_fallback_chain()`, reading `auxiliary.<task>.fallback_chain` in order.
- Commit `a57424683759617040dd82082d85128deb236de4` added the configurable auxiliary fallback chains plus the main-agent safety net.
- `website/docs/user-guide/features/fallback-providers.md:297` documents the auxiliary fallback ladder, including a vision example.

One caveat: the merged schema is `fallback_chain`, not the PR's proposed `fallback_provider` / `local_fallback_provider` keys.

alt-glitch added the labels `type/feature` (New feature or request), `P2` (Medium: degraded but workaround exists), `comp/agent` (Core agent loop, run_agent.py, prompt builder), and `tool/vision` (Vision analysis and image generation) May 14, 2026

This was referenced May 16, 2026

[Feature] Configurable fallback chains for auxiliary tasks #26882

Closed

feat(auxiliary): add configurable fallback chains for auxiliary tasks (#26882) #26998

Closed

saved-j mentioned this pull request May 17, 2026

feat: General per-task fallback chains for ALL auxiliary models (vision, compression, STT, summarization, etc.) #27298

Open

alt-glitch mentioned this pull request May 25, 2026

Feature Request: Automatic vision fallback for non-vision primary models #32160

Open

This was referenced May 26, 2026

Proposal: consolidate ~12 auxiliary fallback PRs into per-task fallback_providers #32408

Open

feat(auxiliary): add per-task fallback_providers config for transparent failover #32411

Open

teknium1 closed this Jun 12, 2026

teknium1 added the sweeper:implemented-on-main Sweeper: behavior already present on current main label Jun 12, 2026

saved-j wants to merge 1 commit into NousResearch:main from saved-j:feat/vision-fallback-chain
