Feature Description
Add configurable fallback chains for auxiliary tasks so that when the primary provider fails (quota, rate limit, connection), the system automatically tries alternative providers instead of failing silently.
Problem
Currently, auxiliary tasks like vision, tts, and compression have a single provider. When that provider fails:
- Vision: image analysis silently fails, agent loses visual context
- Compression: context compaction fails, conversation history dropped without summary
- TTS: voice output fails, user gets no audio
See #26803 for a detailed analysis of the fallback chain gap in call_llm.
Proposed Config
auxiliary:
  vision:
    provider: glm
    model: glm-4v-flash
    fallback_chain:
      - provider: openrouter
        model: google/gemini-3-flash-preview
      - provider: nous
        model: claude-sonnet-4
  compression:
    provider: openrouter
    fallback_chain:
      - provider: openai
        model: gpt-4o-mini
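To make the intended ordering concrete, here is a minimal sketch of how a task's config could be flattened into an ordered list of attempts, primary first. The `AUXILIARY` dict stands in for the parsed YAML, and `resolve_chain` is a hypothetical helper name, not an existing function in the codebase. It assumes `model` is optional (as in the `compression` primary above) and that a missing `fallback_chain` means "no fallbacks".

```python
from typing import Optional

# Hypothetical in-memory form of the proposed config, as if already parsed from YAML.
AUXILIARY = {
    "vision": {
        "provider": "glm",
        "model": "glm-4v-flash",
        "fallback_chain": [
            {"provider": "openrouter", "model": "google/gemini-3-flash-preview"},
            {"provider": "nous", "model": "claude-sonnet-4"},
        ],
    },
    "compression": {
        "provider": "openrouter",
        "fallback_chain": [{"provider": "openai", "model": "gpt-4o-mini"}],
    },
}


def resolve_chain(task_cfg: dict) -> list[tuple[str, Optional[str]]]:
    """Return (provider, model) pairs in the order they should be tried."""
    # The primary provider always comes first; model may be unset (None).
    attempts = [(task_cfg["provider"], task_cfg.get("model"))]
    # Fallback entries keep the order in which they were configured.
    for entry in task_cfg.get("fallback_chain", []):
        attempts.append((entry["provider"], entry.get("model")))
    return attempts
```

Keeping the primary and the fallbacks in one flat list means the retry loop does not need a special case for the first attempt.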
Behavior
- Try primary provider first
- On fallback-worthy errors (429 quota, 402 payment, connection timeout, auth failure), try the next entry in fallback_chain
- If all entries fail, raise the last error
- Log which provider was actually used for observability
This extends the existing fallback logic in call_llm (currently gated on is_auto) to work with explicitly configured providers.
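The behavior above could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual call_llm code: `ProviderError`, `is_fallback_worthy`, and `call_with_fallback` are hypothetical names, and mapping "auth failure" to HTTP 401/403 is an assumption. The `call` argument stands in for whatever function performs the real provider request.

```python
import logging
from typing import Any, Callable, Optional, Sequence

logger = logging.getLogger("auxiliary")


class ProviderError(Exception):
    """Hypothetical error type carrying an HTTP-like status code."""

    def __init__(self, message: str, status: Optional[int] = None):
        super().__init__(message)
        self.status = status


# Assumed mapping: 429 quota/rate limit, 402 payment, 401/403 auth failure.
FALLBACK_STATUSES = {401, 402, 403, 429}


def is_fallback_worthy(err: Exception) -> bool:
    """Decide whether an error should trigger the next entry in the chain."""
    # Connection failures and timeouts are always worth a retry elsewhere.
    if isinstance(err, (ConnectionError, TimeoutError)):
        return True
    return isinstance(err, ProviderError) and err.status in FALLBACK_STATUSES


def call_with_fallback(
    attempts: Sequence[tuple[str, Optional[str]]],
    call: Callable[[str, Optional[str]], Any],
) -> tuple[str, Any]:
    """Try each (provider, model) in order; return (provider_used, result)."""
    if not attempts:
        raise ValueError("at least one provider is required")
    last_error: Optional[Exception] = None
    for provider, model in attempts:
        try:
            result = call(provider, model)
        except Exception as err:
            # Errors outside the fallback-worthy set surface immediately.
            if not is_fallback_worthy(err):
                raise
            logger.warning("provider %s failed (%s), trying next", provider, err)
            last_error = err
            continue
        # Record which provider actually served the request.
        logger.info("auxiliary call served by %s (model=%s)", provider, model)
        return provider, result
    # Every entry failed with a fallback-worthy error: raise the last one.
    assert last_error is not None
    raise last_error
```

Returning the provider name alongside the result is one way to support the observability requirement. Errors that are not fallback-worthy (for example a malformed request) are re-raised at once, since retrying them on another provider would most likely fail the same way.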
Related