
Model fallback does not trigger on 429 rate limit errors #11972

@roelsmelt

Description


Model fallback chain does not trigger when the primary model returns a 429 (rate limit) error. The error message shows "All models failed (1)" even when multiple fallback models are configured.

Expected Behavior

When the primary model (e.g., google/gemini-3-flash-preview) returns a 429 rate limit error, OpenClaw should automatically try the next model in the fallbacks array (e.g., anthropic/claude-haiku-4-5).
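For clarity, this is the behavior I expected, sketched as a minimal fallback loop. The names (`callModel`, `RateLimitError`, `completeWithFallback`) are illustrative, not OpenClaw's actual internals:

```typescript
// Hypothetical error type: a 429 from a provider, which should be retryable.
class RateLimitError extends Error {
  constructor(public readonly model: string) {
    super(`429 rate limit for ${model}`);
  }
}

// Stub provider call: the primary model is rate-limited, the fallback succeeds.
async function callModel(model: string, prompt: string): Promise<string> {
  if (model === "google/gemini-3-flash-preview") throw new RateLimitError(model);
  return `${model}: reply to "${prompt}"`;
}

// Expected behavior: walk the whole chain, treating 429 as a signal to
// move on to the next model rather than aborting the request.
async function completeWithFallback(models: string[], prompt: string): Promise<string> {
  const failures: string[] = [];
  for (const model of models) {
    try {
      return await callModel(model, prompt);
    } catch (err) {
      if (err instanceof RateLimitError) {
        failures.push(model); // retryable: try the next model in the chain
        continue;
      }
      throw err; // non-retryable errors should surface immediately
    }
  }
  throw new Error(`All models failed (${failures.length}): ${failures.join(", ")}`);
}

completeWithFallback(
  ["google/gemini-3-flash-preview", "anthropic/claude-haiku-4-5"],
  "hello",
).then(console.log);
// → anthropic/claude-haiku-4-5: reply to "hello"
```

With this logic, the error message would only ever report the full count of configured models (e.g., "All models failed (4)"), never "(1)".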

Actual Behavior

OpenClaw attempts only the primary model, fails with 429, and returns:

Agent failed before reply: All models failed (1): google/gemini-3-flash-preview: Provider google is in cooldown (all profiles unavailable) (rate_limit).

The "(1)" indicates that only one model was attempted, not the four configured (primary plus three fallbacks).

Configuration

{
  "agents": {
    "defaults": {
      "model": {
        "primary": "google/gemini-3-flash-preview",
        "fallbacks": [
          "anthropic/claude-haiku-4-5",
          "moonshot/kimi-k2.5",
          "deepseek/deepseek-chat"  
        ]
      }
    }
  },
  "auth": {
    "profiles": {
      "google:default": { "provider": "google", "mode": "api_key" },
      "anthropic:default": { "provider": "anthropic", "mode": "api_key" },
      "moonshot:default": { "provider": "moonshot", "mode": "api_key" },
      "deepseek:default": { "provider": "deepseek", "mode": "api_key" }
    }
  }
}

All 4 providers have valid API keys configured in auth-profiles.json.

Logs

No fallback attempt is visible in the logs; only the primary model's failure appears:

[diagnostic] lane task error: lane=main durationMs=15388 error="FailoverError: LLM error: {
  "error": {
    "code": 429,
    "message": "Resource has been exhausted (e.g. check quota).",
    "status": "RESOURCE_EXHAUSTED"
  }
}"

Environment

  • OpenClaw version: 2026.2.6-3
  • Platform: Railway (Docker)
  • Node.js: 22.22.0

Workaround

Currently using a provider with higher rate limits (e.g., Anthropic or DeepSeek) as the primary model.
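Concretely, the workaround amounts to swapping the primary and first fallback in the configuration shown above (same fields, same model IDs):

```
"model": {
  "primary": "anthropic/claude-haiku-4-5",
  "fallbacks": [
    "google/gemini-3-flash-preview",
    "moonshot/kimi-k2.5",
    "deepseek/deepseek-chat"
  ]
}
```

This only avoids the bug by making the rate-limited provider less likely to be hit first; it does not fix the fallback chain itself.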

Steps to Reproduce

  1. Configure primary model with a provider that has strict rate limits (e.g., Google Gemini free tier)
  2. Configure fallback models with different providers
  3. Ensure all providers have valid API keys
  4. Send messages until primary model hits rate limit
  5. Observe that fallback models are not attempted


Labels: bug
