GitHub Issue: Model Fallback Not Triggering on Rate Limit (429)
Repository: https://github.com/punkpeye/openclaw/issues/new
Bug Report
Description
Model fallback chain does not trigger when the primary model returns a 429 (rate limit) error. The error message shows "All models failed (1)" even when multiple fallback models are configured.
Expected Behavior
When the primary model (e.g., google/gemini-3-flash-preview) returns a 429 rate limit error, OpenClaw should automatically try the next model in the fallbacks array (e.g., anthropic/claude-haiku-4-5).
Actual Behavior
OpenClaw attempts only the primary model, fails with 429, and returns:
```
Agent failed before reply: All models failed (1): google/gemini-3-flash-preview: Provider google is in cooldown (all profiles unavailable) (rate_limit).
```
The "(1)" indicates only 1 model was attempted, not the 4 configured.
Configuration
```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "google/gemini-3-flash-preview",
        "fallbacks": [
          "anthropic/claude-haiku-4-5",
          "moonshot/kimi-k2.5",
          "deepseek/deepseek-chat"
        ]
      }
    }
  },
  "auth": {
    "profiles": {
      "google:default": { "provider": "google", "mode": "api_key" },
      "anthropic:default": { "provider": "anthropic", "mode": "api_key" },
      "moonshot:default": { "provider": "moonshot", "mode": "api_key" },
      "deepseek:default": { "provider": "deepseek", "mode": "api_key" }
    }
  }
}
```

All 4 providers have valid API keys configured in auth-profiles.json.
Logs
No fallback attempt is visible in logs. Only the primary model failure:
```
[diagnostic] lane task error: lane=main durationMs=15388 error="FailoverError: LLM error: {
  "error": {
    "code": 429,
    "message": "Resource has been exhausted (e.g. check quota).",
    "status": "RESOURCE_EXHAUSTED"
  }
}"
```
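The logged payload is clearly identifiable as a rate-limit error (code 429, status RESOURCE_EXHAUSTED), so it should be treated as retryable rather than ending the run. A hypothetical classifier sketch (not OpenClaw's actual code) showing how this payload can be recognized:

```typescript
// Hypothetical helper: classify a raw provider error string as a rate limit.
// Matches either the structured Gemini error body or free-text markers.
function isRateLimit(raw: string): boolean {
  try {
    const body = JSON.parse(raw);
    return body?.error?.code === 429 || body?.error?.status === "RESOURCE_EXHAUSTED";
  } catch {
    // Not valid JSON: fall back to a text scan of the raw message.
    return /\b429\b|RESOURCE_EXHAUSTED|rate[_ ]?limit/i.test(raw);
  }
}
```

An error that classifies as a rate limit should put only the google provider into cooldown and hand off to the next configured fallback, rather than aborting the whole chain with "All models failed (1)".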
Environment
- OpenClaw version: 2026.2.6-3
- Platform: Railway (Docker)
- Node.js: 22.22.0
Workaround
Currently using a provider with higher rate limits (e.g., Anthropic or DeepSeek) as the primary model.
Steps to Reproduce
- Configure primary model with a provider that has strict rate limits (e.g., Google Gemini free tier)
- Configure fallback models with different providers
- Ensure all providers have valid API keys
- Send messages until primary model hits rate limit
- Observe that fallback models are not attempted