Description
OpenClaw: Treat HTTP 503 (and 502/504) as failover-eligible so model fallback triggers
Summary
When the primary model’s API returns 503 Service Unavailable (e.g. Google Gemini “This model is currently experiencing high demand…”), OpenClaw retries the same model and never moves to agents.defaults.model.fallbacks. The run can then stall with no reply to the user.
Model fallback is documented to apply to “auth failures, rate limits, and timeouts.” Today, 503 (and 502/504) are not treated as failover-eligible, so they fall through to message-based classification and often end up as “other” → no fallback.
Expected behavior
- 503, and preferably 502 / 504, should be treated like rate-limit/transient errors: trigger model fallback (next model in agents.defaults.model.fallbacks) instead of only retrying the same model.
- Same for error messages that clearly indicate overload (e.g. “UNAVAILABLE”, “experiencing high demand”) when the error object doesn’t carry an HTTP status.
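To make the expected behavior concrete, here is a minimal sketch of a fallback loop that advances to the next model when an error carries a failover-eligible status. Every name here (runWithFallback, hasStatus, FAILOVER_STATUSES) is hypothetical and not taken from the OpenClaw codebase, and the call is synchronous only to keep the example short.

```typescript
// Hypothetical sketch: these names are illustrative, not OpenClaw's actual API.
type ModelCall = (model: string) => string;

// Statuses this issue proposes to treat as failover-eligible, plus 408 (timeout).
const FAILOVER_STATUSES = new Set<number>([408, 502, 503, 504]);

// Type guard: does the thrown value carry a numeric HTTP status?
function hasStatus(err: unknown): err is { status: number } {
  return (
    typeof err === "object" &&
    err !== null &&
    typeof (err as { status?: unknown }).status === "number"
  );
}

// Try each model in order; move on only when the error is failover-eligible.
function runWithFallback(models: string[], call: ModelCall): string {
  let lastError: unknown;
  for (const model of models) {
    try {
      return call(model);
    } catch (err) {
      if (hasStatus(err) && FAILOVER_STATUSES.has(err.status)) {
        lastError = err; // transient/overload: advance to the next model
        continue;
      }
      throw err; // any other error is surfaced immediately
    }
  }
  throw lastError ?? new Error("no models configured");
}
```

With a primary that always throws a 503-carrying error and a healthy fallback, this loop returns the fallback's reply instead of stalling on the primary.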
Actual behavior
- Primary (e.g. google/gemini-3-flash-preview) returns 503.
- Run retries the same model; after tool results, the next completion request again returns 503.
- No switch to fallback (e.g. minimax/MiniMax-M2.5); user gets no reply.
Observed in session transcript: assistant message with "code":503,"status":"Service Unavailable" and message “This model is currently experiencing high demand. Spikes in demand are usually temporary. Please try again later.” Retries stayed on the same model.
Proposed fix
In src/agents/failover-error.ts, in the function that resolves failover reason from an error (e.g. resolveFailoverReasonFromError):
- By status code: After handling 408 (timeout), add handling for 503 (and optionally 502, 504) and return "rate_limit" (so existing fallback/cooldown behavior applies). Example (pseudocode):

      if (status === 408) return "timeout";
      // Treat server overload / temporary unavailable as rate-limit-like for fallback
      if (status === 503 || status === 502 || status === 504) return "rate_limit";
- By message (optional): In classifyFailoverReason (or equivalent message-based classifier), if the text contains 503, UNAVAILABLE, or “high demand” / “experiencing high demand”, return "rate_limit" so that when the error is passed without an HTTP status, fallback still triggers.
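The two steps above could be combined roughly as follows. This is a sketch under assumptions, not the actual contents of src/agents/failover-error.ts: the FailoverReason type, the function names, and the error shape are all hypothetical, and the message patterns are a loose reading of the strings quoted in this issue.

```typescript
// Hypothetical types and names; the real signatures in failover-error.ts may differ.
type FailoverReason = "timeout" | "rate_limit";

// Step 1: classify by HTTP status when the error carries one.
function reasonFromStatus(status: number | undefined): FailoverReason | null {
  if (status === 408) return "timeout";
  // Server overload / temporarily unavailable: treat as rate-limit-like.
  if (status === 502 || status === 503 || status === 504) return "rate_limit";
  return null;
}

// Step 2 (optional): classify by message when no status is available.
// Assumed patterns; case-insensitive so "Service Unavailable" also matches.
const OVERLOAD_PATTERNS: RegExp[] = [/\b503\b/, /unavailable/i, /high demand/i];

function reasonFromMessage(message: string): FailoverReason | null {
  return OVERLOAD_PATTERNS.some((p) => p.test(message)) ? "rate_limit" : null;
}

// Status takes precedence; the message check is only a fallback.
function resolveReason(err: { status?: number; message?: string }): FailoverReason | null {
  return reasonFromStatus(err.status) ?? reasonFromMessage(err.message ?? "");
}
```

Checking the status first keeps the message patterns from overriding an explicit code. Whether substring matching on "unavailable" is too broad for other providers is an open question for the maintainers.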
Environment
- OpenClaw version: 2026.2.17 (from npm).
- Config: agents.defaults.model.primary = google/gemini-3-flash-preview, agents.defaults.model.fallbacks = [minimax, kimi, grok-3, ollama].
- Channel: Telegram (direct).
References
- Model failover: “This applies to auth failures, rate limits, and timeouts that exhausted profile rotation (other errors do not advance fallback).”
- Session transcript showed repeated 503 on gemini-3-flash-preview with no switch to fallback.
Thank you for considering this; it would make configured fallbacks actually apply when the primary provider is overloaded (503) or temporarily down (502/504).