fix(failover): recognize model_cooldown as rate-limit for fallback #17231

Closed
thebtf wants to merge 1 commit into openclaw:main from thebtf:fix/model-cooldown-failover

Conversation

@thebtf thebtf commented Feb 15, 2026

Summary

  • CLIProxyAPI Plus returns model_cooldown: All credentials for model X are cooling down when all API keys for a model hit 429 rate limits
  • This error was not matched by any ERROR_PATTERNS entry, causing classifyFailoverReason() to return null
  • As a result, runWithModelFallback() threw immediately instead of advancing to the next candidate in the fallback chain
  • Added "model_cooldown" and "cooling down" to the rateLimit pattern list so the error triggers model fallback
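
The classification step described above can be sketched roughly as follows. This is a simplified illustration, not the OpenClaw source: the shape of `ERROR_PATTERNS`, the other pattern strings, the `timeout` category, and the lowercase normalization are all assumptions. Only the two new strings and the use of `.includes()` come from this PR.

```typescript
// Hypothetical, simplified stand-in for the real pattern table.
type FailoverReason = "rate_limit" | "timeout";

const ERROR_PATTERNS: Record<"rateLimit" | "timeout", readonly string[]> = {
  rateLimit: [
    "rate limit", // illustrative pre-existing entry (assumed)
    "too many requests", // illustrative pre-existing entry (assumed)
    "model_cooldown", // added by this PR: error type
    "cooling down", // added by this PR: message text
  ],
  timeout: ["timed out"], // illustrative (assumed)
};

// Substring match; lowercasing the message is an assumption of this sketch.
function matchesErrorPatterns(message: string, patterns: readonly string[]): boolean {
  const lower = message.toLowerCase();
  return patterns.some((pattern) => lower.includes(pattern));
}

// Returns null when no category matches, which is what blocked fallback before.
function classifyFailoverReason(message: string): FailoverReason | null {
  if (matchesErrorPatterns(message, ERROR_PATTERNS.rateLimit)) return "rate_limit";
  if (matchesErrorPatterns(message, ERROR_PATTERNS.timeout)) return "timeout";
  return null;
}
```

With the two new entries in place, a message such as `model_cooldown: All credentials for model X are cooling down` matches the `rateLimit` list on either string.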

Details

When using OpenClaw with CLIProxyAPI Plus as a provider, if all API credentials for a given model enter cooldown (due to upstream 429 rate limits), CLIProxyAPI returns an error with type model_cooldown and message text containing "cooling down".

Without this fix, coerceToFailoverError() cannot wrap this as a FailoverError, so the error is thrown directly to the user instead of triggering fallback to the next model in the chain.
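
To make the control flow concrete, here is a minimal sketch of how an unclassified error escapes the fallback loop. The signatures, the `FailoverError` fields, and the `classify` helper are hypothetical simplifications for illustration; the real `coerceToFailoverError()` and `runWithModelFallback()` may differ.

```typescript
// Hypothetical, simplified error type carrying the classified reason.
class FailoverError extends Error {
  reason: string;
  constructor(message: string, reason: string) {
    super(message);
    this.name = "FailoverError";
    this.reason = reason;
  }
}

// Hypothetical classifier reduced to the two patterns from this PR.
function classify(message: string): string | null {
  const lower = message.toLowerCase();
  return lower.includes("model_cooldown") || lower.includes("cooling down")
    ? "rate_limit"
    : null;
}

// Wraps recognized errors; returns null for anything unclassified.
function coerceToFailoverError(err: unknown): FailoverError | null {
  const message = err instanceof Error ? err.message : String(err);
  const reason = classify(message);
  return reason ? new FailoverError(message, reason) : null;
}

// Tries each candidate in order; only recognized errors advance the chain.
async function runWithModelFallback<T>(
  candidates: readonly string[],
  run: (model: string) => Promise<T>,
): Promise<T> {
  let lastError: unknown = new Error("no candidates configured");
  for (const model of candidates) {
    try {
      return await run(model);
    } catch (err) {
      const failover = coerceToFailoverError(err);
      if (!failover) throw err; // unrecognized: surfaces to the user immediately
      lastError = failover; // recognized: move on to the next candidate
    }
  }
  throw lastError;
}
```

In this sketch, removing the two cooldown strings from `classify` makes `coerceToFailoverError()` return `null`, so the first candidate's error is rethrown and the second candidate is never tried.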

Validation

  • pnpm build — passes
  • pnpm check — passes
  • pnpm test — passes (no existing tests cover this specific pattern; the change adds string literals to an existing array)
  • Tested manually with CLIProxyAPI Plus: confirmed fallback triggers on model_cooldown errors

Contribution checklist

  • Focused scope: 2-line addition of string patterns to an existing array — single concern
  • What + why: described above
  • AI-assisted: Yes, Claude Code was used to identify the root cause and draft the fix. The change was reviewed and tested manually. Testing level: lightly tested (manual verification with live CLIProxyAPI Plus instance)

Test plan

  • Configure a model with a fallback chain (e.g., gemini-3-pro-preview → minimax-m2.5-free)
  • Trigger model_cooldown error from CLIProxyAPI (rate-limit all credentials for primary model)
  • Verify agent falls back to next model instead of showing error to user
  • Verify classifyFailoverReason("model_cooldown: All credentials...") returns "rate_limit"
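
The last check in the plan could also be captured as a small table-driven regression test. The sketch below is a suggestion, not part of this PR: the `Classifier` type, the case table, and the stand-in classifier are hypothetical, and in the repository the real `classifyFailoverReason` would be passed in instead of the stand-in.

```typescript
// Hypothetical signature for the function under test.
type Classifier = (message: string) => string | null;

interface Case {
  message: string;
  expected: string | null;
}

// Positive cases cover the error type and the message text separately;
// the last case guards against an obvious false positive.
const cases: Case[] = [
  { message: "model_cooldown: All credentials for model X are cooling down", expected: "rate_limit" },
  { message: "All credentials for model X are cooling down", expected: "rate_limit" },
  { message: "model_cooldown", expected: "rate_limit" },
  { message: "invalid request body", expected: null },
];

// Returns one description per mismatching case; an empty array means success.
function findFailures(classify: Classifier, table: readonly Case[]): string[] {
  return table
    .filter((c) => classify(c.message) !== c.expected)
    .map((c) => `expected ${String(c.expected)} for "${c.message}"`);
}

// Stand-in so the sketch runs on its own (assumption, not the real classifier).
const standIn: Classifier = (message) => {
  const lower = message.toLowerCase();
  return lower.includes("model_cooldown") || lower.includes("cooling down")
    ? "rate_limit"
    : null;
};

const failures = findFailures(standIn, cases);
if (failures.length > 0) {
  throw new Error(failures.join("; "));
}
```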

Greptile Summary

Adds "model_cooldown" and "cooling down" to the rateLimit error patterns in ERROR_PATTERNS, so that CLIProxyAPI Plus cooldown errors (returned when all API keys for a model hit upstream 429 rate limits) are correctly classified as rate-limit errors by classifyFailoverReason(). Without this fix, these errors were unrecognized, causing coerceToFailoverError() to return null and runWithModelFallback() to throw immediately instead of advancing to the next model in the fallback chain.

  • The change is a 2-line addition of string literals to an existing array — minimal scope, no structural changes.
  • The new patterns are specific enough to avoid false positives with other error categories (auth, billing, timeout, etc.).
  • No tests were added; the PR author notes this is consistent with the existing pattern — other string entries in ERROR_PATTERNS also lack individual test coverage.

Confidence Score: 5/5

  • This PR is safe to merge — it adds two string literals to an existing pattern list with no risk of behavioral regression.
  • The change is a 2-line addition of string patterns to an existing array. The matching logic (matchesErrorPatterns using .includes()) is well-established and used by all other entries. The new strings are specific to CLIProxyAPI Plus error responses and won't false-positive against other error categories. The failover pipeline (classifyFailoverReason → coerceToFailoverError → runWithModelFallback) is thoroughly understood and the fix correctly slots into the existing architecture.
  • No files require special attention.

Last reviewed commit: 0cac9be

thebtf commented Feb 16, 2026

Thanks for the review! Updated the PR description to include the contribution checklist: local validation results, scope confirmation, and AI-assistance transparency. Let me know if anything else is needed.

thebtf force-pushed the fix/model-cooldown-failover branch 2 times, most recently from e905f75 to 8b1040d (February 16, 2026 17:49)
openclaw-barnacle bot added the docs, channel: telegram, commands, and size: XL labels and removed the size: XS label (Feb 16, 2026)
thebtf force-pushed the fix/model-cooldown-failover branch from 632087c to 0cac9be (February 16, 2026 18:41)
openclaw-barnacle bot added the size: XS label and removed the docs, channel: telegram, commands, and size: XL labels (Feb 16, 2026)
steipete closed this (Feb 16, 2026)
steipete reopened this (Feb 17, 2026)
thebtf force-pushed the fix/model-cooldown-failover branch 2 times, most recently from 7fc8f4c to dd06017 (February 17, 2026 20:56)
thebtf force-pushed the fix/model-cooldown-failover branch from dd06017 to 530fc5e (February 20, 2026 02:00)
CLIProxyAPI Plus returns `model_cooldown: All credentials for model X
are cooling down` when all API keys for a model hit 429 rate limits.

This error was not matched by any `ERROR_PATTERNS` entry, so
`classifyFailoverReason()` returned `null` and `coerceToFailoverError()`
could not wrap it as a `FailoverError`. As a result,
`runWithModelFallback()` threw immediately instead of advancing to the
next candidate in the fallback chain.

Add `"model_cooldown"` (error type) and `"cooling down"` (message text)
to the `rateLimit` pattern list so the error is correctly classified as
`rate_limit` and triggers model fallback.
thebtf force-pushed the fix/model-cooldown-failover branch from 530fc5e to 8ae0593 (February 21, 2026 01:21)
thebtf closed this (Feb 23, 2026)
thebtf deleted the fix/model-cooldown-failover branch (February 23, 2026 00:47)
Labels

agents (Agent runtime and tooling), size: XS

2 participants