fix(auxiliary): detect quota keywords in _is_payment_error and allow fallback for explicit providers #26809
Closed
kagura-agent wants to merge 1 commit into
…fallback for explicit providers
- Add quota-related keywords ('quota', 'too many tokens', 'daily limit',
'tokens per day') to _is_payment_error() so 429 responses from providers
with daily token quotas are recognized as payment/exhaustion errors.
- Remove the is_auto gate on fallback in both call_llm() and
async_call_llm(). Previously, explicitly configured providers were
excluded from the fallback chain even on payment/connection/rate-limit
errors where the provider clearly cannot serve. Since should_fallback
already only fires for these capacity errors, the gate was overly
restrictive.
- Add tests for new quota keywords and for explicit-provider fallback.
Fixes NousResearch#26803
Signed-off-by: kagura-agent <kagura.agent.ai@gmail.com>
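The quota-keyword check described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the helper name `looks_like_quota_error`, the `QUOTA_KEYWORDS` tuple, and the explicit 429 status check are assumptions, and the real `_is_payment_error()` may inspect other fields or match additional keywords.

```python
# Hypothetical sketch of quota-keyword detection. The four keywords are the
# ones listed in this PR; the function name and signature are assumptions.
QUOTA_KEYWORDS = ("quota", "too many tokens", "daily limit", "tokens per day")


def looks_like_quota_error(status_code: int, message: str) -> bool:
    """Return True when a 429 response reads like quota exhaustion."""
    if status_code != 429:
        return False
    # Match case-insensitively, since providers capitalize messages differently.
    text = message.lower()
    return any(keyword in text for keyword in QUOTA_KEYWORDS)
```

With this sketch, a 429 carrying "Too many tokens per day" would be classified as exhaustion, while a 429 with an unrelated message would not.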
Superseded by #27625 (merged). @Bartok9's #26811 had a slightly more thorough quota-keyword set and kept transient rate-limit fallback gated on |
Problem

Auxiliary call_llm fallback doesn't trigger on provider rate limits (429 daily quota). Two root causes:

- _is_payment_error() doesn't recognize quota-related keywords in 429 responses. Providers like OpenRouter return 429 with messages like "Too many tokens per day" or "quota exceeded", but these weren't matched.
- The fallback chain is gated on is_auto: explicitly configured providers are excluded from fallback even on payment/connection/rate-limit errors where the provider clearly cannot serve the request.

Fix

- Add quota keywords to _is_payment_error(): "quota", "too many tokens", "daily limit", "tokens per day".
- Remove the is_auto gate on the should_fallback condition in both call_llm() and async_call_llm(). Since should_fallback already only fires for payment/connection/rate-limit errors (all indicating "this provider can't serve right now"), the auto-only restriction was overly conservative.

Tests

- TestIsPaymentError for quota keyword detection
- TestCallLlmPaymentFallback verifying explicit providers get fallback on quota errors

All 164 tests pass.
Fixes #26803
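To illustrate the effect of removing the is_auto gate, here is a small self-contained sketch. It does not reproduce the actual call_llm() implementation: `ProviderUnavailable`, `call_with_fallback`, and the `gate_on_auto` flag are hypothetical names used only to contrast the old and new behavior.

```python
from typing import Callable, Sequence


class ProviderUnavailable(Exception):
    """Stand-in for a payment, connection, or rate-limit failure."""


def call_with_fallback(
    primary: Callable[[], str],
    fallbacks: Sequence[Callable[[], str]],
    *,
    is_auto: bool,
    gate_on_auto: bool,
) -> str:
    """Try the primary provider, then each fallback in order."""
    try:
        return primary()
    except ProviderUnavailable as err:
        # Old behavior (gate_on_auto=True): explicit providers never fall back.
        if gate_on_auto and not is_auto:
            raise
        last_error = err
    # New behavior: any capacity error lets the fallback chain run.
    for provider in fallbacks:
        try:
            return provider()
        except ProviderUnavailable as err:
            last_error = err
    raise last_error


def exhausted() -> str:
    raise ProviderUnavailable("429: quota exceeded")


def backup() -> str:
    return "answer from backup"
```

Calling `call_with_fallback(exhausted, [backup], is_auto=False, gate_on_auto=False)` returns the backup's answer, whereas the same call with `gate_on_auto=True` re-raises the original error, which mirrors the problem reported for explicitly configured providers.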