fix(auxiliary): detect quota keywords in _is_payment_error and allow fallback for explicit providers by kagura-agent · Pull Request #26809 · NousResearch/hermes-agent

kagura-agent · 2026-05-16T07:30:40Z

Problem

Auxiliary call_llm fallback doesn't trigger on provider rate limits (429 daily quota).

Two root causes:

_is_payment_error() doesn't recognize quota-related keywords in 429 responses. Providers like OpenRouter return 429 with messages like "Too many tokens per day" or "quota exceeded", but these weren't matched.
The fallback chain is gated on is_auto — explicitly configured providers are excluded from fallback even on payment/connection/rate-limit errors where the provider clearly cannot serve the request.

Fix

Add quota keywords to _is_payment_error(): "quota", "too many tokens", "daily limit", "tokens per day".
Remove the is_auto gate on the should_fallback condition in both call_llm() and async_call_llm(). Since should_fallback already only fires for payment/connection/rate-limit errors (all indicating "this provider can't serve right now"), the auto-only restriction was overly conservative.
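As a rough illustration of the keyword matching described above, here is a minimal sketch. It is not the actual `_is_payment_error()` implementation from hermes-agent: the function name `looks_like_quota_error`, its signature, and the restriction to status 429 are assumptions made for this example. Only the four keywords come from this PR.

```python
# Hypothetical sketch, not the real hermes-agent code.
# The keyword list is taken from the PR description; everything else is assumed.
QUOTA_KEYWORDS = ("quota", "too many tokens", "daily limit", "tokens per day")


def looks_like_quota_error(status_code: int, message: str) -> bool:
    """Return True when a 429 response message mentions quota exhaustion."""
    if status_code != 429:
        return False
    lowered = message.lower()  # match case-insensitively
    return any(keyword in lowered for keyword in QUOTA_KEYWORDS)
```

With this sketch, a 429 carrying "Too many tokens per day" is classified as quota exhaustion, while a 429 that only says "retry after 2s" is not.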

Tests

4 new tests in TestIsPaymentError for quota keyword detection
1 new test in TestCallLlmPaymentFallback verifying explicit providers get fallback on quota errors

All 164 tests pass.

Fixes #26803

…fallback for explicit providers

- Add quota-related keywords ('quota', 'too many tokens', 'daily limit', 'tokens per day') to _is_payment_error() so 429 responses from providers with daily token quotas are recognized as payment/exhaustion errors.
- Remove the is_auto gate on fallback in both call_llm() and async_call_llm(). Previously, explicitly configured providers were excluded from the fallback chain even on payment/connection/rate-limit errors where the provider clearly cannot serve. Since should_fallback already only fires for these capacity errors, the gate was overly restrictive.
- Add tests for new quota keywords and for explicit-provider fallback.

Fixes NousResearch#26803

Signed-off-by: kagura-agent <kagura.agent.ai@gmail.com>

kagura-agent · 2026-05-17T22:11:49Z

Closing in favor of #27625 — better approach that relaxes the explicit-provider gate only for capacity errors (payment/quota/connection) instead of removing it entirely. Thanks for the salvage @teknium1!

teknium1 · 2026-05-18T00:16:11Z

Superseded by #27625 (merged). @Bartok9's #26811 had a slightly more thorough quota-keyword set and kept transient rate-limit fallback gated on is_auto (correct — a 429 retry-after is a request constraint, not a capacity problem), so we salvaged that version. Same underlying fix though. Thanks for working on this!
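The distinction drawn in the two comments above (capacity errors fall back for every provider, transient rate limits fall back only in auto mode) could be sketched as follows. This is an assumption-laden illustration, not the code merged in #27625: the `ErrorKind` enum, the `CAPACITY_ERRORS` set, and this `should_fallback` signature are hypothetical names chosen for the example.

```python
from enum import Enum


class ErrorKind(Enum):
    """Hypothetical error classification, invented for this sketch."""
    PAYMENT = "payment"
    QUOTA = "quota"
    CONNECTION = "connection"
    TRANSIENT_RATE_LIMIT = "transient_rate_limit"
    OTHER = "other"


# Errors meaning "this provider cannot serve right now", per the comments above.
CAPACITY_ERRORS = {ErrorKind.PAYMENT, ErrorKind.QUOTA, ErrorKind.CONNECTION}


def should_fallback(kind: ErrorKind, is_auto: bool) -> bool:
    """Decide whether to try the next provider in the fallback chain."""
    if kind in CAPACITY_ERRORS:
        # Capacity problems trigger fallback even for explicitly configured providers.
        return True
    if kind is ErrorKind.TRANSIENT_RATE_LIMIT:
        # A retry-after 429 is a request constraint, so it stays gated on auto mode.
        return is_auto
    # Assumed here: any other error never triggers fallback.
    return False
```

Under these assumptions, an explicitly configured provider (`is_auto=False`) falls back on a quota error but not on a transient rate limit.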

alt-glitch added labels May 16, 2026: type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), comp/agent (Core agent loop, run_agent.py, prompt builder)

alt-glitch mentioned this pull request May 16, 2026

fix(auxiliary): detect quota exhaustion as payment error; allow capacity-error fallback for explicit providers #26811

Closed

teknium1 mentioned this pull request May 17, 2026

feat(auxiliary): layered fallback (chain → main agent) + capacity-error gate fix #27625

Merged

kagura-agent closed this May 17, 2026

kagura-agent wants to merge 1 commit into NousResearch:main from kagura-agent:fix/aux-call-llm-quota-fallback
