fix(aux): trigger fallback on 429 rate-limit errors (salvage #13579) by teknium1 · Pull Request #20294 · NousResearch/hermes-agent

teknium1 · 2026-05-05T15:56:43Z

Salvages @zeejaytan's PR #13579 onto current main. Conflicts with main's newer Nous-auth-refresh and credential-refresh retry blocks are resolved, with both blocks preserved.

What it does

Auxiliary calls that 429 with non-billing rate-limit text previously exhausted all retries against the same endpoint instead of falling back. _is_payment_error only matched billing-keyword 429s, so Nous's 'Hold up for a bit' and similar generic rate-limit messages fell through. Adds _is_rate_limit_error and includes it in should_fallback on both the sync and async paths.
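The difference between retrying the same endpoint and falling back can be sketched as below. This is a minimal illustration, not the code in agent/auxiliary_client.py: `FakeProviderError`, `call_with_fallback`, the `rate_limit_falls_back` switch, the keyword tuple, and the un-prefixed helper names are all hypothetical stand-ins, and the real helpers may classify errors differently.

```python
from typing import Callable, List, Optional, Sequence


class FakeProviderError(Exception):
    """Hypothetical stand-in for an SDK error that carries an HTTP status."""

    def __init__(self, status_code: int, message: str) -> None:
        super().__init__(message)
        self.status_code = status_code


# Assumed keyword list; the PR only names "credits" and "insufficient funds".
BILLING_KEYWORDS = ("credits", "insufficient funds")


def is_payment_error(exc: Exception) -> bool:
    """Billing-style 429: status 429 and a billing keyword in the message."""
    text = str(exc).lower()
    return getattr(exc, "status_code", None) == 429 and any(
        keyword in text for keyword in BILLING_KEYWORDS
    )


def is_rate_limit_error(exc: Exception) -> bool:
    """Simplified rate-limit check: any 429 that is not a billing error."""
    return getattr(exc, "status_code", None) == 429 and not is_payment_error(exc)


def call_with_fallback(
    providers: Sequence[Callable[[], str]],
    max_retries: int = 3,
    rate_limit_falls_back: bool = True,
    attempts: Optional[List[int]] = None,
) -> str:
    """Try providers in order, recording the provider index of every attempt."""
    log = attempts if attempts is not None else []
    last_exc: Optional[Exception] = None
    for index, provider in enumerate(providers):
        should_fallback = False
        for _ in range(max_retries):
            log.append(index)
            try:
                return provider()
            except Exception as exc:
                last_exc = exc
                should_fallback = is_payment_error(exc) or (
                    rate_limit_falls_back and is_rate_limit_error(exc)
                )
                if should_fallback:
                    break  # stop retrying this provider, move to the next one
        if not should_fallback:
            break  # retries exhausted without a fallback-worthy error
    raise RuntimeError("auxiliary call failed on every attempt") from last_exc


def rate_limited() -> str:
    raise FakeProviderError(
        429, "Hold up for a bit, you've exceeded the rate limit on your API key"
    )


def healthy() -> str:
    return "summary"


new_log: List[int] = []
print(call_with_fallback([rate_limited, healthy], attempts=new_log), new_log)
```

With `rate_limit_falls_back=False` the same call makes three attempts against provider 0 and then raises, which mirrors the pre-fix behaviour described above.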

Changes

agent/auxiliary_client.py — new _is_rate_limit_error helper; should_fallback in both call_llm and async_call_llm now also ORs in the rate-limit check; the max_tokens retry path checks it as well.
tests/agent/test_auxiliary_client.py — new coverage for 429 detection.
scripts/release.py — AUTHOR_MAP entry for zeejaytan.

Validation

tests/agent/test_auxiliary_client.py — 134 passed locally.

Closes #13579 via salvage.

When a provider returns a 429 rate-limit error (not billing-related), the auxiliary client's call_llm/async_call_llm previously did NOT trigger the fallback chain. This caused auxiliary tasks like session_search to exhaust all 3 retries against the same rate-limited endpoint, losing session metadata that depended on the summarization completing.

Root cause: `_is_payment_error()` only matched 429s containing billing keywords ("credits", "insufficient funds", etc.). Provider-specific rate-limit messages like Nous's "Hold up for a bit, you've exceeded the rate limit on your API key" didn't match, so `_is_payment_error` returned False, `_is_connection_error` returned False, and `should_fallback` was False. As a result, all retries hit the same rate-limited provider.

Fix:
- New `_is_rate_limit_error()` function that detects 429 + rate-limit keywords, generic 429 without billing keywords, and OpenAI SDK `RateLimitError` class instances (which may omit .status_code).
- Updated `should_fallback` in both `call_llm` and `async_call_llm` to include `_is_rate_limit_error`.
- Updated the max_tokens retry path to also check for rate-limit errors.
- Updated the reason string to include "rate limit".

This complements the Nous rate guard (PR #10568), which prevents new calls to Nous when already rate-limited; this fix handles the case where a request is already in flight when the 429 arrives.

Related: #8023, #12554, #11034

Co-authored-by: Zeejay <zjtan1@gmail.com>
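The three detection cases listed in the commit message could look roughly like the sketch below. It is an assumption-laden illustration rather than the merged helper: `is_rate_limit_error_sketch`, `StatusError`, and both keyword tuples are hypothetical, and the class check matches on the name `RateLimitError` so the example runs without the OpenAI SDK installed.

```python
from typing import Optional

# Assumed keyword lists; the actual lists in auxiliary_client.py may differ.
BILLING_KEYWORDS = ("credits", "insufficient funds")
RATE_LIMIT_KEYWORDS = ("rate limit", "too many requests", "hold up for a bit")


class StatusError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status_code: int, message: str = "") -> None:
        super().__init__(message)
        self.status_code = status_code


class RateLimitError(Exception):
    """Local stand-in for an SDK RateLimitError that omits .status_code."""


def _status_code(exc: Exception) -> Optional[int]:
    """Return the integer status code if the exception exposes one."""
    code = getattr(exc, "status_code", None)
    return code if isinstance(code, int) else None


def is_rate_limit_error_sketch(exc: Exception) -> bool:
    # Case 3: a RateLimitError class anywhere in the exception's hierarchy,
    # even when no status code is attached.
    if any(cls.__name__ == "RateLimitError" for cls in type(exc).__mro__):
        return True
    if _status_code(exc) != 429:
        return False
    text = str(exc).lower()
    # Case 1: 429 with explicit rate-limit wording.
    if any(keyword in text for keyword in RATE_LIMIT_KEYWORDS):
        return True
    # Case 2: generic 429 with no billing wording (billing stays with the
    # payment-error check).
    return not any(keyword in text for keyword in BILLING_KEYWORDS)


nous_429 = StatusError(
    429, "Hold up for a bit, you've exceeded the rate limit on your API key"
)
billing_429 = StatusError(429, "Insufficient funds on this account")
print(is_rate_limit_error_sketch(nous_429))     # True
print(is_rate_limit_error_sketch(billing_429))  # False
```

Matching on the class name is a design shortcut for this sketch; code that already imports the SDK would more naturally use `isinstance` against the SDK's own class.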

alt-glitch added labels type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), and comp/agent (Core agent loop, run_agent.py, prompt builder) May 5, 2026

zeejaytan and others added 2 commits May 5, 2026 10:15

chore: AUTHOR_MAP entry for zeejaytan

76bbb84

teknium1 force-pushed the salvage/pr-13579 branch from 7ee2f49 to 76bbb84 Compare May 5, 2026 17:15

teknium1 merged commit dbe9b15 into main May 5, 2026
9 of 10 checks passed

teknium1 deleted the salvage/pr-13579 branch May 5, 2026 17:16

ddoKx mentioned this pull request May 5, 2026

[Bug] Interactive CLI session does not auto-fallback on Codex 429 'usage_limit_reached', while cron jobs with the same fallback chain do #20465

Closed

alt-glitch mentioned this pull request May 16, 2026

Auxiliary call_llm fallback doesn't trigger on provider rate limits (429 daily quota) #26803

Closed
