fix(error_classifier): classify 'overloaded' message as FailoverReason.overloaded by ms-alan · Pull Request #14055 · NousResearch/hermes-agent

ms-alan · 2026-04-22T16:06:21Z

Summary

When a provider (e.g. Z.AI) returns a 'temporarily overloaded' error (HTTP 200 with code 1305, or HTTP 400), it was being classified as rate_limit with should_rotate_credential=True. After 2 failures, the single API key was marked exhausted, causing all further retries to fail immediately.

The fix adds an 'overloaded' / 'temporarily overloaded' pattern check before the rate_limit check in both _classify_400 and _classify_by_message. Overloaded errors now get FailoverReason.overloaded (retryable, should_fallback) instead of rate_limit, preventing unnecessary credential rotation.
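The ordering described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Classification` type, the `_OVERLOADED_PATTERNS` tuple, the standalone `classify_by_message` function, and every rate-limit pattern other than 'servicequotaexceededexception' are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class FailoverReason(Enum):
    overloaded = "overloaded"
    rate_limit = "rate_limit"
    unknown = "unknown"


@dataclass(frozen=True)
class Classification:  # hypothetical result type for this sketch
    reason: FailoverReason
    retryable: bool
    should_fallback: bool
    should_rotate_credential: bool


# Hypothetical pattern lists. Only 'servicequotaexceededexception' is named in
# this PR; the other entries are stand-ins for a broad rate-limit pattern.
_OVERLOADED_PATTERNS = ("temporarily overloaded", "overloaded")
_RATE_LIMIT_PATTERNS = ("servicequotaexceededexception", "rate limit", "try again later")


def classify_by_message(message: str) -> Classification:
    text = message.lower()
    # Overload is checked first so a broader rate-limit pattern cannot claim it.
    if any(p in text for p in _OVERLOADED_PATTERNS):
        return Classification(FailoverReason.overloaded, True, True, False)
    if any(p in text for p in _RATE_LIMIT_PATTERNS):
        return Classification(FailoverReason.rate_limit, True, True, True)
    return Classification(FailoverReason.unknown, False, False, False)
```

With this ordering, the Z.AI message quoted in the commit ('The service may be temporarily overloaded, please try again later') resolves to the overloaded reason and leaves the credential-rotation flag unset, even though it would also match the stand-in 'try again later' rate-limit pattern.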

Changes

error_classifier: Added overloaded pattern check before rate_limit in _classify_400 (~line 594) and _classify_by_message (~line 736)

Root cause

_RATE_LIMIT_PATTERNS contains 'servicequotaexceededexception', but 'overloaded' error messages from providers like Z.AI were matching as generic rate limits. The should_rotate_credential=True flag caused the credential pool to mark the API key as exhausted after just 2 transient errors.
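To see why the rotation flag matters with a single key, here is a toy model of the exhaustion behaviour. The `CredentialPool` class and `surviving_keys` helper are hypothetical and do not mirror the project's credential pool; only the threshold of 2 failures comes from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class CredentialPool:  # hypothetical stand-in, not the project's class
    keys: list[str]
    max_failures: int = 2  # threshold taken from the PR description
    failures: dict[str, int] = field(default_factory=dict)

    def report_failure(self, key: str, should_rotate_credential: bool) -> None:
        # Only errors flagged for rotation count against the key.
        if should_rotate_credential:
            self.failures[key] = self.failures.get(key, 0) + 1

    def available(self) -> list[str]:
        # A key is exhausted once it reaches the failure threshold.
        return [k for k in self.keys if self.failures.get(k, 0) < self.max_failures]


def surviving_keys(should_rotate_credential: bool) -> list[str]:
    """Report two transient errors against a one-key pool and list usable keys."""
    pool = CredentialPool(keys=["only-key"])
    for _ in range(2):
        pool.report_failure("only-key", should_rotate_credential)
    return pool.available()
```

In this model, two errors reported with the flag set leave no usable key, while the same two errors reported without it leave the key available for further retries.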

…aded before rate_limit

When a provider (e.g. Z.AI) returns 'The service may be temporarily overloaded, please try again later' as HTTP 200 or HTTP 400, the error was matched against _RATE_LIMIT_PATTERNS (which includes 'servicequotaexceededexception') and classified as rate_limit with should_rotate_credential=True. After 2 failures the single API key was marked exhausted and all further retries failed.

The fix adds an 'overloaded' / 'temporarily overloaded' pattern check BEFORE the rate_limit check in both _classify_400 and _classify_by_message. Overloaded errors now get FailoverReason.overloaded (retryable, should_fallback) instead of rate_limit, preventing unnecessary credential rotation.

Closes NousResearch#14038

When a provider returns 503 (Service Unavailable) or 529 (Overloaded), the agent should fall back to an alternate provider immediately. Credential-pool rotation cannot fix provider-side overload: rotating keys against the same overloaded servers is useless.

Two minimal changes:

1. error_classifier: set should_fallback=True for 503/529 (consistent with rate_limit and billing classifications)
2. run_agent: add independent eager-fallback block for overloaded, placed after the rate-limit pool-rotation deferral block. Overloaded bypasses the _pool_may_recover_from_rate_limit check because credential rotation cannot resolve provider-side capacity issues.

More focused than adding overloaded to the is_rate_limited tuple and complementary to NousResearch#14055 (message-pattern classification path).
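The decision order described in that commit might look roughly like the sketch below. It is an illustration under stated assumptions, not the code in run_agent: `reason_for_status`, `next_action`, the returned action strings, and the `pool_may_recover` and `has_fallback` parameters are hypothetical, with `pool_may_recover` standing in for the result of the _pool_may_recover_from_rate_limit check.

```python
from enum import Enum
from typing import Optional


class FailoverReason(Enum):
    overloaded = "overloaded"
    rate_limit = "rate_limit"


def reason_for_status(status: int) -> Optional[FailoverReason]:
    # 503 and 529 are treated as provider-side overload; 429 as a rate limit.
    if status in (503, 529):
        return FailoverReason.overloaded
    if status == 429:
        return FailoverReason.rate_limit
    return None


def next_action(reason: Optional[FailoverReason], pool_may_recover: bool, has_fallback: bool) -> str:
    # Rate-limit block: defer fallback while another credential might still work.
    if reason is FailoverReason.rate_limit:
        if pool_may_recover:
            return "rotate_credential"
        return "fallback" if has_fallback else "retry"
    # Independent overloaded block: skips the pool check entirely, because a
    # different key would hit the same overloaded servers.
    if reason is FailoverReason.overloaded and has_fallback:
        return "fallback"
    return "retry"
```

The point of keeping the overloaded branch separate is that `pool_may_recover` is never consulted for it, so a healthy credential pool cannot delay the switch to an alternate provider.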

alt-glitch added the type/bug (Something isn't working), P1 (High: major feature broken, no workaround), and comp/agent (Core agent loop, run_agent.py, prompt builder) labels Apr 22, 2026

pazyork mentioned this pull request Apr 25, 2026

fix(fallback): trigger eager fallback on 503/529 provider overload #15666

Open

ms-alan wants to merge 1 commit into NousResearch:main from ms-alan:fix/ISSUE-14038-overloaded-error-classification
