
fix(agent): classify overloaded errors as server-side overload, not rate_limit #14788

Open
Tranquil-Flow wants to merge 2 commits into NousResearch:main from Tranquil-Flow:fix/overloaded-error-classification

Conversation

Tranquil-Flow commented Apr 23, 2026

Contributor

What does this PR do?

When a provider returns an overloaded message (e.g. Z.AI code 1305: "temporarily overloaded, please try again later"), the "try again" substring matches _RATE_LIMIT_PATTERNS, so the error is classified as rate_limit with should_rotate_credential=True. Rotating credentials cannot relieve a server-side capacity issue, so every credential in the pool fails in turn and the pool is exhausted. This PR classifies such errors as overloaded, distinct from rate_limit.
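
To illustrate the failure mode and one way to avoid it, here is a minimal sketch, not the code in this PR. The pattern lists, the `Classification` type, the `classify` function, and the "overloaded" reason string are hypothetical stand-ins. Only `_RATE_LIMIT_PATTERNS`, the "try again" substring, and `should_rotate_credential` come from the description above. It also assumes that an overloaded error should not rotate credentials.

```python
from dataclasses import dataclass

# Hypothetical pattern lists for illustration; the real contents of
# _RATE_LIMIT_PATTERNS in agent/error_classifier.py may differ.
_OVERLOADED_PATTERNS = ("overloaded",)
_RATE_LIMIT_PATTERNS = ("rate limit", "too many requests", "try again")


@dataclass(frozen=True)
class Classification:
    reason: str
    should_rotate_credential: bool


def classify(message: str) -> Classification:
    """Classify a provider error message by substring match."""
    text = message.lower()
    # Check overload patterns first, so a message that also contains
    # "try again" is not swallowed by the rate-limit patterns.
    if any(p in text for p in _OVERLOADED_PATTERNS):
        # Assumption: a server-side capacity problem should not rotate credentials.
        return Classification("overloaded", False)
    if any(p in text for p in _RATE_LIMIT_PATTERNS):
        return Classification("rate_limit", True)
    return Classification("unknown", False)


if __name__ == "__main__":
    print(classify("temporarily overloaded, please try again later"))
    print(classify("Rate limit exceeded, try again in 20s"))
```

In this sketch the ordering of the two checks is what matters: with the rate-limit check first, the Z.AI message would match "try again" and request a credential rotation.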

Related Issue

Fixes #14038

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • agent/error_classifier.py: Added overloaded error classification distinct from rate_limit
  • tests/agent/test_error_classifier.py: Tests for overloaded classification (121 passed in full suite)

How to Test

  1. Run:
    python -m pytest -o 'addopts=' tests/agent/test_error_classifier.py -v
  2. Confirm result: 121 passed.

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform: macOS 15 (Darwin 24.6.0)

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — or N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — or N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — or N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A
  • I've updated tool descriptions/schemas if I changed tool behavior — or N/A

Screenshots / Logs

python -m pytest -o 'addopts=' tests/agent/test_error_classifier.py -v
# 121 passed

alt-glitch added the type/bug, P1, and comp/agent labels on Apr 23, 2026
@alt-glitch

Collaborator

Related to #14038 (issue) and to #14055 and #14261 (competing fix PRs): overloaded errors are classified as rate_limit, exhausting the credential pool.

Tranquil-Flow force-pushed the fix/overloaded-error-classification branch from df751ef to 20ba77a on May 25, 2026 14:00

Labels

  • comp/agent: Core agent loop, run_agent.py, prompt builder
  • P1: High, major feature broken, no workaround
  • type/bug: Something isn't working

Development

Successfully merging this pull request may close these issues.

"overloaded" server errors classified as rate_limit, exhausting credential pool

2 participants