
fix(agent): try fallback providers at init when primary credential pool is exhausted #17958

Closed
luyao618 wants to merge 1 commit into NousResearch:main from luyao618:fix/credential-pool-init-fallback

Conversation

@luyao618

Contributor

Summary

When a provider's credential_pool contains only one entry and that entry is in 429-cooldown, resolve_provider_client returns None and AIAgent.__init__ raises a misleading RuntimeError suggesting the API key is missing — even when valid fallback_providers are configured.

This patch makes __init__ iterate the fallback chain before raising, mirroring the existing in-flight fallback logic in the request loop. If a fallback resolves, the agent initializes against it and sets _fallback_activated=True so _restore_primary_runtime can pick the primary back up after cooldown.
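The fallback walk described above can be sketched roughly as follows. This is a minimal illustration, not the code in run_agent.py: `resolve_with_fallback` and the `Resolver` type are hypothetical names, and the resolver is assumed to behave like resolve_provider_client in one respect only, namely returning None when no credential is usable.

```python
from __future__ import annotations

from typing import Callable, Optional, Sequence

# Hypothetical stand-in for resolve_provider_client: returns a client object,
# or None when the provider's credential pool has no usable entry.
Resolver = Callable[[str], Optional[object]]


def resolve_with_fallback(
    primary: str,
    fallbacks: Sequence[str],
    resolve: Resolver,
) -> tuple[object, str, bool]:
    """Return (client, provider_name, fallback_activated)."""
    client = resolve(primary)
    if client is not None:
        return client, primary, False

    # Primary is unavailable (for example, its only key is in 429 cooldown):
    # walk the fallback chain in order before giving up.
    for name in fallbacks:
        client = resolve(name)
        if client is not None:
            return client, name, True

    raise RuntimeError(
        f"No usable credentials for {primary!r} or any configured fallback"
    )


# Usage: a dict's .get serves as a toy resolver.
pool = {"primary": None, "backup": "backup-client"}
client, provider, activated = resolve_with_fallback("primary", ["backup"], pool.get)
```

Returning the activation flag alongside the client lets the caller record that the agent is not running on its primary, which is the information a later restore step would need.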

Changes

  • run_agent.py (~1465): Before raising RuntimeError for missing credentials, iterate fallback_model entries and try resolve_provider_client on each. If one resolves, use it as the effective primary.
  • run_agent.py (~1560): Preserve _fallback_activated flag set during init-time fallback (was unconditionally reset to False).
  • run_agent.py (~1501): Guard the "No provider configured" raise to skip when fallback was activated.
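How the three changes interact inside a constructor can be shown with a toy class. `AgentSketch` is hypothetical and takes already-resolved clients (or None) in place of real provider configuration; only the `_fallback_activated` attribute name comes from the PR.

```python
class AgentSketch:
    """Hypothetical miniature of the init flow; not the real AIAgent."""

    def __init__(self, primary_client, fallback_clients=()):
        self.client = primary_client
        fallback_used = False

        if self.client is None:
            # Take the first fallback that resolved to a usable client.
            self.client = next(
                (c for c in fallback_clients if c is not None), None
            )
            fallback_used = self.client is not None

        # Raise only when neither the primary nor any fallback resolved.
        if self.client is None:
            raise RuntimeError("No provider configured")

        # Later state setup: an unconditional `= False` here would erase
        # the init-time activation, so carry the value forward instead.
        self._fallback_activated = fallback_used


agent = AgentSketch(None, fallback_clients=[None, "fallback-client"])
```

In this sketch the flag is a local until the state-setup step, which makes it hard to reset by accident. The real constructor may order these steps differently.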

Test plan

  • Added tests/run_agent/test_init_fallback_on_exhausted_pool.py with two tests:
    1. test_init_tries_fallback_when_primary_returns_none — verifies agent initializes with fallback provider
    2. test_init_raises_when_no_fallback_configured — verifies original error is preserved when no fallback exists
  • All 1166 existing tests/run_agent/ tests pass
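For readers without the test file at hand, the two cases can be approximated with `unittest.mock`. The sketch below exercises a hypothetical `init_client` helper, not AIAgent itself; the real tests presumably patch resolve_provider_client, which is an assumption here.

```python
from unittest.mock import Mock


def init_client(primary, fallbacks, resolve):
    """Hypothetical stand-in for the provider-resolution step of __init__."""
    for name in (primary, *fallbacks):
        client = resolve(name)
        if client is not None:
            # Second element reports whether a fallback was used.
            return client, name != primary
    raise RuntimeError("no usable provider credentials")


def test_init_tries_fallback_when_primary_returns_none():
    # First call (primary) yields None, second call (fallback) yields a client.
    resolve = Mock(side_effect=[None, "fallback-client"])
    client, activated = init_client("primary", ["backup"], resolve)
    assert client == "fallback-client"
    assert activated is True
    assert [c.args[0] for c in resolve.call_args_list] == ["primary", "backup"]


def test_init_raises_when_no_fallback_configured():
    resolve = Mock(return_value=None)
    try:
        init_client("primary", [], resolve)
    except RuntimeError:
        pass
    else:
        raise AssertionError("expected RuntimeError")
    # Only the primary was attempted, so the original error path is intact.
    resolve.assert_called_once_with("primary")
```

Using `side_effect` with a list simulates an exhausted primary pool followed by a healthy fallback without touching any real credentials.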

Closes #17929

…ol is exhausted (NousResearch#17929)

@alt-glitch added the type/bug, P2, and comp/agent labels on Apr 30, 2026
@teknium1

teknium1 commented May 2, 2026

Contributor

Merged via #18762 — cherry-picked cleanly onto current main with authorship preserved (rebase-merge). Thanks for the thorough fix and test coverage!


@teknium1 teknium1 closed this May 2, 2026

Labels

comp/agent: Core agent loop, run_agent.py, prompt builder
P2: Medium, degraded but workaround exists
type/bug: Something isn't working

Development

Successfully merging this pull request may close these issues.

Single-key credential pool: rate-limit cooldown causes init-time RuntimeError, fallback chain never tried

3 participants