fix(credential-pool): correctness + rotation + cross-process sync#15120
Merged
Conversation
The least_used strategy selected entries via min(request_count) but never incremented the counter. All entries stayed at count=0, so the strategy degenerated to fill_first behavior with no actual load balancing. Now increments request_count after each selection and persists the update.
Previously _handle_credential_pool_error handled 401, 402, and 429 but silently ignored 403. When a provider returns 403 for a revoked or unauthorised credential (e.g. Nous agent_key invalidated by a newer login), the pool was never rotated and every subsequent request continued to use the same failing credential. Treat 403 the same as 402: immediately mark the current credential exhausted and rotate to the next pool entry, since a Forbidden response will not resolve itself with a retry. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extracts pool-rotation-room logic into `_pool_may_recover_from_rate_limit` so single-credential pools no longer block the eager-fallback path on 429. The existing check `pool is not None and pool.has_available()` lets fallback fire only after the pool marks every entry as exhausted. With exactly one credential in the pool (the common shape for Gemini OAuth, Vertex service accounts, and any personal-key setup), `has_available()` flips back to True as soon as the cooldown expires — Hermes retries against the same entry, hits the same daily-quota 429, and burns the retry budget in a tight loop before ever reaching the configured `fallback_model`. Observed in the wild as 4+ hours of 429 noise on a single Gemini key instead of falling through to Vertex as configured. Rotation is only meaningful with more than one credential — gate on `len(pool.entries()) > 1`. Multi-credential pools keep the current wait-for-rotation behaviour unchanged. Fixes #11314. Related to #8947, #10210, #7230. Narrower scope than open PRs #8023 (classifier change) and #11492 (503/529 credential-pool bypass) — this addresses the single-credential 429 case specifically and does not conflict with either. Tests: 6 new unit tests in tests/run_agent/test_provider_fallback.py covering (a) None pool, (b) single-cred available, (c) single-cred in cooldown, (d) 2-cred available rotates, (e) multi-cred all cooling-down falls back, (f) many-cred available rotates. All 18 tests in the file pass.
Concurrent Hermes processes (e.g. cron jobs) refreshing a Nous OAuth token via resolve_nous_runtime_credentials() write the rotated tokens to auth.json. The calling process's pool entry becomes stale, and the next refresh against the already-rotated token triggers a 'refresh token reuse' revocation on the Nous Portal. _sync_nous_entry_from_auth_store() reads auth.json under the same lock used by resolve_nous_runtime_credentials, and adopts the newer token pair before refreshing the pool entry. This complements #15111 (which preserved the obtained_at timestamps through seeding). Partial salvage of #10160 by @konsisumer — only the agent/credential_pool.py changes + the 3 Nous-specific regression tests. The PR also touched 10 unrelated files (Dockerfile, tips.py, various tool tests) which were dropped as scope creep. Regression tests: - test_sync_nous_entry_from_auth_store_adopts_newer_tokens - test_sync_nous_entry_noop_when_tokens_match - test_nous_exhausted_entry_recovers_via_auth_store_sync
This was referenced Apr 24, 2026
This was referenced Apr 27, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Consolidated salvage of 7 credential-pool and auth-related PRs — pool strategy correctness, auth-failure recovery paths, UX, and cross-process OAuth sync. Attribution preserved via rebase-merge.
Changes
agent/credential_pool.py(least_used fix) —STRATEGY_LEAST_USEDpicked the min-request_countentry but never incremented the counter. Every call returned the same entry. Fix: increment viareplace(entry, request_count=...)on selection. From @vominh1919 (fix: increment request_count in least_used credential pool strategy #14631).run_agent.py(403 → pool rotation) — one-line fix:status_code in (401, 403)instead of== 401in_recover_with_credential_pool. 403 (Forbidden) is a valid signal to rotate off a credential. From @YueLich (fix: rotate credential pool on 403 (Forbidden) responses #8289).hermes_cli/auth_commands.py(auth list UX) —_classify_exhausted_status()distinguishes rate-limited (429/ "rate_limit" / "quota" markers) from auth failures (401/403/ "invalid_token" / "unauthorized") inhermes auth listoutput. Auth-failure entries get "re-auth may be required" instead of a fake cooldown timer. From @andyylin (fix: clarify auth list exhausted status #14449).hermes_cli/model_switch.py(/model picker detection) — mapped providers (e.g. Gemini) whose credentials live only inauth.json:credential_poolwere invisible tolist_authenticated_providers(). Now checks the auth store before decidinghas_creds=False. From @jakubkrcmar (fix(model): detect auth-store credential_pool for mapped providers #5753).run_agent.py(single-cred rate-limit fallback) —_pool_may_recover_from_rate_limit()— when the primary credential 429s and the pool has only one entry, rotation has nowhere to go; retrying the same exhausted quota burns the retry budget. Now falls back to the configuredfallback_modelinstead. Fixes [Feature]: 遇到http 529错误的时候,应该尝试切换成fallback的模型 #11314. From @prasadus92 (fix(agent): fall back on rate limit when pool has no rotation room #12554).hermes_cli/auth.py+hermes_cli/status.py(Nous status validation) —get_nous_auth_status()refactored to prefer the auth-store provider state (runtime source of truth) over pool snapshots. Pool now read via_snapshot_nous_pool_status()as a fallback only. Also picks the newest entry by agent_key expiry when multiple exist. From @mssteuer (fix: validate nous auth status against runtime credentials #14671).agent/credential_pool.py(Nous cross-process sync) —_sync_nous_entry_from_auth_store()— before refresh, check if a concurrent process rotated the refresh token viaresolve_nous_runtime_credentials()and wrote toauth.json. If so, adopt the newer pair instead of refreshing against a stale token (which triggers "refresh token reuse" revocation). Complements fix(nous-oauth): preserve obtained_at in pool + actionable message on RT reuse #15111 (timestamp preservation). Partial salvage of @konsisumer's fix(credential_pool): add Nous OAuth cross-process sync to prevent session revocation #10160 — core change only, dropped 10 unrelated files (Dockerfile, tips.py, random tool tests) as scope creep.Credit
Validation
208/208 passing.
Not included in this salvage
mainbefore the design trade-offs can be evaluated. Kept open.