fix(pool): rotate pooled credentials immediately on usage_limit or auth failure #10282
Closed
redf0x1 wants to merge 1 commit into
…th failure

Core credential pool failover fix:
- prevent infinite retry loop when pooled credential becomes exhausted (usage_limit, rate_limit)
- rotate to next available credential immediately on usage_limit or auth failure
- persist reset window timestamps to prevent retry scans before reset completes
- mark exhausted credentials properly so pool selection respects state

Implementation:
- credential_pool: fallback token validation (when Codex hardening module unavailable)
- run_agent: immediate rotation via mark_exhausted_and_rotate when refresh fails
- failure modes: fail-fast with reset hint when no pooled credentials available

Test coverage:
- credential pool rotation and exhaustion
- force refresh on auth failure returns None (enabling pool rotation)
- selected entry usage tracking

This fix addresses the core failover incident. Codex auth status normalization is a separate concern and will be in a follow-up PR.

Scope: core failover logic only (Codex auth hardening cut for PR NousResearch#3)
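The rotation behavior listed in the commit message can be illustrated with a small sketch. This is not the Hermes implementation: `Entry`, `SketchPool`, `NoCredentialAvailable`, and the `reset_at`/`now` parameters are hypothetical names chosen for illustration, and only the method name `mark_exhausted_and_rotate` comes from the commit message (its real signature is not shown there). Reset timestamps are kept in memory here; persisting them is left out.

```python
import time
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical pooled credential; not the actual Hermes data model."""
    name: str
    exhausted_until: float = 0.0  # epoch seconds; 0.0 means usable now

    def is_available(self, now: float) -> bool:
        return now >= self.exhausted_until


class NoCredentialAvailable(RuntimeError):
    """Raised when every entry is exhausted; carries a reset hint."""

    def __init__(self, reset_at: float):
        super().__init__(f"all pooled credentials exhausted; earliest reset at {reset_at:.0f}")
        self.reset_at = reset_at


class SketchPool:
    """Illustrative pool: selection skips entries whose reset window is still open."""

    def __init__(self, entries):
        self.entries = list(entries)
        if not self.entries:
            raise ValueError("pool needs at least one entry")
        self.current = 0

    def select(self, now=None):
        now = time.time() if now is None else now
        count = len(self.entries)
        # Scan from the current position so rotation moves forward.
        for offset in range(count):
            idx = (self.current + offset) % count
            if self.entries[idx].is_available(now):
                self.current = idx
                return self.entries[idx]
        # Fail fast instead of retrying: report the earliest reset time.
        raise NoCredentialAvailable(min(e.exhausted_until for e in self.entries))

    def mark_exhausted_and_rotate(self, reset_at, now=None):
        # Record the reset window on the failing entry, then pick another one.
        self.entries[self.current].exhausted_until = reset_at
        return self.select(now)
```

In this sketch a caller that hits a usage limit would call `mark_exhausted_and_rotate(reset_at=...)` once and either continue with the returned entry or surface `NoCredentialAvailable.reset_at` to the user, rather than looping on the same credential.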
0252c54 to a158ddb
Thanks for the contribution @redf0x1 — this automated hermes-sweeper review found that the changes in this PR are fully covered by the already-merged #15120 ("fix(credential-pool): correctness + rotation + cross-process sync", merged 2026-04-24, commit Specifically, main now has:
@alt-glitch's 2026-04-26 comment correctly identified the supersession. Closing as implemented on main.
Problem
When a pooled credential (for example openai-codex) becomes exhausted (usage_limit_reached or auth refresh failure), Hermes could keep retrying the same broken credential instead of rotating immediately to another usable pooled entry.

Symptoms:
- usage limit errors with no forward progress

Root cause
- CredentialPool._refresh_entry(force=True) could still hand back the current credential instead of definitively failing refresh
- _recover_with_credential_pool() only rotated when refresh returned None

Fix
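The interaction described under Root cause can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in agent/credential_pool.py or run_agent.py: `refresh_entry`, `recover`, and the `do_refresh` callback are hypothetical stand-ins for `CredentialPool._refresh_entry` and `_recover_with_credential_pool()`, whose real signatures are not shown in this PR.

```python
from typing import Callable, Optional, Sequence


def refresh_entry(token: str, do_refresh: Callable[[str], str], force: bool = False) -> Optional[str]:
    """Return a refreshed token, or None when a forced refresh fails.

    Hypothetical stand-in for the refresh step. Handing the old token back
    on a forced failure is the behavior the root cause describes; returning
    None lets the caller rotate.
    """
    try:
        return do_refresh(token)
    except Exception:
        # Forced refresh: fail definitively. Unforced: keep the old token.
        return None if force else token


def recover(tokens: Sequence[str], do_refresh: Callable[[str], str]) -> Optional[str]:
    """Force-refresh each token in order and rotate as soon as one returns None."""
    for token in tokens:
        refreshed = refresh_entry(token, do_refresh, force=True)
        if refreshed is not None:
            return refreshed
        # None means this credential is unusable: move on to the next entry.
    return None
```

Because the recovery step only rotates on `None`, any path where a forced refresh returns the stale credential keeps the caller on the broken entry, which matches the retry loop described under Problem.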
Runtime changes
- agent/credential_pool.py
- run_agent.py: call mark_exhausted_and_rotate() immediately

Scope
Included:
Excluded:
Testing
Related