fix(codex): credential pool rotation on Responses API soft failures #24173
Open
jmmaloney4 wants to merge 1 commit into
When the Codex Responses API returns HTTP 200 with response.status = "failed" or "cancelled" (e.g. quota exhaustion), the runtime correctly detects the soft failure but only attempts cross-provider fallback via _try_activate_fallback(). Same-provider credential pool rotation is never invoked, so the agent burns through all pool entries on one exhausted account before falling through to a different provider.

Root cause: the response_invalid block (PR NousResearch#15104) added detection for codex soft failures but did not wire in _recover_with_credential_pool(). The exception handler path (429/402) does call pool recovery, but these soft failures bypass the exception handler entirely.

Changes:
- Add _classify_codex_soft_failure() method to AIAgent that pattern-matches the error message from response.error against billing and rate-limit signals, returning a FailoverReason or None.
- Add early pool-recovery block in the response_invalid path for codex_responses mode: classify the error, attempt rotation, and continue on the new credential before falling through to cross-provider fallback.
- Billing signals are checked first (insufficient, credits, quota, billing, payment required, usage limit, account deactivated).
- Rate-limit signals checked second (rate limit, throttle, retry, too many requests, quota exceeded, requests/tokens per).
- Non-quota soft failures (content policy, safety, cancelled) return None and do not trigger pool rotation.

Fixes NousResearch#24159
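The classification order and recovery flow described above can be sketched roughly as follows. This is an illustration only, not the code from the PR: the FailoverReason members, the case-insensitive substring matching, and the rotate_credential / activate_fallback callables (stand-ins for _recover_with_credential_pool() and _try_activate_fallback(), whose real signatures are not shown here) are all assumptions. The signal substrings are copied from the lists in this PR's description.

```python
from enum import Enum
from typing import Callable, Optional


class FailoverReason(Enum):
    # Hypothetical stand-in: the real enum's members are not shown in this PR text.
    BILLING = "billing"
    RATE_LIMIT = "rate_limit"


# Substrings copied from the signal lists in the PR description.
BILLING_SIGNALS = (
    "insufficient", "credits have been exhausted", "billing", "payment required",
    "exceeded your current quota", "plan does not include", "account is deactivated",
    "usage limit", "credit balance", "top up your credits", "quota",
)
RATE_LIMIT_SIGNALS = (
    "rate limit", "rate_limit", "too many requests", "throttl", "try again",
    "retry", "slow down", "quota exceeded", "requests per", "tokens per",
)


def classify_codex_soft_failure(error_message: Optional[str]) -> Optional[FailoverReason]:
    """Map a soft-failure error message to a failover reason, or None."""
    if not error_message:
        return None
    text = error_message.lower()  # assumption: matching is case-insensitive
    # Billing signals are checked first, rate-limit signals second.
    if any(signal in text for signal in BILLING_SIGNALS):
        return FailoverReason.BILLING
    if any(signal in text for signal in RATE_LIMIT_SIGNALS):
        return FailoverReason.RATE_LIMIT
    # Content policy, safety, or plain cancellation: not a quota problem.
    return None


def handle_soft_failure(
    error_message: Optional[str],
    rotate_credential: Callable[[FailoverReason], bool],  # hypothetical pool hook
    activate_fallback: Callable[[], bool],                 # hypothetical fallback hook
) -> str:
    """Try same-provider rotation before cross-provider fallback."""
    reason = classify_codex_soft_failure(error_message)
    if reason is not None and rotate_credential(reason):
        return "retry_same_provider"
    if activate_fallback():
        return "retry_fallback_provider"
    return "give_up"
```

One consequence of checking billing first in this sketch: a message containing "quota exceeded" also contains the billing substring "quota", so it is classified as a billing failure before the rate-limit list is consulted.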
d027fa6 to
611e0a5
Compare
This was referenced May 15, 2026
Summary
When the Codex Responses API returns HTTP 200 with response.status = "failed" or "cancelled" (e.g. quota exhaustion), the runtime correctly detects the soft failure but only attempts cross-provider fallback via _try_activate_fallback(). Same-provider credential pool rotation is never invoked, so the agent stays on one exhausted account and eventually falls through to a different provider entirely.

Root cause: The response_invalid block (added in PR #15104) detects codex soft failures but never calls _recover_with_credential_pool(). The exception handler path (429/402) does call pool recovery, but these soft failures bypass the exception handler entirely: the HTTP response is 200, not an error code.

Changes
- _classify_codex_soft_failure() on AIAgent: pattern-matches the response.error message against billing and rate-limit signals, returning a FailoverReason, or None for non-quota errors.
- Early pool-recovery block in the response_invalid path for codex_responses mode: classify the error, attempt credential rotation, and continue on the new credential before falling through to cross-provider fallback.
- Billing signals (checked first): insufficient, credits have been exhausted, billing, payment required, exceeded your current quota, plan does not include, account is deactivated, usage limit, credit balance, top up your credits, quota.
- Rate-limit signals (checked second): rate limit, rate_limit, too many requests, throttl, try again, retry, slow down, quota exceeded, requests per, tokens per.
- Non-quota soft failures (content policy, safety, cancelled) return None and do not trigger pool rotation.

Test plan
- tests/agent/test_codex_soft_failure_pool_rotation.py
- pytest tests/agent/test_credential_pool.py tests/agent/test_credential_pool_routing.py tests/agent/test_codex_soft_failure_pool_rotation.py: 76/76 passed

Risks
None.

Related