You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
When the OpenAI Codex Responses API returns quota exhaustion as HTTP 200 with response.status = "failed", the runtime correctly detects the soft failure (via the response_invalid block added in #15104) but routes recovery exclusively through _try_activate_fallback() (cross-provider). The response_invalid block never calls _recover_with_credential_pool() or mark_exhausted_and_rotate(), so same-provider credential pool rotation is dead for this error path.
Environment
Hermes Agent v0.13.0 (2026.5.7)
Provider: openai-codex (3 OAuth accounts in credential pool, round_robin strategy)
Python 3.13, OpenAI SDK 2.31.0
Reproduction
Configure multiple openai-codex OAuth accounts in the credential pool (hermes auth add openai-codex --type oauth --label <name>)
Set credential_pool_strategies.openai-codex: round_robin (or any strategy)
Drain the quota on the first pool entry (priority 0)
Run any hermes session using openai-codex
Observe: the runtime retries the same exhausted account 3 times with backoff, then falls through to cross-provider fallback (if configured) or gives up. It never rotates to the next pool entry.
Calls _try_activate_fallback() — cross-provider only
If no fallback: retries with exponential backoff on the same credentials
After max_retries (default 3): gives up
Zero calls to _recover_with_credential_pool() or mark_exhausted_and_rotate() in the entire response_invalid block. The pool entry is never marked exhausted, so subsequent sessions may also select the same dead entry.
The OpenAI Responses API returns quota exhaustion as a "soft failure":
HTTP 200 (success from SDK perspective)
response.status = "failed" (not "completed")
response.error contains quota/rate-limit message
response.output is empty
The OpenAI SDK does not throw an exception for this. The error is only detectable by inspecting response.status and response.error. This means Path 1 never fires for Codex quota exhaustion — only Path 2.
Proposed fix
In the response_invalid block for codex_responses, when response.status in {"failed", "cancelled"} and the error message contains quota/billing/rate-limit signals, call self._recover_with_credential_pool(reason) with an appropriate FailoverReason before falling through to _try_activate_fallback().
Pseudocode for the insertion point around line 12362:
# After setting response_invalid = True and error_details:if_codex_resp_statusin {"failed", "cancelled"}:
# Classify the error for pool recoveryerror_reason=self._classify_codex_soft_failure(_codex_error_msg)
iferror_reason:
recovered, _=self._recover_with_credential_pool(error_reason)
ifrecovered:
retry_count=0compression_attempts=0primary_recovery_attempted=Falsecontinue# retry with new credentials
The error classification logic from agent/error_classifier.py could be reused to detect quota signals from response.error.message and response.error.code. If the error is not quota-related (e.g. content policy), pool rotation should not fire.
Workaround
Start a fresh session (/new or restart hermes) — the pool will select a different entry on cold start
If all entries are exhausted, manually reset: hermes auth reset openai-codex
Summary
When the OpenAI Codex Responses API returns quota exhaustion as HTTP 200 with
response.status = "failed", the runtime correctly detects the soft failure (via theresponse_invalidblock added in #15104) but routes recovery exclusively through_try_activate_fallback()(cross-provider). Theresponse_invalidblock never calls_recover_with_credential_pool()ormark_exhausted_and_rotate(), so same-provider credential pool rotation is dead for this error path.Environment
Reproduction
hermes auth add openai-codex --type oauth --label <name>)credential_pool_strategies.openai-codex: round_robin(or any strategy)Root cause
Two error-handling paths exist in
run_agent.py:Path 1: HTTP Exception Handler (pool rotation WORKS)
Lines ~13170+ in the
exceptblock around the API call loop.When OpenAI SDK throws
APIStatusError(HTTP 429, 402, etc.), the handler calls:classify_api_error()→FailoverReason_recover_with_credential_pool(reason)→mark_exhausted_and_rotate()→ selects next pool entryPath 2: Invalid Response Handler (pool rotation DOES NOT FIRE)
Lines ~12335-12600 in the
response_invalidblock.When the Responses API returns HTTP 200 with
response.status = "failed"(quota exhaustion), the handler:_codex_resp_status in {"failed", "cancelled"}(correctly, added in fix(codex): consolidated OAuth error parsing + failed-status fallback routing + reauth UX #15104)response_invalid = True_try_activate_fallback()— cross-provider onlymax_retries(default 3): gives upZero calls to
_recover_with_credential_pool()ormark_exhausted_and_rotate()in the entireresponse_invalidblock. The pool entry is never marked exhausted, so subsequent sessions may also select the same dead entry.This grep confirms:
Why the Responses API does this
The OpenAI Responses API returns quota exhaustion as a "soft failure":
response.status = "failed"(not "completed")response.errorcontains quota/rate-limit messageresponse.outputis emptyThe OpenAI SDK does not throw an exception for this. The error is only detectable by inspecting
response.statusandresponse.error. This means Path 1 never fires for Codex quota exhaustion — only Path 2.Proposed fix
In the
response_invalidblock forcodex_responses, whenresponse.status in {"failed", "cancelled"}and the error message contains quota/billing/rate-limit signals, callself._recover_with_credential_pool(reason)with an appropriateFailoverReasonbefore falling through to_try_activate_fallback().Pseudocode for the insertion point around line 12362:
The error classification logic from
agent/error_classifier.pycould be reused to detect quota signals fromresponse.error.messageandresponse.error.code. If the error is not quota-related (e.g. content policy), pool rotation should not fire.Workaround
/newor restart hermes) — the pool will select a different entry on cold starthermes auth reset openai-codexfill_first, manually swap priority 0:hermes auth remove openai-codex 1 && hermes auth add openai-codex --type oauth --label <name>Related
response.status == "failed"detection but only routes to cross-provider fallback