[Bug] Codex Responses API soft failures (status=failed) bypass credential pool rotation

## Summary

When the OpenAI Codex Responses API returns quota exhaustion as HTTP 200 with `response.status = "failed"`, the runtime correctly detects the soft failure (via the `response_invalid` block added in #15104) but routes recovery exclusively through `_try_activate_fallback()` (cross-provider). The `response_invalid` block never calls `_recover_with_credential_pool()` or `mark_exhausted_and_rotate()`, so same-provider credential pool rotation is dead for this error path.

## Environment

- Hermes Agent v0.13.0 (2026.5.7)
- Provider: openai-codex (3 OAuth accounts in credential pool, round_robin strategy)
- Python 3.13, OpenAI SDK 2.31.0

## Reproduction

1. Configure multiple openai-codex OAuth accounts in the credential pool (`hermes auth add openai-codex --type oauth --label <name>`)
2. Set `credential_pool_strategies.openai-codex: round_robin` (or any strategy)
3. Drain the quota on the first pool entry (priority 0)
4. Run any hermes session using openai-codex
5. Observe: the runtime retries the same exhausted account 3 times with backoff, then falls through to cross-provider fallback (if configured) or gives up. It never rotates to the next pool entry.

## Root cause

Two error-handling paths exist in `run_agent.py`:

### Path 1: HTTP Exception Handler (pool rotation WORKS)
Lines ~13170+ in the `except` block around the API call loop.

When OpenAI SDK throws `APIStatusError` (HTTP 429, 402, etc.), the handler calls:
1. `classify_api_error()` → `FailoverReason`
2. `_recover_with_credential_pool(reason)` → `mark_exhausted_and_rotate()` → selects next pool entry
3. Retries with new credentials

### Path 2: Invalid Response Handler (pool rotation DOES NOT FIRE)
Lines ~12335-12600 in the `response_invalid` block.

When the Responses API returns HTTP 200 with `response.status = "failed"` (quota exhaustion), the handler:
1. Detects `_codex_resp_status in {"failed", "cancelled"}` (correctly, added in #15104)
2. Sets `response_invalid = True`
3. Calls `_try_activate_fallback()` — cross-provider only
4. If no fallback: retries with exponential backoff on the **same credentials**
5. After `max_retries` (default 3): gives up

**Zero calls to `_recover_with_credential_pool()` or `mark_exhausted_and_rotate()` in the entire `response_invalid` block.** The pool entry is never marked exhausted, so subsequent sessions may also select the same dead entry.

This grep confirms:

```
$ grep -c "_recover_with_credential_pool\|mark_exhausted" <lines 12335-12600>
0
```

## Why the Responses API does this

The OpenAI Responses API returns quota exhaustion as a "soft failure":
- HTTP 200 (success from SDK perspective)
- `response.status = "failed"` (not "completed")
- `response.error` contains quota/rate-limit message
- `response.output` is empty

The OpenAI SDK does not throw an exception for this. The error is only detectable by inspecting `response.status` and `response.error`. This means Path 1 never fires for Codex quota exhaustion — only Path 2.

## Proposed fix

In the `response_invalid` block for `codex_responses`, when `response.status in {"failed", "cancelled"}` and the error message contains quota/billing/rate-limit signals, call `self._recover_with_credential_pool(reason)` with an appropriate `FailoverReason` before falling through to `_try_activate_fallback()`.

Pseudocode for the insertion point around line 12362:

```python
# After setting response_invalid = True and error_details:
if _codex_resp_status in {"failed", "cancelled"}:
    # Classify the error for pool recovery
    error_reason = self._classify_codex_soft_failure(_codex_error_msg)
    if error_reason:
        recovered, _ = self._recover_with_credential_pool(error_reason)
        if recovered:
            retry_count = 0
            compression_attempts = 0
            primary_recovery_attempted = False
            continue  # retry with new credentials
```

The error classification logic from `agent/error_classifier.py` could be reused to detect quota signals from `response.error.message` and `response.error.code`. If the error is not quota-related (e.g. content policy), pool rotation should not fire.

## Workaround

1. Start a fresh session (`/new` or restart hermes) — the pool will select a different entry on cold start
2. If all entries are exhausted, manually reset: `hermes auth reset openai-codex`
3. If using `fill_first`, manually swap priority 0: `hermes auth remove openai-codex 1 && hermes auth add openai-codex --type oauth --label <name>`

## Related

- #15104 — added the `response.status == "failed"` detection but only routes to cross-provider fallback
- #22212 — incomplete in-retry profile rotation (python platform layer, different code path)
- #23138 — eager fallback not firing for HTTP 402 (exception handler path, different error class)
- #4361 — fixed credential pool preservation through smart routing and deferred eager fallback on 429


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Bug] Codex Responses API soft failures (status=failed) bypass credential pool rotation #24159

Summary

Environment

Reproduction

Root cause

Path 1: HTTP Exception Handler (pool rotation WORKS)

Path 2: Invalid Response Handler (pool rotation DOES NOT FIRE)

Why the Responses API does this

Proposed fix

Workaround

Related

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

[Bug] Codex Responses API soft failures (status=failed) bypass credential pool rotation #24159

Description

Summary

Environment

Reproduction

Root cause

Path 1: HTTP Exception Handler (pool rotation WORKS)

Path 2: Invalid Response Handler (pool rotation DOES NOT FIRE)

Why the Responses API does this

Proposed fix

Workaround

Related

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions