Fallback provider 429 can exhaust primary provider credential pool #33088

## Summary

After `openai-codex` failed and Hermes activated a cross-provider fallback, the fallback provider's `429` was recorded against the primary `openai-codex` OAuth credential pool.

This made a valid Codex OAuth session appear rate-limited/exhausted even though the quota/billing error came from the fallback provider.

Related context: #32903 / #32963 fixed the original `openai-codex` null-output streaming crash. This issue is about the secondary provider-state contamination after fallback activation, not the null-output parser crash itself.

## Environment

- Hermes Agent: `v0.14.0`, local checkout `2517917de` before pulling main
- OpenAI SDK in Hermes venv: `2.24.0`
- Primary provider/model: `openai-codex` / `gpt-5.5`
- Primary base URL: `https://chatgpt.com/backend-api/codex`
- Fallback provider/model: `zai` / `glm-5.1`

## Observed sequence

Sanitized local log sequence:

```text
2026-05-27 14:28:42 CST  Fallback activated: gpt-5.5 -> glm-5.1 (zai)
2026-05-27 14:28:43 CST  zai returned 429 code=1113 "Insufficient balance or no resource package"
2026-05-27 14:28:48 CST  credential pool: marking openai-codex-oauth-2 exhausted
```

After this, `hermes auth list` showed the `openai-codex` OAuth credentials as rate-limited/exhausted with the Z.AI `1113` error. That sent debugging in the wrong direction: it looked like a Codex OAuth/quota problem, while the actual `429` came from the fallback provider.

Resetting the `openai-codex` auth state cleared the stale exhausted status and the same Codex OAuth credentials worked again.

## Expected behavior

When fallback activation switches the active provider from `openai-codex` to `zai`, credential-pool recovery should not mutate the stale `openai-codex` pool based on a `zai` response.

At fallback activation, one of the following should happen:

- the active credential pool is reloaded for the fallback provider, or
- the stale primary pool is cleared before fallback calls continue.

In either case, recovery code should skip pool mutation if the pool provider does not match the current agent provider.

## Local patch shape that fixed the misattribution

I tested a narrow local patch with two guardrails:

1. In fallback activation, after assigning `agent.provider = fb_provider`, reload or clear `agent._credential_pool` for the fallback provider.

2. In `recover_with_credential_pool()`, before calling `mark_exhausted_and_rotate()`, skip credential-pool recovery when the loaded pool provider differs from the agent's current provider.
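The first guardrail can be sketched roughly as follows. This is a minimal illustration, not the Hermes implementation: `CredentialPool`, `Agent`, `activate_fallback`, and the `load_pool` callback are hypothetical stand-ins, and only `agent.provider`, `fb_provider`, and `agent._credential_pool` come from the patch description above.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CredentialPool:
    # Hypothetical stand-in: only the provider name matters for this sketch.
    provider: str


@dataclass
class Agent:
    # Hypothetical stand-in for the agent state touched by the patch.
    provider: str
    _credential_pool: Optional[CredentialPool] = None


def activate_fallback(
    agent: Agent,
    fb_provider: str,
    load_pool: Callable[[str], Optional[CredentialPool]],
) -> None:
    """Switch providers and realign the in-memory pool in the same step."""
    agent.provider = fb_provider
    # load_pool returns None when the fallback provider has no pool, which
    # clears the stale primary pool instead of leaving it attached.
    agent._credential_pool = load_pool(fb_provider)


# Usage: zai has no pool in this toy registry, so the openai-codex pool is dropped.
pools = {"openai-codex": CredentialPool("openai-codex")}
agent = Agent("openai-codex", pools["openai-codex"])
activate_fallback(agent, "zai", pools.get)
```

Doing the reload (or clear) in the same function that assigns the provider keeps the two fields from drifting apart between the switch and the next recovery attempt.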

The important defensive check is:

```python
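# The loaded pool belongs to a different provider than the one that produced
# the error: skip credential-pool recovery and leave the pool untouched.
if pool_provider and current_provider and pool_provider != current_provider:
    return False, has_retried_429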
```
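To show the guard in context, here is a self-contained sketch with toy classes. The names `recover_with_credential_pool` and `mark_exhausted_and_rotate` come from this issue, but the signatures, the `Pool` and `Agent` classes, and the assumed meaning of the return tuple (`recovered`, `has_retried_429`) are simplifications for illustration only.

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple


@dataclass
class Pool:
    # Toy pool: tracks which credential ids were marked exhausted.
    provider: str
    exhausted: Set[str] = field(default_factory=set)

    def mark_exhausted_and_rotate(self, credential_id: str) -> None:
        self.exhausted.add(credential_id)


@dataclass
class Agent:
    provider: str
    _credential_pool: Optional[Pool] = None


def recover_with_credential_pool(
    agent: Agent, credential_id: str, has_retried_429: bool
) -> Tuple[bool, bool]:
    """Simplified recovery path; the real function has more inputs."""
    pool = agent._credential_pool
    if pool is None:
        return False, has_retried_429
    pool_provider = pool.provider
    current_provider = agent.provider
    # Guard: never mutate a pool that belongs to another provider.
    if pool_provider and current_provider and pool_provider != current_provider:
        return False, has_retried_429
    pool.mark_exhausted_and_rotate(credential_id)
    return True, True


# A zai 429 arrives while the stale openai-codex pool is still attached.
stale_pool = Pool("openai-codex")
fallback_agent = Agent("zai", stale_pool)
result = recover_with_credential_pool(fallback_agent, "openai-codex-oauth-2", False)
```

With the guard in place, `stale_pool.exhausted` stays empty, so the Codex credentials are not blamed for the fallback provider's error.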

## Audit notes

- The change is intentionally narrow: it only realigns the in-memory pool after a provider switch and adds a defensive provider-match check before any pool mutation.
- If the fallback provider has no credential pool, clearing `agent._credential_pool` is safer than retaining and mutating the stale primary pool.
- One caveat: if Hermes supports provider aliases for credential pools, this guard should probably use the same provider-name normalization used elsewhere. For the observed `openai-codex` -> `zai` fallback, the mismatch is unambiguous.
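The alias caveat could be handled by normalizing both names before comparing them. The sketch below is only an assumption about what that might look like: `PROVIDER_ALIASES`, `normalize_provider`, and `providers_differ` are hypothetical, and the alias entries are invented for the example rather than taken from Hermes.

```python
from typing import Optional

# Hypothetical alias table; any real mapping would live in Hermes itself.
PROVIDER_ALIASES = {"codex": "openai-codex", "z.ai": "zai"}


def normalize_provider(name: Optional[str]) -> str:
    """Lower-case, trim, and resolve aliases; missing names become ''."""
    key = (name or "").strip().lower()
    return PROVIDER_ALIASES.get(key, key)


def providers_differ(
    pool_provider: Optional[str], current_provider: Optional[str]
) -> bool:
    a = normalize_provider(pool_provider)
    b = normalize_provider(current_provider)
    # Mirror the original guard: an unknown (empty) name never counts
    # as a mismatch, so the check only fires on two known, distinct names.
    return bool(a) and bool(b) and a != b
```

The guard would then skip pool mutation when `providers_differ(pool_provider, current_provider)` is true, so an alias of the same provider does not trigger a false mismatch.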

## Local validation

```text
pytest tests/run_agent/test_provider_fallback.py tests/agent/test_credential_pool_routing.py -q
# 34 passed

hermes auth reset openai-codex
# The quoted prompt below is Chinese for "Reply with only OK".
hermes -z "只回复 OK" --provider openai-codex -m gpt-5.5 --ignore-rules
# OK
```

