[Feature]: Multi-provider credential pools for cross-provider failover and rotation

## Problem or Use Case

Hermes credential pools are currently scoped to a **single provider**. Each pool key (e.g., `"custom:local"`, `"openai-codex"`, `"copilot"`) maps to credentials for one API endpoint. When every credential in a pool is exhausted (rate-limited, out of billing quota, or failing authentication), the agent errors out, even if other providers have available capacity for compatible models.

This is painful for users who:

1. **Pay for multiple providers** (OpenRouter, OpenAI, Anthropic, custom endpoints) and want to use them interchangeably for the same model family (e.g., GPT-5.4 via OpenRouter vs. OpenAI direct).
2. **Run custom providers** that serve the same model through different gateways (e.g., `ollama.com/v1` and a local Ollama instance both serving `glm-5.1`) and want automatic failover between them.
3. **Want cost optimization** by routing to the cheapest available provider for a given model without manual config changes.
4. **Need resilience**, so that when one provider goes down or starts rate-limiting, requests fall through to the next provider serving a compatible model.

Currently, `fallback_model`/`fallback_providers` provide provider-level failover, but only after the agent has fully failed on the current provider. They do not pool credentials across providers for smart selection within a single turn. The `provider_routing` config (`only`/`ignore`/`order`) restricts which providers OpenRouter uses; it does not bridge across different provider types.

What is needed is a way to define a **multi-provider credential pool** that groups credentials from different providers (potentially with different `base_url`, `api_mode`, and `api_key` values) into a single failover-capable pool that the `CredentialPool` class can rotate through.

## Proposed Solution

### Config-level: Add `pools` section to `config.yaml`

```yaml
pools:
  openai-compatible:
    strategy: round_robin      # fill_first | round_robin | random | least_used
    entries:
      - provider: openai-codex
        label: "Codex Plus"
        # Uses pool entry from auth.json (openai-codex)

      - provider: custom
        base_url: https://api.openai.com/v1
        api_key_env: OPENAI_API_KEY_DIRECT
        label: "OpenAI Direct"
        api_mode: chat_completions

      - provider: custom
        base_url: https://ollama.com/v1
        pool: custom:local      # Reference existing custom:local pool from auth.json
        label: "Ollama Cloud"
        api_mode: chat_completions

  anthropic-compatible:
    strategy: fill_first
    entries:
      - provider: anthropic
        # Uses pool entry from auth.json (anthropic)

      - provider: custom
        base_url: https://opencode.ai/zen/go/v1
        pool: custom:opencode-go
        label: "OpenCode Go"
        api_mode: anthropic_messages
```
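
As a rough illustration of how such a `pools` section could be validated once the YAML has been loaded into plain dicts, here is a minimal sketch. `PoolEntryConfig`, `PoolConfig`, and `parse_pool` are hypothetical names, not existing Hermes code, and the `fill_first` default for a missing `strategy` is an assumption.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping, Optional, Tuple

# Strategy names taken from the proposal above.
VALID_STRATEGIES = ("fill_first", "round_robin", "random", "least_used")


@dataclass(frozen=True)
class PoolEntryConfig:
    """One entry of a multi-provider pool (hypothetical structure)."""
    provider: str
    label: Optional[str] = None
    base_url: Optional[str] = None
    api_key_env: Optional[str] = None
    pool: Optional[str] = None       # reference to an existing auth.json pool key
    api_mode: Optional[str] = None


@dataclass(frozen=True)
class PoolConfig:
    name: str
    strategy: str
    entries: Tuple[PoolEntryConfig, ...]


_ENTRY_KEYS = {f.name for f in fields(PoolEntryConfig)}


def parse_pool(name: str, raw: Mapping[str, Any]) -> PoolConfig:
    """Validate one named pool definition and return a typed config."""
    strategy = raw.get("strategy", "fill_first")  # default is an assumption
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"pool {name!r}: unknown strategy {strategy!r}")
    entries = []
    for index, item in enumerate(raw.get("entries") or []):
        unknown = set(item) - _ENTRY_KEYS
        if unknown:
            raise ValueError(f"pool {name!r} entry {index}: unknown keys {sorted(unknown)}")
        if not item.get("provider"):
            raise ValueError(f"pool {name!r} entry {index}: 'provider' is required")
        entries.append(PoolEntryConfig(**item))
    if not entries:
        raise ValueError(f"pool {name!r}: at least one entry is required")
    return PoolConfig(name=name, strategy=strategy, entries=tuple(entries))


# The anthropic-compatible pool from the YAML above, as a loaded dict.
anthropic_pool = parse_pool("anthropic-compatible", {
    "strategy": "fill_first",
    "entries": [
        {"provider": "anthropic"},
        {"provider": "custom", "base_url": "https://opencode.ai/zen/go/v1",
         "pool": "custom:opencode-go", "label": "OpenCode Go",
         "api_mode": "anthropic_messages"},
    ],
})
```

Rejecting unknown keys and unknown strategies at load time would surface config typos before the first request rather than mid-turn.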

### Model config integration

```yaml
model:
  default: gpt-5.4
  provider: pool:openai-compatible   # Route through a multi-provider pool

  # Or, for single-model failover:
  fallback_providers:
    - provider: pool:openai-compatible
      model: gpt-5.4
```
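
To make the `pool:` prefix concrete, here is a small sketch of how a provider string could be split into "pool reference" versus "plain provider". `parse_pool_reference` is a hypothetical helper; the only thing taken from the proposal is the `pool:<name>` syntax.

```python
from typing import Optional

POOL_PREFIX = "pool:"  # prefix used in the config examples above


def parse_pool_reference(provider: str) -> Optional[str]:
    """Return the pool name for 'pool:<name>', or None for a plain provider."""
    if not provider.startswith(POOL_PREFIX):
        return None
    name = provider[len(POOL_PREFIX):].strip()
    if not name:
        raise ValueError(f"pool reference is missing a name: {provider!r}")
    return name
```

Matching on the prefix rather than on any colon keeps existing keys such as `custom:local` resolving as ordinary single-provider pools.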

### Runtime behavior

When `provider` resolves to a `pool:` reference, `resolve_runtime_provider()` would:

1. Load the named pool definition from config
2. Iterate entries using the pool strategy (`round_robin`, `least_used`, etc.)
3. For each entry, resolve credentials (pool reference from auth.json, env var, or explicit key)
4. If the selected entry fails with an auth, rate-limit, or billing/quota error, mark it exhausted and rotate to the next entry
5. Return the full runtime dict (`provider`, `api_key`, `base_url`, `api_mode`, `credential_pool`) just like a single-provider pool does today
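
Steps 3 and 5 could look roughly like the sketch below. Everything here is illustrative: the helper names are hypothetical, the precedence (explicit key, then env var, then auth.json pool) is an assumption, and auth.json is modelled as a simple mapping from pool key to a list of API keys, which may not match its real layout. The `credential_pool` field of the runtime dict is left out for brevity.

```python
from typing import Any, Mapping, Optional, Sequence


def resolve_api_key(
    entry: Mapping[str, Any],
    env: Mapping[str, str],
    auth_pools: Mapping[str, Sequence[str]],
) -> Optional[str]:
    """Resolve a key for one pool entry; precedence order is an assumption."""
    explicit = entry.get("api_key")
    if explicit:
        return explicit
    env_name = entry.get("api_key_env")
    if env_name:
        # An entry that names an env var does not fall back to auth.json.
        return env.get(env_name) or None
    # Fall back to a referenced pool, or the pool keyed by the provider name.
    pool_key = entry.get("pool") or entry["provider"]
    keys = auth_pools.get(pool_key) or ()
    return keys[0] if keys else None


def build_runtime(
    entry: Mapping[str, Any],
    env: Mapping[str, str],
    auth_pools: Mapping[str, Sequence[str]],
) -> Optional[dict]:
    """Return a runtime dict for the entry, or None if no key resolves."""
    api_key = resolve_api_key(entry, env, auth_pools)
    if api_key is None:
        return None
    return {
        "provider": entry["provider"],
        "api_key": api_key,
        "base_url": entry.get("base_url"),
        "api_mode": entry.get("api_mode"),
    }


def first_usable_runtime(entries, env, auth_pools) -> Optional[dict]:
    """Walk entries in order and return the first one with credentials."""
    for entry in entries:
        runtime = build_runtime(entry, env, auth_pools)
        if runtime is not None:
            return runtime
    return None
```

In real use `env` would be `os.environ`; passing it in as a mapping keeps the resolution logic easy to test.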

### Implementation approach

The core change is in `hermes_cli/runtime_provider.py` — when `resolve_runtime_provider()` receives a `pool:<name>` provider string:

- Load the pool definition from config
- Create a `MultiProviderPool` that wraps multiple `CredentialPool` instances (or inline entries)
- `MultiProviderPool.select()` iterates entries using the configured strategy
- On `mark_exhausted_and_rotate()`, advance to the next entry in the pool
- Each entry contributes a fully-resolved `provider`, `base_url`, `api_key`, `api_mode` tuple

The existing `CredentialPool` class stays unchanged — single-provider pools continue working as before. `MultiProviderPool` is additive.
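
A minimal sketch of what `MultiProviderPool` might look like follows. It only covers `fill_first` and `round_robin`, treats each entry as an already-resolved runtime dict, and does not wrap real `CredentialPool` instances; the method names mirror the ones mentioned above, but the internals are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class MultiProviderPool:
    """Rotates across already-resolved runtime dicts (illustrative only)."""
    entries: List[dict]
    strategy: str = "fill_first"
    _cursor: int = field(default=0, init=False, repr=False)
    _current: Optional[int] = field(default=None, init=False, repr=False)
    _exhausted: Set[int] = field(default_factory=set, init=False, repr=False)

    def select(self) -> Optional[dict]:
        """Return the next non-exhausted entry, or None if all are exhausted."""
        total = len(self.entries)
        for offset in range(total):
            index = (self._cursor + offset) % total
            if index in self._exhausted:
                continue
            # round_robin advances past the chosen entry; fill_first stays on it.
            if self.strategy == "round_robin":
                self._cursor = (index + 1) % total
            else:
                self._cursor = index
            self._current = index
            return self.entries[index]
        return None

    def mark_exhausted_and_rotate(self) -> Optional[dict]:
        """Mark the most recently selected entry exhausted and pick another."""
        if self._current is not None:
            self._exhausted.add(self._current)
        return self.select()
```

For example, with `fill_first` the pool keeps returning the first entry until `mark_exhausted_and_rotate()` is called, at which point it moves to the second. A fuller version would also need a way to clear exhausted entries once a rate limit resets.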

### Key challenges

- **Model compatibility**: A pool entry should declare which models it serves, so the agent does not route a `gpt-5.4` request to an Anthropic-only endpoint. This could reuse the existing `custom_providers.models` config pattern.
- **api_mode bridging**: Entries in the same pool may need different `api_mode` values (e.g., `chat_completions` for OpenAI-compatible, `anthropic_messages` for Claude endpoints). The pool entry must carry its own `api_mode`.
- **Error classification**: Not all errors should trigger pool rotation — only auth (401/403), rate-limit (429), and billing/quota (402) errors. Server errors (500/502/503) should use retry logic instead.
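
The error-classification and model-compatibility points above can be sketched as two small helpers. Both are hypothetical: the status codes come from the bullet above, while the per-entry `models` list and the choice to treat an entry without one as unrestricted are assumptions (a stricter variant could exclude such entries).

```python
from typing import Any, List, Mapping, Sequence

# Status codes from the proposal: rotate on auth/billing/rate-limit, retry on 5xx.
ROTATE_STATUSES = frozenset({401, 402, 403, 429})
RETRY_STATUSES = frozenset({500, 502, 503})


def classify_status(status: int) -> str:
    """Map an HTTP status to 'rotate', 'retry', or 'raise'."""
    if status in ROTATE_STATUSES:
        return "rotate"   # mark the entry exhausted and move to the next one
    if status in RETRY_STATUSES:
        return "retry"    # same entry, handled by existing retry logic
    return "raise"        # anything else is surfaced to the caller (assumption)


def entries_for_model(
    entries: Sequence[Mapping[str, Any]], model: str
) -> List[Mapping[str, Any]]:
    """Keep entries that declare the model, or declare no model list at all."""
    return [
        entry for entry in entries
        if not entry.get("models") or model in entry["models"]
    ]
```

Filtering by model before selection means the rotation strategy never has to know about model names.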

## Alternatives Considered

1. **Use `fallback_providers` for cross-provider failover** — This already exists but requires a full agent failure before trying the next provider. It does not pre-emptively rotate credentials or do smart selection. It is also a separate config that does not integrate with pool strategies.

2. **Multiple `custom_providers` entries pointing to different providers** — The current `custom_providers` list only defines endpoint metadata and model lists. It does not create a failover pool. Each custom provider still resolves to one pool key in `auth.json`.

3. **Write a custom hook that catches errors and re-resolves** — The hook system (`gateway/hooks.py`) fires `agent:end` events but cannot modify the in-flight request or switch providers mid-turn.

4. **Shell script wrapper** — An external tool cannot inject credentials into a running Hermes agent mid-session.

## Feature Type

Gateway / messaging improvement

## Scope

Medium (2-3 files, ~300-500 lines)

## Additional Context

I investigated the full credential pool and runtime provider resolution pipeline:

- `agent/credential_pool.py` — `CredentialPool` class is provider-scoped (keyed by provider name like `"openai-codex"`, `"custom:local"`)
- `agent/credential_pool.py:346` — `get_pool_strategy()` reads `credential_pool_strategies` from config, accepts `fill_first`, `round_robin`, `random`, `least_used`
- `hermes_cli/runtime_provider.py:704` — `resolve_runtime_provider()` loads a single pool via `load_pool(provider)` and returns a single `credential_pool` reference
- `hermes_cli/runtime_provider.py:170-230` — `_resolve_runtime_from_pool_entry()` builds the runtime dict with `provider`, `base_url`, `api_key`, `api_mode`, `credential_pool`
- `run_agent.py:611-1088` — `AIAgent.__init__` accepts `fallback_model` (single dict or list of provider dicts), but this is a separate failover chain, not pool-integrated
- Existing `provider_routing` config only controls OpenRouter routing preferences, not cross-provider pooling

The `custom:local` pool in our deployment uses `round_robin` strategy across 4 Ollama API keys, which works great for same-provider key rotation. Extending this concept across providers is the natural next step.

