Credential pool: no exponential backoff on repeated exhaustion — flat TTL causes 429 retry loops #15296

@mordekai-lab

Summary

The credential pool uses a flat TTL for exhaustion cooldown regardless of how many consecutive times a credential has failed. When a provider is overloaded for an extended period (hours), the pool cycles through a predictable loop:

  1. TTL expires → credential cleared to ok
  2. Provider tried again → fails with same error (3 retries wasted)
  3. Credential re-exhausted with same flat TTL
  4. Wait for TTL to expire again → repeat

This is particularly painful for cron jobs (each run = one turn), which burn retries on every execution until the provider recovers.
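To put a rough number on the loop above, here is a minimal sketch that counts the retries burned during an outage under a flat cooldown. It assumes a request arrives as soon as each cooldown expires and that every failed attempt costs 3 retries (the figure from step 2); `wasted_retries_flat` is a hypothetical helper, not part of the codebase.

```python
def wasted_retries_flat(
    outage_seconds: int,
    ttl_seconds: int = 3600,
    retries_per_attempt: int = 3,
) -> int:
    """Count retries spent against a down provider under a flat cooldown."""
    wasted = 0
    t = 0  # time of each failed attempt, measured from the start of the outage
    while t < outage_seconds:
        wasted += retries_per_attempt  # attempt fails, all retries are burned
        t += ttl_seconds               # credential sits out one flat TTL
    return wasted


# An 8-hour outage with a 1-hour flat TTL: 8 failed attempts, 3 retries each.
print(wasted_retries_flat(8 * 3600))  # 24
```

The wasted work grows linearly with outage length, since the cooldown never widens.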

Affected Code

agent/credential_pool.py, _exhausted_ttl() (line ~190):

def _exhausted_ttl(error_code: Optional[int]) -> int:
    if error_code == 429:
        return EXHAUSTED_TTL_429_SECONDS  # always 3600s
    return EXHAUSTED_TTL_DEFAULT_SECONDS  # always 3600s

The TTL is always 3600s regardless of failure count. No exponential backoff exists.

Proposed Fix

Track consecutive failure count on PooledCredential entries and apply exponential backoff:

  • Add consecutive_failures: int = 0 to the PooledCredential dataclass
  • Increment in _mark_exhausted(), reset to 0 when clearing back to ok
  • Update _exhausted_ttl() to return base_ttl * min(2^(failures-1), 8), i.e. the multiplier doubles per consecutive failure and is capped at 8x (8 hours with the current 3600s base)

Example progression: 1h → 2h → 4h → 8h → 8h (cap)

This way, a transient 429 still gets a short 1h cooldown, but sustained outages back off progressively instead of hammering the provider every hour.

Environment

  • Hermes-agent latest (commit HEAD)
  • Gateway mode with credential pool enabled
  • Observed with a provider returning server-side 429 overload errors

Metadata

    Labels

    P3: Low (cosmetic, nice to have)
    comp/agent: Core agent loop, run_agent.py, prompt builder
    type/feature: New feature or request
