Summary
The credential pool uses a flat TTL for exhaustion cooldown regardless of how many consecutive times a credential has failed. When a provider is overloaded for an extended period (hours), the pool cycles through a predictable loop:
- TTL expires → credential cleared to ok
- Provider tried again → fails with same error (3 retries wasted)
- Credential re-exhausted with same flat TTL
- Wait for TTL to expire again → repeat
This is particularly painful for cron jobs (each run = one turn), which burn retries on every execution until the provider recovers.
Affected Code
agent/credential_pool.py — _exhausted_ttl() (line ~190):
def _exhausted_ttl(error_code: Optional[int]) -> int:
    if error_code == 429:
        return EXHAUSTED_TTL_429_SECONDS  # always 3600s
    return EXHAUSTED_TTL_DEFAULT_SECONDS  # always 3600s
Both branches return 3600s, so the TTL is the same regardless of error code or failure count. No exponential backoff exists.
Proposed Fix
Track consecutive failure count on PooledCredential entries and apply exponential backoff:
- Add consecutive_failures: int = 0 to the PooledCredential dataclass
- Increment it in _mark_exhausted(), and reset it to 0 when clearing back to ok
- Update _exhausted_ttl() to use base_ttl * min(2^(failures-1), 8), which caps the cooldown at 8 hours for the current 1-hour base TTL
Example progression: 1h → 2h → 4h → 8h → 8h (cap)
This way, a transient 429 still gets a short 1h cooldown, but sustained outages back off progressively instead of hammering the provider every hour.
Environment
- Hermes-agent latest (commit HEAD)
- Gateway mode with credential pool enabled
- Observed with a provider returning server-side 429 overload errors