fix: reduce credential exhaustion TTL from 24 hours to 1 hour #6504

Merged: teknium1 merged 1 commit into main from hermes/hermes-2447adad on Apr 9, 2026

Conversation

teknium1 (Contributor) commented Apr 9, 2026

Summary

Reduces EXHAUSTED_TTL_DEFAULT_SECONDS from 24 hours to 1 hour in the credential pool.

The 24-hour default was far too aggressive: a transient 402 (e.g. an oversized max_tokens budget relative to the remaining credits) would poison a credential for an entire day even though it is still perfectly usable for normal requests. 1 hour is enough cooldown for genuine billing exhaustion while recovering quickly from transient issues.
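
To illustrate the cooldown behaviour described above, here is a minimal sketch; it is not the actual code in agent/credential_pool.py. Only the constant name EXHAUSTED_TTL_DEFAULT_SECONDS comes from this PR. The PoolEntry class, its fields, and the mark_exhausted / is_available methods are hypothetical names chosen for the example.

```python
import time
from dataclasses import dataclass
from typing import Optional

# Constant name taken from this PR; the value is the new 1-hour default.
EXHAUSTED_TTL_DEFAULT_SECONDS = 60 * 60


@dataclass
class PoolEntry:
    """Hypothetical credential entry; not the real class in credential_pool.py."""

    api_key: str
    exhausted_at: Optional[float] = None  # epoch seconds when marked exhausted

    def mark_exhausted(self, now: Optional[float] = None) -> None:
        # Record when the credential was flagged (e.g. after a 402 response).
        self.exhausted_at = time.time() if now is None else now

    def is_available(
        self,
        now: Optional[float] = None,
        ttl: float = EXHAUSTED_TTL_DEFAULT_SECONDS,
    ) -> bool:
        # A credential that was never flagged is always usable.
        if self.exhausted_at is None:
            return True
        current = time.time() if now is None else now
        # Usable again once the cooldown window has fully elapsed.
        return current - self.exhausted_at >= ttl
```

Under these assumptions, an entry flagged at time t is skipped until t + 3600 seconds. With the old 24-hour value the same entry would have been skipped until t + 86400 seconds, even if the 402 was caused by a single oversized request.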

Changes

  • agent/credential_pool.py: EXHAUSTED_TTL_DEFAULT_SECONDS: 24h → 1h
  • tests/agent/test_credential_pool.py: Added test_exhausted_402_entry_resets_after_one_hour
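
The added test itself is not shown on this page. A test in the same spirit might look roughly like the sketch below, which uses an injectable clock instead of sleeping. FakePool and its methods are hypothetical stand-ins, not the real credential pool API; only EXHAUSTED_TTL_DEFAULT_SECONDS is a name from this PR.

```python
EXHAUSTED_TTL_DEFAULT_SECONDS = 3600  # new default from this PR


class FakePool:
    """Hypothetical stand-in for the credential pool, with an injectable clock."""

    def __init__(self, keys, clock):
        self._clock = clock
        self._exhausted_at = {key: None for key in keys}

    def mark_exhausted(self, key):
        # Remember when the key was flagged, according to the injected clock.
        self._exhausted_at[key] = self._clock()

    def available(self):
        # Keys that were never flagged, or whose cooldown has elapsed.
        now = self._clock()
        return [
            key
            for key, flagged_at in self._exhausted_at.items()
            if flagged_at is None or now - flagged_at >= EXHAUSTED_TTL_DEFAULT_SECONDS
        ]


def test_exhausted_entry_resets_after_one_hour():
    now = [0.0]  # mutable cell so the test can advance time
    pool = FakePool(["key-a"], clock=lambda: now[0])

    pool.mark_exhausted("key-a")  # e.g. after a 402 response
    now[0] = 3599.0
    assert pool.available() == []  # still cooling down

    now[0] = 3600.0
    assert pool.available() == ["key-a"]  # usable again after one hour
```

Injecting the clock keeps the test fast and deterministic; whether the real test does this or patches time directly is not stated here.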

Test plan

python -m pytest tests/agent/test_credential_pool.py -q
python -m pytest tests/agent/test_credential_pool_routing.py tests/run_agent/test_run_agent.py::TestCredentialPoolRecovery -q

Inspired by #6493 (credit to @michalkomar for identifying the problem).

The 24-hour default cooldown for 402-exhausted credentials was far too
aggressive — if a user tops up credits or the 402 was caused by an
oversized max_tokens request rather than true billing exhaustion, they
shouldn't have to wait a full day. Reduce to 1 hour (matching the
existing 429 TTL).

Inspired by PR #6493 (michalkomar).
teknium1 merged commit b408379 into main on Apr 9, 2026
2 of 4 checks passed
Tommyeds pushed a commit to Tommyeds/hermes-agent that referenced this pull request Apr 12, 2026
angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request Apr 27, 2026
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
olympus-terminal pushed a commit to olympus-terminal/hermes-agent that referenced this pull request May 16, 2026
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026