fix: reduce credential exhaustion TTL from 24 hours to 1 hour #6504
Merged
The 24-hour default cooldown for 402-exhausted credentials was far too aggressive — if a user tops up credits or the 402 was caused by an oversized max_tokens request rather than true billing exhaustion, they shouldn't have to wait a full day. Reduce to 1 hour (matching the existing 429 TTL). Inspired by PR #6493 (michalkomar).
Summary
Reduces `EXHAUSTED_TTL_DEFAULT_SECONDS` from 24 hours to 1 hour in the credential pool.

The 24-hour default was far too aggressive: transient 402s (e.g. oversized max_tokens budget vs remaining credits) would poison a credential for an entire day even though it's still perfectly usable for normal requests. 1 hour is enough cooldown for genuine billing exhaustion while recovering quickly from transient issues.
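To make the effect of the constant concrete, here is a minimal sketch of a TTL-based cooldown check. Only `EXHAUSTED_TTL_DEFAULT_SECONDS` and its new value come from this PR; `PoolEntry`, `exhausted_at`, and `is_available` are hypothetical names for illustration and are not taken from `agent/credential_pool.py`.

```python
import time
from dataclasses import dataclass
from typing import Optional

# New default from this PR: 1 hour (previously 24 * 60 * 60).
EXHAUSTED_TTL_DEFAULT_SECONDS = 60 * 60


@dataclass
class PoolEntry:
    """Hypothetical stand-in for a credential pool entry."""
    key: str
    exhausted_at: Optional[float] = None  # epoch seconds when the 402 was seen


def is_available(entry: PoolEntry,
                 now: Optional[float] = None,
                 ttl: float = EXHAUSTED_TTL_DEFAULT_SECONDS) -> bool:
    """Return True if the entry was never exhausted or its cooldown has elapsed."""
    if entry.exhausted_at is None:
        return True
    current = time.time() if now is None else now
    return current - entry.exhausted_at >= ttl
```

Under these assumptions, a credential marked exhausted at time `t` is skipped until `t + 3600` and becomes eligible again after that, instead of staying out of rotation until `t + 86400`.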
Changes
- `agent/credential_pool.py`: `EXHAUSTED_TTL_DEFAULT_SECONDS`: 24h → 1h
- `tests/agent/test_credential_pool.py`: Added `test_exhausted_402_entry_resets_after_one_hour`

Test plan
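The body of the added test is not shown on this page. As a rough illustration of what a test named `test_exhausted_402_entry_resets_after_one_hour` might check, the sketch below uses an injected fake clock. `FakePool`, `mark_exhausted`, and `available` are hypothetical and do not reflect the real credential pool API.

```python
EXHAUSTED_TTL_DEFAULT_SECONDS = 60 * 60  # value introduced by this PR


class FakePool:
    """Hypothetical pool that tracks when each exhausted key may be reused."""

    def __init__(self, clock):
        self._clock = clock            # callable returning the current time in seconds
        self._exhausted_until = {}     # key -> time at which the cooldown ends

    def mark_exhausted(self, key, status_code):
        # status_code is accepted only for readability; this sketch uses one TTL.
        self._exhausted_until[key] = self._clock() + EXHAUSTED_TTL_DEFAULT_SECONDS

    def available(self, key):
        until = self._exhausted_until.get(key)
        return until is None or self._clock() >= until


def test_exhausted_402_entry_resets_after_one_hour():
    now = [0.0]                        # mutable cell so the test can advance time
    pool = FakePool(clock=lambda: now[0])

    pool.mark_exhausted("cred-a", 402)
    assert not pool.available("cred-a")

    now[0] = EXHAUSTED_TTL_DEFAULT_SECONDS - 1
    assert not pool.available("cred-a")   # still cooling down just before the hour

    now[0] = EXHAUSTED_TTL_DEFAULT_SECONDS
    assert pool.available("cred-a")       # usable again once the hour has passed
```

Injecting the clock keeps the test deterministic and avoids sleeping for an hour. Whether the real test patches time the same way is an assumption.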
Inspired by #6493 (credit to @michalkomar for identifying the problem).