fix: avoid poisoning credentials on soft 402 errors #6493

Closed
michalkomar wants to merge 1 commit into NousResearch:main from michalkomar:fix/auxiliary-402-affordability

@michalkomar

Summary:

  • cap default auxiliary output budgets to avoid oversized helper requests
  • treat OpenRouter affordability 402 errors as soft failures instead of exhausting the credential for 24h
  • add regression tests for both behaviors
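
The second bullet can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `SOFT_402_MARKERS`, `is_soft_402`, `Credential`, and `handle_402` are hypothetical names, and the marker phrases are assumptions about how an affordability 402 might be worded, so they would need checking against real provider responses.

```python
from dataclasses import dataclass

# Assumed marker phrases for an "affordability" 402 (request too large for the
# remaining balance). Hypothetical; real provider wording may differ.
SOFT_402_MARKERS = ("can only afford", "fewer max_tokens")


def is_soft_402(status_code: int, message: str) -> bool:
    """Return True when a 402 looks like a per-request affordability failure."""
    if status_code != 402:
        return False
    lowered = message.lower()
    return any(marker in lowered for marker in SOFT_402_MARKERS)


@dataclass
class Credential:
    """Hypothetical stand-in for a pooled API credential."""
    key: str
    exhausted_until: float = 0.0  # Unix timestamp; 0.0 means "not exhausted"


def handle_402(cred: Credential, message: str, now: float,
               cooldown_seconds: float = 24 * 3600) -> bool:
    """Mark the credential exhausted only for a hard 402.

    Returns True if the credential was marked exhausted, False if the error
    was treated as a soft failure and the credential was left usable.
    """
    if is_soft_402(402, message):
        return False
    cred.exhausted_until = now + cooldown_seconds
    return True
```

With this shape, a soft 402 surfaces to the caller as an ordinary request failure (for example, to retry with a smaller `max_tokens`), while the credential stays in rotation.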

Validation:

  • python -m pytest tests/agent/test_auxiliary_client.py -q
  • python -m pytest tests/run_agent/test_run_agent.py -q

Note:

  • Full tests/ run on this machine hit unrelated 'too many open files' errors outside the changed paths.

teknium1 added a commit that referenced this pull request Apr 9, 2026
The 24-hour default cooldown for 402-exhausted credentials was far too
aggressive — if a user tops up credits or the 402 was caused by an
oversized max_tokens request rather than true billing exhaustion, they
shouldn't have to wait a full day. Reduce to 1 hour (matching the
existing 429 TTL).

Inspired by PR #6493 (michalkomar).
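
As a rough sketch of the change this commit message describes (not the actual diff): the names `EXHAUSTED_TTL_SECONDS`, `exhausted_until`, and `is_available` are hypothetical, and only the 1-hour values for 402 and 429 come from the message above.

```python
from typing import Optional

# Cooldown per HTTP status, in seconds. Per the commit message, 402 drops from
# 24 hours to 1 hour so that it matches the existing 429 TTL.
EXHAUSTED_TTL_SECONDS = {
    402: 60 * 60,  # previously 24 * 60 * 60
    429: 60 * 60,
}


def exhausted_until(status_code: int, now: float) -> Optional[float]:
    """Return the timestamp when the credential becomes usable again.

    Returns None for status codes that do not exhaust a credential.
    """
    ttl = EXHAUSTED_TTL_SECONDS.get(status_code)
    return None if ttl is None else now + ttl


def is_available(until: Optional[float], now: float) -> bool:
    """A credential is usable if it was never exhausted or its cooldown ended."""
    return until is None or now >= until
```

Under these assumptions, a user who tops up credits after a 402 waits at most one hour before the credential is tried again, instead of a full day.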
@teknium1

teknium1 commented Apr 9, 2026

Contributor

Closed in favor of PR #6504, which reduces the credential exhaustion TTL from 24 hours to 1 hour — the core issue you identified. The max_tokens capping approach wasn't the right fix (auxiliary tasks like compression legitimately need large output budgets), but your observation that credentials get poisoned too aggressively on 402s was spot on. Thanks @michalkomar!

@teknium1 teknium1 closed this Apr 9, 2026
Tommyeds pushed a commit to Tommyeds/hermes-agent that referenced this pull request Apr 12, 2026
…search#6504)

angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request Apr 27, 2026
…search#6504)

angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request Apr 28, 2026
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
…search#6504)

olympus-terminal pushed a commit to olympus-terminal/hermes-agent that referenced this pull request May 16, 2026
…search#6504)

gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…search#6504)

Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…search#6504)

2 participants