
fix: misleading error when auxiliary provider credential pool is exhausted #21428

Open
li0near wants to merge 1 commit into NousResearch:main from li0near:fix/auxiliary-exhausted-pool-error-message

li0near commented May 7, 2026

Contributor

Problem

When a provider's credential pool has all entries marked exhausted (e.g. from a prior 401 against a proxy), auxiliary tasks (compression, title generation, vision, etc.) fail with:

Provider 'anthropic' is set in config.yaml but no API key was found.
Set the ANTHROPIC_API_KEY environment variable, or switch to a different provider with `hermes model`.

This is misleading: the API key is configured and loads correctly, but a stale exhaustion marker in the credential pool blocks the client from ever attempting to use it. The user has no way to diagnose this without inspecting auth.json manually.

Root Cause

_try_anthropic() silently returns (None, None) when _select_pool_entry() returns (True, None), meaning the pool exists but every entry is exhausted. The downstream call_llm() / async_call_llm() then raises a generic "no API key" error that points to the wrong fix.

Fix

  1. _try_anthropic(): Add logger.warning() when pool exists but all entries are exhausted, mentioning hermes auth reset anthropic as the resolution.
  2. call_llm() and async_call_llm(): Before raising the generic "no API key" RuntimeError, check if the provider has an exhausted pool. If so, raise a more accurate error pointing to hermes auth reset <provider>.

After

Provider 'anthropic' credential pool is exhausted (all entries marked failed).
Run `hermes auth reset anthropic` to clear exhaustion state and retry.

Reproduction

  1. Configure auxiliary.title_generation.provider: anthropic with a custom proxy
  2. Trigger a 401 (e.g. proxy rejects key format on first connection)
  3. Pool entry gets marked exhausted permanently in auth.json
  4. All subsequent sessions show the misleading "no API key" warning for every auxiliary task

Related to #10476 (silent provider fallback) but distinct: this PR is about error message accuracy, not fallback notification.

alt-glitch added the type/bug, P2, comp/agent, and area/auth labels on May 7, 2026
… exhausted

When a provider's credential pool has all entries marked exhausted (e.g.
from a prior 401), the auxiliary client returns None and the downstream
error message incorrectly tells the user 'no API key was found' — pointing
them toward missing env vars when the real issue is stale pool state.

This change:
- Adds a logger.warning in _try_anthropic() when pool exists but all
  entries are exhausted, mentioning `hermes auth reset` as the fix.
- In both call_llm() and async_call_llm(), checks for exhausted pool
  before raising the generic 'no API key' RuntimeError, and raises a
  more accurate message pointing to `hermes auth reset <provider>`.

Fixes the case where a one-time auth failure (e.g. proxy rejecting a
key on first run) permanently poisons the credential pool and all
subsequent auxiliary calls (compression, title generation, vision, etc.)
fail with misleading error messages.
