Skip to content

fix: auxiliary client uses placeholder key for local servers without auth#3842

Merged
teknium1 merged 1 commit into
mainfrom
hermes/hermes-3d9ccfec
Mar 30, 2026
Merged

fix: auxiliary client uses placeholder key for local servers without auth#3842
teknium1 merged 1 commit into
mainfrom
hermes/hermes-3d9ccfec

Conversation

@teknium1

Copy link
Copy Markdown
Contributor

Problem

Users running local inference servers (Ollama, llama.cpp, vLLM, LM Studio) without any cloud API keys get broken auxiliary operations — compression, summarization, and memory flush all fail because the auxiliary client skips their local server.

Root cause: _resolve_custom_runtime() in auxiliary_client.py requires both a base_url AND a non-empty api_key. Local servers don't need auth, so the key is empty → the function returns (None, None) → the auto-detection chain exhausts all options → None → timeouts and errors.

The main CLI already fixed this in PR #2556 with a "no-key-required" placeholder, but the auxiliary client's resolution path was never updated.

Symptoms in gateway logs:

WARNING resolve_provider_client: openrouter requested but OPENROUTER_API_KEY not set
WARNING Failed to generate context summary: Request timed out.
WARNING Session summarization failed after 3 attempts: Request timed out.

Fix

  • _resolve_custom_runtime(): use "no-key-required" placeholder when base_url is present but key is empty (matches cli.py pattern)
  • resolve_provider_client() custom branch: same placeholder fallback for explicit_base_url without explicit_api_key
  • Updated 2 tests that expected the old (broken) reject behavior

Test plan

…auth

Local inference servers (Ollama, llama.cpp, vLLM, LM Studio) don't
require API keys, but the auxiliary client's _resolve_custom_runtime()
rejected endpoints with empty keys — causing the auto-detection chain
to skip the user's local server entirely.  This broke compression,
summarization, and memory flush for users running local models without
an OpenRouter/cloud API key.

The main CLI already had this fix (PR #2556, 'no-key-required'
placeholder), but the auxiliary client's resolution path was missed.

Two fixes:
- _resolve_custom_runtime(): use 'no-key-required' placeholder instead
  of returning None when base_url is present but key is empty
- resolve_provider_client() custom branch: same placeholder fallback
  for explicit_base_url without explicit_api_key

Updates 2 tests that expected the old (broken) behavior.
@teknium1 teknium1 merged commit 3cc5053 into main Mar 30, 2026
2 of 3 checks passed
angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request Apr 27, 2026
…auth (NousResearch#3842)

Local inference servers (Ollama, llama.cpp, vLLM, LM Studio) don't
require API keys, but the auxiliary client's _resolve_custom_runtime()
rejected endpoints with empty keys — causing the auto-detection chain
to skip the user's local server entirely.  This broke compression,
summarization, and memory flush for users running local models without
an OpenRouter/cloud API key.

The main CLI already had this fix (PR NousResearch#2556, 'no-key-required'
placeholder), but the auxiliary client's resolution path was missed.

Two fixes:
- _resolve_custom_runtime(): use 'no-key-required' placeholder instead
  of returning None when base_url is present but key is empty
- resolve_provider_client() custom branch: same placeholder fallback
  for explicit_base_url without explicit_api_key

Updates 2 tests that expected the old (broken) behavior.
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
…auth (NousResearch#3842)

Local inference servers (Ollama, llama.cpp, vLLM, LM Studio) don't
require API keys, but the auxiliary client's _resolve_custom_runtime()
rejected endpoints with empty keys — causing the auto-detection chain
to skip the user's local server entirely.  This broke compression,
summarization, and memory flush for users running local models without
an OpenRouter/cloud API key.

The main CLI already had this fix (PR NousResearch#2556, 'no-key-required'
placeholder), but the auxiliary client's resolution path was missed.

Two fixes:
- _resolve_custom_runtime(): use 'no-key-required' placeholder instead
  of returning None when base_url is present but key is empty
- resolve_provider_client() custom branch: same placeholder fallback
  for explicit_base_url without explicit_api_key

Updates 2 tests that expected the old (broken) behavior.
olympus-terminal pushed a commit to olympus-terminal/hermes-agent that referenced this pull request May 16, 2026
…auth (NousResearch#3842)

Local inference servers (Ollama, llama.cpp, vLLM, LM Studio) don't
require API keys, but the auxiliary client's _resolve_custom_runtime()
rejected endpoints with empty keys — causing the auto-detection chain
to skip the user's local server entirely.  This broke compression,
summarization, and memory flush for users running local models without
an OpenRouter/cloud API key.

The main CLI already had this fix (PR NousResearch#2556, 'no-key-required'
placeholder), but the auxiliary client's resolution path was missed.

Two fixes:
- _resolve_custom_runtime(): use 'no-key-required' placeholder instead
  of returning None when base_url is present but key is empty
- resolve_provider_client() custom branch: same placeholder fallback
  for explicit_base_url without explicit_api_key

Updates 2 tests that expected the old (broken) behavior.
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…auth (NousResearch#3842)

Local inference servers (Ollama, llama.cpp, vLLM, LM Studio) don't
require API keys, but the auxiliary client's _resolve_custom_runtime()
rejected endpoints with empty keys — causing the auto-detection chain
to skip the user's local server entirely.  This broke compression,
summarization, and memory flush for users running local models without
an OpenRouter/cloud API key.

The main CLI already had this fix (PR NousResearch#2556, 'no-key-required'
placeholder), but the auxiliary client's resolution path was missed.

Two fixes:
- _resolve_custom_runtime(): use 'no-key-required' placeholder instead
  of returning None when base_url is present but key is empty
- resolve_provider_client() custom branch: same placeholder fallback
  for explicit_base_url without explicit_api_key

Updates 2 tests that expected the old (broken) behavior.
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…auth (NousResearch#3842)

Local inference servers (Ollama, llama.cpp, vLLM, LM Studio) don't
require API keys, but the auxiliary client's _resolve_custom_runtime()
rejected endpoints with empty keys — causing the auto-detection chain
to skip the user's local server entirely.  This broke compression,
summarization, and memory flush for users running local models without
an OpenRouter/cloud API key.

The main CLI already had this fix (PR NousResearch#2556, 'no-key-required'
placeholder), but the auxiliary client's resolution path was missed.

Two fixes:
- _resolve_custom_runtime(): use 'no-key-required' placeholder instead
  of returning None when base_url is present but key is empty
- resolve_provider_client() custom branch: same placeholder fallback
  for explicit_base_url without explicit_api_key

Updates 2 tests that expected the old (broken) behavior.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant