Skip to content

fix(cli): allow custom/local endpoints without API key#2556

Merged
teknium1 merged 1 commit into
mainfrom
hermes/hermes-fdcb4c4a
Mar 22, 2026
Merged

fix(cli): allow custom/local endpoints without API key#2556
teknium1 merged 1 commit into
mainfrom
hermes/hermes-fdcb4c4a

Conversation

@teknium1

Copy link
Copy Markdown
Contributor

Problem

Users running local LLM servers (llama.cpp, ollama, vLLM) get:

Provider resolver returned an empty API key.

This broke after the runtime provider resolver was tightened to enforce non-empty API keys. Local servers don't need auth.

Fix

In _ensure_runtime_credentials(): when a custom base_url IS configured but no API key was found, use a placeholder ("no-key-required") instead of failing. Local servers ignore the Authorization header.

Only applies when the base URL is NOT an OpenRouter URL (those always need real keys).

Reported by @ThatWolfieGuy — llama.cpp with Qwen 3.5 stopped working after updating to v0.4.0.

Local LLM servers (llama.cpp, ollama, vLLM, etc.) typically don't
require authentication. When a custom base_url is configured but no
API key is found, use a placeholder instead of failing with
'Provider resolver returned an empty API key.'

The OpenAI SDK accepts any string as api_key, and local servers
simply ignore the Authorization header.

Fixes issue reported by @ThatWolfieGuy — llama.cpp stopped working
after updating because the new runtime provider resolver enforces
non-empty API keys even for keyless local endpoints.
@teknium1 teknium1 merged commit 5ddb6a1 into main Mar 22, 2026
1 check passed
@promptengines

Copy link
Copy Markdown

lol just what i was looking for

teknium1 added a commit that referenced this pull request Mar 30, 2026
…auth

Local inference servers (Ollama, llama.cpp, vLLM, LM Studio) don't
require API keys, but the auxiliary client's _resolve_custom_runtime()
rejected endpoints with empty keys — causing the auto-detection chain
to skip the user's local server entirely.  This broke compression,
summarization, and memory flush for users running local models without
an OpenRouter/cloud API key.

The main CLI already had this fix (PR #2556, 'no-key-required'
placeholder), but the auxiliary client's resolution path was missed.

Two fixes:
- _resolve_custom_runtime(): use 'no-key-required' placeholder instead
  of returning None when base_url is present but key is empty
- resolve_provider_client() custom branch: same placeholder fallback
  for explicit_base_url without explicit_api_key

Updates 2 tests that expected the old (broken) behavior.
teknium1 added a commit that referenced this pull request Mar 30, 2026
…auth (#3842)

Local inference servers (Ollama, llama.cpp, vLLM, LM Studio) don't
require API keys, but the auxiliary client's _resolve_custom_runtime()
rejected endpoints with empty keys — causing the auto-detection chain
to skip the user's local server entirely.  This broke compression,
summarization, and memory flush for users running local models without
an OpenRouter/cloud API key.

The main CLI already had this fix (PR #2556, 'no-key-required'
placeholder), but the auxiliary client's resolution path was missed.

Two fixes:
- _resolve_custom_runtime(): use 'no-key-required' placeholder instead
  of returning None when base_url is present but key is empty
- resolve_provider_client() custom branch: same placeholder fallback
  for explicit_base_url without explicit_api_key

Updates 2 tests that expected the old (broken) behavior.
angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request Apr 27, 2026
…fdcb4c4a

fix(cli): allow custom/local endpoints without API key
angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request Apr 27, 2026
…auth (NousResearch#3842)

Local inference servers (Ollama, llama.cpp, vLLM, LM Studio) don't
require API keys, but the auxiliary client's _resolve_custom_runtime()
rejected endpoints with empty keys — causing the auto-detection chain
to skip the user's local server entirely.  This broke compression,
summarization, and memory flush for users running local models without
an OpenRouter/cloud API key.

The main CLI already had this fix (PR NousResearch#2556, 'no-key-required'
placeholder), but the auxiliary client's resolution path was missed.

Two fixes:
- _resolve_custom_runtime(): use 'no-key-required' placeholder instead
  of returning None when base_url is present but key is empty
- resolve_provider_client() custom branch: same placeholder fallback
  for explicit_base_url without explicit_api_key

Updates 2 tests that expected the old (broken) behavior.
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
…fdcb4c4a

fix(cli): allow custom/local endpoints without API key
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
…auth (NousResearch#3842)

Local inference servers (Ollama, llama.cpp, vLLM, LM Studio) don't
require API keys, but the auxiliary client's _resolve_custom_runtime()
rejected endpoints with empty keys — causing the auto-detection chain
to skip the user's local server entirely.  This broke compression,
summarization, and memory flush for users running local models without
an OpenRouter/cloud API key.

The main CLI already had this fix (PR NousResearch#2556, 'no-key-required'
placeholder), but the auxiliary client's resolution path was missed.

Two fixes:
- _resolve_custom_runtime(): use 'no-key-required' placeholder instead
  of returning None when base_url is present but key is empty
- resolve_provider_client() custom branch: same placeholder fallback
  for explicit_base_url without explicit_api_key

Updates 2 tests that expected the old (broken) behavior.
olympus-terminal pushed a commit to olympus-terminal/hermes-agent that referenced this pull request May 16, 2026
…fdcb4c4a

fix(cli): allow custom/local endpoints without API key
olympus-terminal pushed a commit to olympus-terminal/hermes-agent that referenced this pull request May 16, 2026
…auth (NousResearch#3842)

Local inference servers (Ollama, llama.cpp, vLLM, LM Studio) don't
require API keys, but the auxiliary client's _resolve_custom_runtime()
rejected endpoints with empty keys — causing the auto-detection chain
to skip the user's local server entirely.  This broke compression,
summarization, and memory flush for users running local models without
an OpenRouter/cloud API key.

The main CLI already had this fix (PR NousResearch#2556, 'no-key-required'
placeholder), but the auxiliary client's resolution path was missed.

Two fixes:
- _resolve_custom_runtime(): use 'no-key-required' placeholder instead
  of returning None when base_url is present but key is empty
- resolve_provider_client() custom branch: same placeholder fallback
  for explicit_base_url without explicit_api_key

Updates 2 tests that expected the old (broken) behavior.
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…fdcb4c4a

fix(cli): allow custom/local endpoints without API key
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…auth (NousResearch#3842)

Local inference servers (Ollama, llama.cpp, vLLM, LM Studio) don't
require API keys, but the auxiliary client's _resolve_custom_runtime()
rejected endpoints with empty keys — causing the auto-detection chain
to skip the user's local server entirely.  This broke compression,
summarization, and memory flush for users running local models without
an OpenRouter/cloud API key.

The main CLI already had this fix (PR NousResearch#2556, 'no-key-required'
placeholder), but the auxiliary client's resolution path was missed.

Two fixes:
- _resolve_custom_runtime(): use 'no-key-required' placeholder instead
  of returning None when base_url is present but key is empty
- resolve_provider_client() custom branch: same placeholder fallback
  for explicit_base_url without explicit_api_key

Updates 2 tests that expected the old (broken) behavior.
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…fdcb4c4a

fix(cli): allow custom/local endpoints without API key
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…auth (NousResearch#3842)

Local inference servers (Ollama, llama.cpp, vLLM, LM Studio) don't
require API keys, but the auxiliary client's _resolve_custom_runtime()
rejected endpoints with empty keys — causing the auto-detection chain
to skip the user's local server entirely.  This broke compression,
summarization, and memory flush for users running local models without
an OpenRouter/cloud API key.

The main CLI already had this fix (PR NousResearch#2556, 'no-key-required'
placeholder), but the auxiliary client's resolution path was missed.

Two fixes:
- _resolve_custom_runtime(): use 'no-key-required' placeholder instead
  of returning None when base_url is present but key is empty
- resolve_provider_client() custom branch: same placeholder fallback
  for explicit_base_url without explicit_api_key

Updates 2 tests that expected the old (broken) behavior.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants