Bug: custom provider config is ignored and gateway does not hit custom base URL
Version: v2026.4.3 (pinned tag)
Install: pip install -e ".[all]"
Runtime: Docker (python:3.11-slim-bookworm)
Summary
When Hermes is configured to use a custom OpenAI-compatible endpoint (http://localhost:4000 via LiteLLM), gateway chat requests do not reach that endpoint. I can reproduce this with (1) CLI config values, (2) direct config file values, and (3) HERMES_INFERENCE_PROVIDER=custom.
Minimal repro
- Start LiteLLM on localhost:4000 with request logging enabled.
- In the same container/session:
whoami
echo "$HOME"
hermes config set model.provider custom
hermes config set model.api_base http://localhost:4000
hermes config set model.api_key my-key
hermes config set model.default default
hermes config set model.name default
cat ~/.hermes/config.yaml
hermes config get model.provider
export HERMES_INFERENCE_PROVIDER=custom
export OPENAI_BASE_URL=http://localhost:4000
export OPENAI_API_KEY=my-key
hermes gateway run
- In another terminal, send one request:
curl -sS -X POST http://localhost:8642/v1/chat/completions \
-H "Authorization: Bearer test-admin-key" \
-H "Content-Type: application/json" \
-d '{"model":"default","messages":[{"role":"user","content":"hi"}]}'
Actual behavior
- hermes config set ... reports success, and ~/.hermes/config.yaml contains the expected model.provider: custom and model.api_base: http://localhost:4000.
- hermes config get model.provider returns empty in this state.
- Gateway requests fail with a 401 (Missing Authentication header) and LiteLLM receives zero incoming completion requests.
Gateway log excerpt:
[tool] (°ロ°) mulling...
[done] (╥_╥) error, retrying... (1.0s)
ERROR root: Non-retryable client error: Error code: 401 - {'error': {'message': 'Missing Authentication header', 'code': 401}}
Expected behavior
With model.provider=custom and model.api_base=http://localhost:4000, gateway requests should be sent to the configured custom endpoint (/v1/chat/completions) with the configured API key.
Notes
- I can work around this by bypassing Hermes gateway and calling LiteLLM directly.
- Happy to run additional debug logging if you can point me to the best place to inspect resolved runtime provider selection.
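For comparison with the gateway request above, here is a rough sketch of the direct-to-LiteLLM call mentioned in the workaround note. It assumes LiteLLM serves the OpenAI-compatible route at /v1/chat/completions and accepts the key as a Bearer token. The names LITELLM_BASE, LITELLM_KEY, chat_url, and direct_chat are made up for this example and are not part of Hermes or LiteLLM.

```shell
# Hypothetical helper variables; they default to the values used in the repro.
LITELLM_BASE="${OPENAI_BASE_URL:-http://localhost:4000}"
LITELLM_KEY="${OPENAI_API_KEY:-my-key}"

# Print the full chat-completions URL, dropping one trailing slash from the base
# (assumes the base URL does not already end in /v1).
chat_url() {
  printf '%s/v1/chat/completions\n' "${LITELLM_BASE%/}"
}

# Send the same payload as the gateway repro, but straight to LiteLLM.
direct_chat() {
  curl -sS -X POST "$(chat_url)" \
    -H "Authorization: Bearer ${LITELLM_KEY}" \
    -H "Content-Type: application/json" \
    -d '{"model":"default","messages":[{"role":"user","content":"hi"}]}'
}
```

Running direct_chat while LiteLLM request logging is on should produce a log entry on the LiteLLM side. If the same request sent through the gateway produces none, that supports the request never leaving for the configured base URL.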