Self-hosted Honcho with OpenAI-compatible providers (OpenRouter / vLLM): default AsyncOpenAI client lacks base_url → 401 against api.openai.com #641

@augustin-ship-it

Summary

When deploying self-hosted Honcho with an OpenAI-compatible provider (OpenRouter, vLLM, Together, Anyscale, etc.) — i.e. setting LLM_OPENAI_API_KEY to a non-OpenAI key — every dialectic / deriver / summary call fails with:

openai.AuthenticationError: Error code: 401 - {'error': {'message': 'Incorrect API key provided: sk-…',
  'type': 'invalid_request_error', 'code': 'invalid_api_key'}, 'status': 401}

The full SDK message (truncated above) points to platform.openai.com because the request goes to https://api.openai.com/v1 rather than to the configured provider.

Separately, nothing warns the operator when config.toml places PROVIDER / MODEL at the wrong nesting level; Honcho silently runs against the hardcoded default model.

Repro

.env

LLM_OPENAI_API_KEY=sk-or-v1-…              # OpenRouter key
LLM_VLLM_API_KEY=sk-or-v1-…
LLM_VLLM_BASE_URL=https://openrouter.ai/api/v1
LLM_EMBEDDING_API_KEY=sk-or-v1-…
LLM_EMBEDDING_BASE_URL=https://openrouter.ai/api/v1

config.toml excerpt (legacy flat shape, common in the wild):

[dialectic.levels.minimal]
PROVIDER = "vllm"
MODEL = "z-ai/glm-4.7-flash"

Then call peer.chat() / dialectic.chat(). Result: tenacity.RetryError wrapping AuthenticationError. After retry_attempts=3, the 401 surfaces to clients.

Root cause #1 — default OpenAI client lacks base_url

src/llm/registry.py:98

if settings.LLM.OPENAI_API_KEY:
    CLIENTS["openai"] = AsyncOpenAI(
        api_key=settings.LLM.OPENAI_API_KEY,
        # no base_url → SDK defaults to https://api.openai.com/v1
    )

LLMSettings (src/config.py:643) has OPENAI_API_KEY but no OPENAI_BASE_URL field, so any operator value is silently ignored.
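
To illustrate why the request ends up at api.openai.com, here is a minimal sketch that approximates how the OpenAI Python SDK picks a base URL when none is passed to the constructor. The helper name effective_base_url is hypothetical and not part of Honcho or the SDK; the resolution order (explicit argument, then the OPENAI_BASE_URL environment variable, then the public default) is my understanding of the SDK's behaviour.

```python
import os
from typing import Mapping, Optional

# Public endpoint the SDK falls back to when nothing else is configured.
OPENAI_DEFAULT_BASE_URL = "https://api.openai.com/v1"


def effective_base_url(
    explicit: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> str:
    """Approximate the base URL an OpenAI client would use (hypothetical helper)."""
    if explicit is not None:
        return explicit
    env_value = env.get("OPENAI_BASE_URL")
    return env_value if env_value is not None else OPENAI_DEFAULT_BASE_URL


# Environment from the repro: only the vLLM and embedding base URLs are set.
repro_env = {
    "LLM_VLLM_BASE_URL": "https://openrouter.ai/api/v1",
    "LLM_EMBEDDING_BASE_URL": "https://openrouter.ai/api/v1",
}
print(effective_base_url(None, repro_env))  # https://api.openai.com/v1

# With OPENAI_BASE_URL exported, the same call resolves to the provider.
patched_env = dict(repro_env, OPENAI_BASE_URL="https://openrouter.ai/api/v1")
print(effective_base_url(None, patched_env))  # https://openrouter.ai/api/v1
```

Because the "openai" client is built without base_url, the LLM_VLLM_BASE_URL value never enters this lookup, which matches the 401 from api.openai.com.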

Root cause #2 — flat PROVIDER/MODEL in config.toml silently fall back to default model

DialecticLevelSettings expects nested:

[dialectic.levels.minimal.model_config]
transport = "openai"
model = "z-ai/glm-4.7-flash"

But config.toml.example and most copy-pasted configs in the wild use the flat shape. When the parsed model_config is missing, Honcho falls back to _default_dialectic_levels(), which hardcodes model="gpt-5.4-mini". That model doesn't exist on any provider, so OpenRouter responds with 401 "User not found." No warning is logged.
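
A flat-shape section can be detected mechanically once the TOML is parsed into a dict. The sketch below is only an illustration: find_flat_model_sections is a hypothetical helper, not Honcho code, and it assumes that a table holding PROVIDER or MODEL directly, with no model_config sub-table, is the legacy shape described above.

```python
from typing import Any, Dict, List

# Keys that mark the legacy flat shape (taken from the repro config).
FLAT_KEYS = {"PROVIDER", "MODEL"}


def find_flat_model_sections(table: Dict[str, Any], path: str = "") -> List[str]:
    """Return dotted paths of tables that use flat PROVIDER/MODEL keys."""
    hits: List[str] = []
    if FLAT_KEYS.intersection(table) and "model_config" not in table:
        hits.append(path or "<root>")
    # Recurse into sub-tables so nested sections are checked too.
    for key, value in table.items():
        if isinstance(value, dict):
            child_path = f"{path}.{key}" if path else key
            hits.extend(find_flat_model_sections(value, child_path))
    return hits


# Parsed form of the flat excerpt from the repro.
flat_config = {
    "dialectic": {
        "levels": {
            "minimal": {"PROVIDER": "vllm", "MODEL": "z-ai/glm-4.7-flash"}
        }
    }
}
# Parsed form of the nested shape that DialecticLevelSettings expects.
nested_config = {
    "dialectic": {
        "levels": {
            "minimal": {
                "model_config": {
                    "transport": "openai",
                    "model": "z-ai/glm-4.7-flash",
                }
            }
        }
    }
}

print(find_flat_model_sections(flat_config))    # ['dialectic.levels.minimal']
print(find_flat_model_sections(nested_config))  # []
```

On Python 3.11+, tomllib.load() produces a dict of this form from config.toml, so the check can be run against a real file before starting the container.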

Workaround

  1. Add OPENAI_BASE_URL=https://openrouter.ai/api/v1 to .env (the OpenAI Python SDK reads this env var on AsyncOpenAI() instantiation).
  2. Reformat config.toml: replace each PROVIDER/MODEL pair with a nested [<section>.model_config] block (transport=, model=).
  3. Note: config.toml is baked into the Docker image, so docker compose restart won't pick up host-side edits. Use docker cp or mount it as a volume in docker-compose.yml.

Proposed fix

  1. Add OPENAI_BASE_URL to LLMSettings and pass it to AsyncOpenAI(base_url=…) in registry.py. Same for EmbeddingSettings.
  2. Validate config.toml shape on startup: if a [dialectic.levels.*] (or [deriver], [summary], [dream]) section contains PROVIDER/MODEL at top level (instead of nested model_config), log a WARNING and either auto-migrate or refuse to start.
  3. Mount config.toml as a volume in docker-compose.yml.example so operator edits survive restarts.
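
Item 1 could look roughly like the sketch below. LLMSettingsSketch and openai_client_kwargs are hypothetical names used for illustration; the real LLMSettings class in src/config.py will differ, and OPENAI_BASE_URL is the proposed new field, not an existing one.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class LLMSettingsSketch:
    """Stand-in for LLMSettings with the proposed extra field (hypothetical)."""

    OPENAI_API_KEY: Optional[str] = None
    OPENAI_BASE_URL: Optional[str] = None  # proposed: not present today


def openai_client_kwargs(settings: LLMSettingsSketch) -> Optional[Dict[str, str]]:
    """Build constructor kwargs for the OpenAI client, or None if no key is set."""
    if not settings.OPENAI_API_KEY:
        return None
    kwargs = {"api_key": settings.OPENAI_API_KEY}
    # Only pass base_url when the operator set one, so the SDK default
    # still applies for plain OpenAI deployments.
    if settings.OPENAI_BASE_URL:
        kwargs["base_url"] = settings.OPENAI_BASE_URL
    return kwargs


settings = LLMSettingsSketch(
    OPENAI_API_KEY="sk-or-v1-placeholder",  # placeholder, not a real key
    OPENAI_BASE_URL="https://openrouter.ai/api/v1",
)
print(openai_client_kwargs(settings))
```

In registry.py the result would then be unpacked into the existing constructor call, e.g. CLIENTS["openai"] = AsyncOpenAI(**kwargs), which accepts api_key and base_url as keyword arguments.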

Environment

  • Honcho: latest main as of May 2026
  • Deploy: docker compose self-hosted
  • Provider: OpenRouter (also reproduces with vLLM)
  • SDK: openai Python
