Skip to content

[Bug] Provider auto-detection ignores config.yaml — stale API keys override explicit custom endpoint #4171

@SHL0MS

Description

@SHL0MS

Problem

Users who set their base model to a local LLM (e.g., Qwen 3.5 35B) are surprised to see OpenAI/Codex API calls burning tokens. The cause: the auxiliary model system for background tasks (compression, vision, smart approval, memory flush, session search) auto-resolves independently of the base model.

The auto-detection chain at `auxiliary_client.py:759`:
```
OpenRouter → Nous Portal → Custom endpoint → Codex → API key provider → None
```

If the user has any of these API keys set from a previous setup (`OPENROUTER_API_KEY`, `OPENAI_API_KEY`, Codex OAuth token), the auxiliary system silently uses them — even though the user explicitly configured a local model as their base.

User experience

  1. User runs `hermes setup`, selects local Qwen 3.5 35B as their model
  2. User also has `OPENROUTER_API_KEY` set in `.env` from a previous configuration
  3. User starts chatting — base model uses local Qwen correctly
  4. Context compression fires → silently routes to OpenRouter → burns tokens
  5. Vision preprocessing fires → silently routes to OpenRouter → burns more tokens
  6. User has no idea this is happening until they check their API billing

Suggested fixes

Minimal: Surface auxiliary routing in /usage

Add a line to `/usage` output showing which provider is being used for auxiliary tasks:
```
Auxiliary model: openrouter/google/gemini-3-flash (auto-detected)
```

Better: Warn on mismatch

If the base model is local (custom endpoint) but auxiliary auto-resolved to a paid provider, emit a one-time warning:
```
⚠️ Auxiliary tasks (compression, vision) are using OpenRouter.
To route everything through your local model, add to config.yaml:
auxiliary:
compression:
provider: custom
base_url: http://localhost:1234/v1
```

Best: `hermes setup` should ask

When the user selects a local model, setup should ask: "Route auxiliary tasks (compression, vision) through this model too? [y/N]" and configure `auxiliary.*` accordingly.

Technical details

  • Auxiliary routing: `agent/auxiliary_client.py:754-776` (`_resolve_auto`)
  • Task-specific config: `auxiliary.{task}.provider` and `auxiliary.{task}.model` in config.yaml
  • Tasks that use auxiliary: compression, vision, web_extract, session_search, skills_hub, mcp, flush_memories
  • The user CAN configure this manually, but setup doesn't guide them to it

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions