fix(setup+auxiliary): custom endpoint loses config after setup, and auxiliary ignores active provider #3442

Closed
Mibayy wants to merge 3 commits into NousResearch:main from Mibayy:fix/auxiliary-3410-clean

Conversation

@Mibayy Mibayy commented Mar 27, 2026

Contributor

Summary

Fixes #3410.

Two separate bugs that both affect custom OpenAI-compatible endpoint users.


Fix 1 — auxiliary_client: mirror active runtime in _resolve_custom_runtime

Root cause

_resolve_custom_runtime() hardcoded requested="custom" when calling
resolve_runtime_provider(). The problem: _get_named_custom_provider()
explicitly returns None when requested_norm == "custom" — it only matches
named entries like custom:local. So named custom providers saved in
config.yaml were silently bypassed.

The auxiliary path then fell through to env-var resolution
(OPENAI_BASE_URL / OPENAI_API_KEY), which can carry different credentials
or nothing at all — even though the main agent was already running successfully
against the named provider.

Fix

resolve_requested_provider()          # what provider is the main agent using?
  → "custom:local"                    # reads config.yaml / HERMES_INFERENCE_PROVIDER

resolve_runtime_provider(requested="custom:local")   # same path as main agent
  → { provider: "custom", source: "custom_provider:local", api_key: "...", ... }

If the active provider is not a custom endpoint (e.g. nous, openrouter),
fall back to requested="custom" so env-driven OPENAI_BASE_URL setups keep working.

Also adds DEBUG-level logging of the credential source selected for auxiliary
resolution.
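
The resolution order described above can be sketched as follows. This is a minimal, self-contained illustration and not the project's actual code: `NAMED_CUSTOM_PROVIDERS`, `get_named_custom_provider`, `resolve_runtime`, and `resolve_auxiliary_runtime` are hypothetical stand-ins, and the URL and key values are made up.

```python
from typing import Optional

# Hypothetical stand-in for named custom providers saved in config.yaml.
NAMED_CUSTOM_PROVIDERS = {
    "local": {"base_url": "http://localhost:8080/v1", "api_key": "local-key"},
}


def get_named_custom_provider(requested: str) -> Optional[dict]:
    """Match only named entries such as 'custom:local'; bare 'custom' yields None."""
    requested_norm = requested.strip().lower()
    if requested_norm == "custom" or not requested_norm.startswith("custom:"):
        return None
    name = requested_norm.split(":", 1)[1]
    entry = NAMED_CUSTOM_PROVIDERS.get(name)
    if entry is None:
        return None
    return {"provider": "custom", "source": f"custom_provider:{name}", **entry}


def resolve_runtime(requested: str, env: dict) -> dict:
    """Prefer a named provider; otherwise fall back to env-var credentials."""
    named = get_named_custom_provider(requested)
    if named is not None:
        return named
    return {
        "provider": "custom",
        "source": "env",
        "base_url": env.get("OPENAI_BASE_URL"),
        "api_key": env.get("OPENAI_API_KEY"),
    }


def resolve_auxiliary_runtime(active_provider: str, env: dict) -> dict:
    """Mirror the active provider when it is custom; otherwise request bare 'custom'."""
    active_norm = active_provider.strip().lower()
    is_custom = active_norm == "custom" or active_norm.startswith("custom:")
    requested = active_norm if is_custom else "custom"
    return resolve_runtime(requested, env)
```

With the old behaviour (always requesting bare `"custom"`), the named entry is never consulted; passing the active provider name through lets the auxiliary path pick up the same credentials as the main agent.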


Fix 2 — setup: reload config after _model_flow_custom to prevent clobber

Root cause

_model_flow_custom() writes model.provider and model.base_url to disk via
its own independent load_config() + save_config() cycle — it does not
mutate the outer config dict that run_setup_wizard() holds.

After _model_flow_custom returns, run_setup_wizard() calls save_config(config)
with the stale outer dict (no provider, no base_url), silently overwriting
the correct values.

Net effect: .env is right, config.yaml is wrong → the user must run
hermes model again to recover, and vision / auxiliary tooling also fails.

Fix

After _model_flow_custom() returns, reload the on-disk state back into the
outer config dict so the subsequent save_config(config) is a no-op:

config.clear()
config.update(load_config())
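
To make the clobber concrete, here is a small simulation under stated assumptions: an in-memory dict (`_DISK`) stands in for config.yaml, and `load_config`, `save_config`, `model_flow_custom`, and `run_wizard` are simplified hypothetical versions of the functions named above, not the real implementations.

```python
import copy

# Hypothetical in-memory stand-in for the contents of config.yaml.
_DISK = {"model": {}}


def load_config() -> dict:
    """Return a fresh copy of the on-disk state."""
    return copy.deepcopy(_DISK)


def save_config(config: dict) -> None:
    """Overwrite the on-disk state with the given dict."""
    _DISK.clear()
    _DISK.update(copy.deepcopy(config))


def model_flow_custom() -> None:
    """Write provider settings through an independent load/save cycle."""
    fresh = load_config()
    fresh["model"] = {"provider": "custom", "base_url": "http://localhost:8080/v1"}
    save_config(fresh)


def run_wizard(reload_after_flow: bool) -> dict:
    """Simulate the wizard; return the final on-disk state."""
    save_config({"model": {}})   # start from a config with no provider
    config = load_config()       # outer dict held by the wizard
    model_flow_custom()          # updates disk, but not `config`
    if reload_after_flow:
        config.clear()           # in-place re-sync with disk
        config.update(load_config())
    save_config(config)          # final save at the end of the wizard
    return load_config()
```

Without the reload, `run_wizard(False)` ends with an empty `model` section because the stale outer dict is written last. With the reload, `run_wizard(True)` keeps `provider` and `base_url`. Using `clear()` plus `update()` mutates the existing dict rather than rebinding the name, so any other code holding a reference to the same dict sees the refreshed values too.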

Changes

  • agent/auxiliary_client.py, _resolve_custom_runtime(): use resolve_requested_provider() first
  • hermes_cli/setup.py, run_setup_wizard(): reload config after custom flow
  • tests/agent/test_auxiliary_client.py: 5 new tests + isolation fix
  • tests/hermes_cli/test_setup_model_provider.py: regression test for config clobber

Tests

103 passed in test_auxiliary_client.py
1 passed  in test_setup_model_provider.py::test_custom_setup_wizard_persists_provider_in_config

Mibayy added 3 commits March 27, 2026 21:07
When the main agent uses a named custom provider (e.g. custom:local saved
in config.yaml), _resolve_custom_runtime() was bypassing it entirely.
The old code hardcoded requested="custom", which _get_named_custom_provider
explicitly rejects - it returns None when requested_norm == "custom".
The auxiliary path then fell back to env-var resolution (OPENAI_BASE_URL),
which can carry different or missing credentials.

Fix:
- Call resolve_requested_provider() first to get the same provider the
  main agent is using (reads config.yaml / HERMES_INFERENCE_PROVIDER).
- Resolve with that active provider name so named custom providers are
  honoured via _get_named_custom_provider.
- Fall back to requested="custom" only when the active provider is not
  itself a custom endpoint (e.g. nous, openrouter), preserving existing
  behaviour for env-driven OPENAI_BASE_URL setups.
- Add debug logging of which credential source was selected (no secrets).

Fixes: NousResearch#3410
…ropic_client

test_vision_auto_falls_back_to_custom_endpoint set OPENAI_BASE_URL to test
the custom-endpoint fallback path, but the vision auto chain tries anthropic
before custom. Without mocking build_anthropic_client, the test hit the real
SDK which fails with 'proxies=' on newer anthropic SDK versions.

Fix: patch build_anthropic_client to None so anthropic returns (None, None),
letting the chain reach the custom endpoint as intended. Also set
HERMES_INFERENCE_PROVIDER=custom so _resolve_custom_runtime() mirrors the
env-driven setup without falling through to unrelated provider paths.

_model_flow_custom() writes model.provider and model.base_url to disk
via its own fresh load_config() + save_config() cycle, but the outer
config dict passed in by run_setup_wizard() is never updated in-place.

The final save_config(config) at the bottom of run_setup_wizard() then
overwrites the file with the stale outer dict, silently stripping
model.provider and model.base_url from config.yaml.

Net effect: after 'hermes setup' with a custom endpoint, .env has the
right OPENAI_BASE_URL / OPENAI_API_KEY but config.yaml has no provider
or base_url — the user must run 'hermes model' again to fix it.

Fix: after _model_flow_custom() returns, reload the on-disk state back
into the outer config dict so the subsequent save_config(config) is
a no-op rather than a destructive overwrite.

Fixes NousResearch#3415.

Tests: test_custom_setup_wizard_persists_provider_in_config
Mibayy changed the title from "fix(auxiliary): mirror active runtime in _resolve_custom_runtime" to "fix(setup+auxiliary): custom endpoint loses config after setup, and auxiliary ignores active provider" on Mar 27, 2026
teknium1 commented Apr 8, 2026

Contributor

Both issues in this PR are now resolved on main:

  1. _resolve_custom_runtime() bypassed named custom providers: fixed in PR #5978 ("fix(auxiliary): resolve named custom providers and 'main' alias in auxiliary routing"). Named custom providers are now resolved directly in resolve_provider_client() before _try_custom_endpoint() is reached, so the hardcoded requested="custom" is no longer in the hot path.

  2. Setup wizard config clobber: already fixed on main (the _refreshed = load_config() re-sync after select_provider_and_model(), lines 918-925 in setup.py).

Thanks for the thorough analysis of both issues, @Mibayy!

@teknium1 teknium1 closed this Apr 8, 2026
Development

Successfully merging this pull request may close these issues.

Auxiliary auto/custom resolution can diverge from the active custom runtime
