feat(provider): HERMES_DEFAULT_PROVIDER env var beats config (closes #70) #91

Merged
PowerCreek merged 1 commit into main from hermes-default-provider-70 on May 25, 2026
@PowerCreek

Summary

Closes #70 (option C from the issue body).

The provider auto-detection regression in #70 falls into two cases:

  1. Model auto-detect: detect_provider_for_model doesn't see devagentic-local's fetched catalog because it's populated lazily by DevagenticLocalProfile.fetch_models(), not in the static _PROVIDER_MODELS table. So bare mistral-large routes to OpenRouter (Step 2 slug-match) instead of devagentic-local. Option B (populate static catalog from fetched models on boot) is the clean fix but depends on devagentic#216 being deployed canonically, so it is out of scope for this PR.

  2. Container deployments: env should win over the persisted config so a baked image can be re-pinned without rewriting config.yaml. This PR fixes case (2).

What changed

New env var HERMES_DEFAULT_PROVIDER slots between the explicit --provider CLI flag and the persisted model.provider config field. Operators can pin the provider via env without rewriting the baked-in config.

Priority on resolve_requested_provider is now:

  1. explicit requested argument (CLI --provider flag)
  2. HERMES_DEFAULT_PROVIDER env (new — beats config)
  3. model.provider in persisted config.yaml
  4. HERMES_INFERENCE_PROVIDER env (legacy — stays below config so a stale shell export doesn't shadow the user's last-saved interactive selection)
  5. "auto" floor
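
For reviewers who want the ordering in executable form, here is a minimal sketch of the priority chain above. It is not the code in this PR: the function name, its signature, the shape of the config mapping, and the strip/lowercase normalization are all assumptions made for illustration. Only the two env var names, the `model.provider` field, and the `"auto"` floor come from the description.

```python
import os
from typing import Mapping, Optional


def resolve_requested_provider_sketch(
    requested: Optional[str] = None,
    config: Optional[Mapping[str, object]] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Illustrative stand-in for resolve_requested_provider (signature assumed)."""
    env = os.environ if env is None else env

    # Assumed config shape: {"model": {"provider": "<name>"}}
    model_cfg = (config or {}).get("model") or {}
    config_provider = model_cfg.get("provider") if isinstance(model_cfg, Mapping) else None

    candidates = (
        requested,                             # 1. explicit --provider flag
        env.get("HERMES_DEFAULT_PROVIDER"),    # 2. new env var, beats config
        config_provider,                       # 3. persisted model.provider
        env.get("HERMES_INFERENCE_PROVIDER"),  # 4. legacy env var, below config
    )
    for value in candidates:
        # Empty or whitespace-only values are skipped (an assumption of this sketch).
        if isinstance(value, str) and value.strip():
            return value.strip().lower()
    return "auto"                              # 5. floor


if __name__ == "__main__":
    cfg = {"model": {"provider": "openrouter"}}
    pinned = {"HERMES_DEFAULT_PROVIDER": "devagentic-local"}
    print(resolve_requested_provider_sketch(None, cfg, pinned))
```

With the config pinned to `openrouter` and `HERMES_DEFAULT_PROVIDER=devagentic-local` in the environment, the sketch returns `devagentic-local`; with only `HERMES_INFERENCE_PROVIDER` set, the config value still wins.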

The existing HERMES_INFERENCE_PROVIDER semantics are deliberately unchanged — we add the new env above it. Users on the legacy path keep their behavior.

Deployment recipe

ENV HERMES_DEFAULT_PROVIDER=devagentic-local
ENV DEVAGENTIC_BASE_URL=http://devbox:6071/v1
ENV DEVAGENTIC_API_KEY=...

Container boot now pins provider=devagentic-local at session start; subsequent /model mistral-large resolves via the current-provider short-circuit in detect_provider_for_model.
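
To show why pinning the provider at boot is enough for bare model names, here is a rough sketch of a current-provider short-circuit. The real detect_provider_for_model is not reproduced here: the function name, the two-step structure, the table contents, and the `"auto"` fallback are hypothetical, chosen only to mirror the behavior described above.

```python
from typing import Dict, Optional, Set

# Hypothetical stand-in for the static _PROVIDER_MODELS table; the entry is
# invented for illustration. devagentic-local is absent because its catalog
# is fetched lazily and never lands in the static table.
STATIC_PROVIDER_MODELS: Dict[str, Set[str]] = {
    "openrouter": {"mistral-large"},
}


def detect_provider_sketch(model: str, current_provider: Optional[str]) -> str:
    """Illustrative only; not the real detect_provider_for_model."""
    # Short-circuit: a provider pinned for the session wins outright.
    if current_provider and current_provider != "auto":
        return current_provider

    # Otherwise fall back to matching the bare model name against the static table.
    for provider, models in STATIC_PROVIDER_MODELS.items():
        if model in models:
            return provider
    return "auto"


if __name__ == "__main__":
    print(detect_provider_sketch("mistral-large", "devagentic-local"))
    print(detect_provider_sketch("mistral-large", None))
```

Under these assumptions, a session pinned to devagentic-local keeps `mistral-large` on that provider, while an unpinned session falls through to the static match and lands on OpenRouter, which is the regression reported in #70.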

Test plan

  • 12 new priority-order tests in tests/hermes_cli/test_resolve_requested_provider.py — pass
  • 122 existing resolver tests in test_runtime_provider_resolution.py + test_custom_provider_model_switch.py — pass (no regression)
  • After merge + container rebuild + ENV set: confirm hermes selects devagentic-local at boot without --provider flag

Composition

  • Option B (catalog populate on boot) deferred until devagentic#216 (/v1/models bearer-only fix) is canonically deployed. Will file as follow-up once that endpoint stabilizes.
  • v0.16.0 cascade unaffected — this is a pure provider-resolution change, not a tool-surface change.


The provider auto-detection regression in #70 falls into two
cases — model auto-detect (devagentic-local catalog fetched lazily,
not visible to detect_provider_for_model until a session warm-up
has happened) and container deployments (env should win over the
persisted config so a baked image can be re-pinned without rewriting
config.yaml).

This PR addresses the container-deployment case (option C in the
issue body). New env var ``HERMES_DEFAULT_PROVIDER`` slots between
the explicit ``--provider`` CLI flag and the persisted
``model.provider`` config field — the deployment-priority env knob
operators have been asking for.

Priority order on ``resolve_requested_provider`` is now:
  1. explicit ``requested`` argument (CLI ``--provider``)
  2. ``HERMES_DEFAULT_PROVIDER`` env (new — beats config)
  3. ``model.provider`` in persisted config.yaml
  4. ``HERMES_INFERENCE_PROVIDER`` env (legacy — stays below config
     so a stale shell export does not shadow the user's last-saved
     interactive selection)
  5. ``"auto"`` floor

Existing HERMES_INFERENCE_PROVIDER semantics unchanged — only adds
the new var above it. 122 existing resolver tests pass alongside
12 new priority-order tests.

The model auto-detect fix (option B: populate static catalog from
fetched models on boot) is a larger ship that depends on
devagentic#216 being deployed canonically. It will be filed as a
follow-up once the devagentic-side fetch_models() endpoint stabilizes.
@PowerCreek PowerCreek merged commit ef58b0f into main May 25, 2026
@PowerCreek PowerCreek deleted the hermes-default-provider-70 branch May 25, 2026 01:33
PowerCreek added a commit that referenced this pull request May 25, 2026
…env vars (#95)

Surfaces typos in either env var at ``hermes doctor`` time instead
of letting the worker silently fall through to ``auto`` mid-session.
Closes a follow-up gap from #70 — the env var I shipped in PR #91
had no boot-time validation, so a typo like ``devagentic-locol``
would silently fail downstream in resolve_requested_provider's
auto-detect fallthrough.

## Behavior

- **Silent** when neither env var is set: most operators don't pin
  the provider via env, so the probe adds no row to each run
  (silent-when-irrelevant pattern from the #88/#53 doctor probes).
- **check_ok** when the env var matches a known provider name —
  surfaces deployment pins so operators can confirm they're live.
- **check_warn** with a sample of valid names when the env var
  doesn't match. Diagnosed-not-blocked: a custom provider name set
  outside the registry's view is still legal, just noisy.

Known-provider set is the union of:
- ``providers.list_providers()`` (plugin-registered, e.g.
  devagentic-local)
- ``hermes_cli.auth.PROVIDER_REGISTRY`` (built-in)
- standard aliases (``openrouter`` / ``custom`` / ``auto`` /
  ``anthropic`` / ``openai``)

Both env vars are checked independently — a partial mismatch (one
valid, one typo) surfaces precisely.
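
The behavior above can be sketched roughly as follows. This is not the probe shipped in #95: the function name, the row format, and the size of the sample are invented for illustration, and it assumes the two env vars in question are HERMES_DEFAULT_PROVIDER and HERMES_INFERENCE_PROVIDER. The plugin and built-in provider names are passed in as plain iterables rather than read from ``providers.list_providers()`` or ``PROVIDER_REGISTRY``, whose return shapes are not shown in this PR.

```python
from typing import Iterable, List, Mapping, Tuple

# Assumed pair of env vars covered by the probe.
ENV_VARS = ("HERMES_DEFAULT_PROVIDER", "HERMES_INFERENCE_PROVIDER")
STANDARD_ALIASES = {"openrouter", "custom", "auto", "anthropic", "openai"}


def probe_provider_env(
    env: Mapping[str, str],
    plugin_providers: Iterable[str],
    builtin_providers: Iterable[str],
) -> List[Tuple[str, str, str]]:
    """Return (status, env var, message) rows; an empty list means silent."""
    # Known-provider set: plugin-registered + built-in + standard aliases.
    known = (
        {name.lower() for name in plugin_providers}
        | {name.lower() for name in builtin_providers}
        | STANDARD_ALIASES
    )
    rows: List[Tuple[str, str, str]] = []
    for var in ENV_VARS:  # each var is checked independently
        value = (env.get(var) or "").strip()
        if not value:
            continue  # silent when unset or empty
        if value.lower() in known:  # case-insensitive match
            rows.append(("ok", var, f"{var}={value} matches a known provider"))
        else:
            # Warn but do not block: a custom name may still be legal.
            sample = ", ".join(sorted(known)[:5])
            rows.append(("warn", var, f"{var}={value} is not a known provider (e.g. {sample})"))
    return rows


if __name__ == "__main__":
    env = {"HERMES_DEFAULT_PROVIDER": "devagentic-locol"}
    for row in probe_provider_env(env, ["devagentic-local"], []):
        print(row)
```

Returning rows instead of printing keeps the sketch easy to test; a real doctor probe would hand each row to whatever check_ok / check_warn helpers it uses.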

## Tests

- 8 new tests in tests/hermes_cli/test_doctor_provider_env_probe.py:
  silent-when-unset / silent-when-empty / known-name-ok /
  typo-warn-with-sample / both-env-vars-checked-independently /
  case-insens / plugin-registered-name-known /
  providers-import-failure-doesnt-crash.
- 96 total green across affected suites (probe + doctor + acp probe
  + resolver). No regression.

## Composition

- Pairs with PR #91 (HERMES_DEFAULT_PROVIDER env var) — closes the
  validation gap I left in that ship.
- Follows the silent-when-irrelevant probe pattern from PR #53
  (_check_acp_installation) and the diagnostic-loudness wave that
  established it.
Development

Successfully merging this pull request may close these issues.

Provider auto-detection regression: bare 'mistral-large' routes to openrouter instead of devagentic-local despite env wired