feat(provider): HERMES_DEFAULT_PROVIDER env var beats config (closes #70) #91
Merged
The provider auto-detection regression in #70 falls into two cases: model auto-detect (devagentic-local catalog fetched lazily, not visible to detect_provider_for_model until a session warm-up has happened) and container deployments (env should win over the persisted config so a baked image can be re-pinned without rewriting config.yaml).

This PR addresses the container-deployment case (option C in the issue body). New env var ``HERMES_DEFAULT_PROVIDER`` slots between the explicit ``--provider`` CLI flag and the persisted ``model.provider`` config field. It is the deployment-priority env knob operators have been asking for.

Priority order on ``resolve_requested_provider`` is now:

1. explicit ``requested`` argument (CLI ``--provider``)
2. ``HERMES_DEFAULT_PROVIDER`` env (new, beats config)
3. ``model.provider`` in persisted config.yaml
4. ``HERMES_INFERENCE_PROVIDER`` env (legacy, stays below config so a stale shell export does not shadow the user's last-saved interactive selection)
5. ``"auto"`` floor

Existing HERMES_INFERENCE_PROVIDER semantics are unchanged; this PR only adds the new var above it. 122 existing resolver tests pass alongside 12 new priority-order tests.

The model auto-detect fix (option B: populate static catalog from fetched models on boot) is a larger ship that depends on devagentic#216 being deployed canonically. It will be filed as a follow-up once the devagentic-side fetch_models() endpoint stabilizes.
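The priority order listed in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the actual ``resolve_requested_provider`` implementation: the function name ``resolve_requested_provider_sketch``, its signature, the ``config`` mapping shape, and the choice to treat empty or whitespace-only values as unset are all assumptions made for the example.

```python
import os
from typing import Mapping, Optional


def resolve_requested_provider_sketch(
    requested: Optional[str] = None,
    config: Optional[Mapping[str, object]] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Hypothetical stand-in that walks the five priority levels in order."""
    env = os.environ if env is None else env

    # Assumed config shape: {"model": {"provider": "<name>"}}
    model_cfg = (config or {}).get("model")
    config_provider = (
        model_cfg.get("provider") if isinstance(model_cfg, Mapping) else None
    )

    candidates = (
        requested,                             # 1. explicit --provider flag
        env.get("HERMES_DEFAULT_PROVIDER"),    # 2. new env var, beats config
        config_provider,                       # 3. persisted model.provider
        env.get("HERMES_INFERENCE_PROVIDER"),  # 4. legacy env var, below config
    )
    for value in candidates:
        # Assumption: blank values count as "not set".
        if isinstance(value, str) and value.strip():
            return value.strip()
    return "auto"                              # 5. floor


baked = {"model": {"provider": "openrouter"}}

# A stale legacy export does not shadow the saved config.
print(resolve_requested_provider_sketch(
    config=baked, env={"HERMES_INFERENCE_PROVIDER": "anthropic"}))

# The new env var re-pins a baked image without touching config.yaml.
print(resolve_requested_provider_sketch(
    config=baked, env={"HERMES_DEFAULT_PROVIDER": "devagentic-local"}))
```

Passing ``env`` explicitly instead of reading ``os.environ`` inside the loop keeps the sketch easy to test; whether the real resolver is structured that way is not stated in this PR.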
PowerCreek added a commit that referenced this pull request May 25, 2026
…env vars (#95)

Surfaces typos in either env var at ``hermes doctor`` time instead of letting the worker silently fall through to ``auto`` mid-session.

Closes a follow-up gap from #70: the env var I shipped in PR #91 had no boot-time validation, so a typo like ``devagentic-locol`` would silently fail downstream in resolve_requested_provider's auto-detect fallthrough.

## Behavior

- **Silent** when neither env var is set. Most operators don't pin the provider via env, so no row is emitted each run (silent-when-irrelevant pattern from the #88/#53 doctor probes).
- **check_ok** when the env var matches a known provider name. This surfaces deployment pins so operators can confirm they're live.
- **check_warn** with a sample of valid names when the env var doesn't match. Diagnosed-not-blocked: a custom provider name set outside the registry's view is still legal, just noisy.

Known-provider set is the union of:

- ``providers.list_providers()`` (plugin-registered, e.g. devagentic-local)
- ``hermes_cli.auth.PROVIDER_REGISTRY`` (built-in)
- standard aliases (``openrouter`` / ``custom`` / ``auto`` / ``anthropic`` / ``openai``)

Both env vars are checked independently, so a partial mismatch (one valid, one typo) surfaces precisely.

## Tests

- 8 new tests in tests/hermes_cli/test_doctor_provider_env_probe.py: silent-when-unset / silent-when-empty / known-name-ok / typo-warn-with-sample / both-env-vars-checked-independently / case-insens / plugin-registered-name-known / providers-import-failure-doesnt-crash.
- 96 total green across affected suites (probe + doctor + acp probe + resolver). No regression.

## Composition

- Pairs with PR #91 (HERMES_DEFAULT_PROVIDER env var) and closes the validation gap I left in that ship.
- Follows the silent-when-irrelevant probe pattern from PR #53 (_check_acp_installation) and the diagnostic-loudness wave that established it.
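The probe behavior described in this commit message can be illustrated with a small sketch. It is not the code from #95: the function name ``probe_provider_env``, the ``(status, message)`` row shape, and the ``sample_size`` parameter are hypothetical, and the sketch assumes the two env vars in question are ``HERMES_DEFAULT_PROVIDER`` and ``HERMES_INFERENCE_PROVIDER``. The known-provider set is passed in directly rather than built from ``providers.list_providers()`` and ``PROVIDER_REGISTRY``.

```python
from typing import Iterable, List, Mapping, Tuple

# Assumption: these are the two env vars the doctor probe inspects.
ENV_VARS = ("HERMES_DEFAULT_PROVIDER", "HERMES_INFERENCE_PROVIDER")


def probe_provider_env(
    env: Mapping[str, str],
    known: Iterable[str],
    sample_size: int = 5,
) -> List[Tuple[str, str]]:
    """Return one (status, message) row per env var that is set."""
    known_lower = sorted({name.lower() for name in known})
    rows: List[Tuple[str, str]] = []
    for var in ENV_VARS:
        value = (env.get(var) or "").strip()
        if not value:
            continue  # silent when unset or empty
        if value.lower() in known_lower:  # case-insensitive match
            rows.append(("ok", f"{var}={value}"))
        else:
            # Warn but do not block: a custom name may still be legal.
            sample = ", ".join(known_lower[:sample_size])
            rows.append(
                ("warn", f"{var}={value} is not a known provider (e.g. {sample})")
            )
    return rows


known = {"devagentic-local", "openrouter", "custom", "auto", "anthropic", "openai"}
for status, message in probe_provider_env(
    {"HERMES_DEFAULT_PROVIDER": "devagentic-locol"}, known
):
    print(status, message)
```

Each variable is evaluated in its own loop iteration, which is what lets a valid value in one and a typo in the other produce one ``ok`` row and one ``warn`` row.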
Summary
Closes #70 (option C from the issue body).
The provider auto-detection regression in #70 falls into two cases:
1. Model auto-detect: ``detect_provider_for_model`` doesn't see devagentic-local's fetched catalog because it's populated lazily by ``DevagenticLocalProfile.fetch_models()``, not in the static ``_PROVIDER_MODELS`` table. So bare ``mistral-large`` routes to OpenRouter (Step 2 slug-match) instead of devagentic-local. Option B (populate static catalog from fetched models on boot) is the clean fix but depends on devagentic#216 being deployed canonically, so it is out of scope for this PR.
2. Container deployments: when env should win over the persisted config so a baked image can be re-pinned without rewriting ``config.yaml``.

This PR fixes case (2).

What changed
New env var ``HERMES_DEFAULT_PROVIDER`` slots between the explicit ``--provider`` CLI flag and the persisted ``model.provider`` config field. Operators can pin the provider via env without rewriting the baked-in config.

Priority on ``resolve_requested_provider`` is now:

1. ``requested`` argument (CLI ``--provider`` flag)
2. ``HERMES_DEFAULT_PROVIDER`` env (new, beats config)
3. ``model.provider`` in persisted ``config.yaml``
4. ``HERMES_INFERENCE_PROVIDER`` env (legacy, stays below config so a stale shell export doesn't shadow the user's last-saved interactive selection)
5. ``"auto"`` floor

The existing ``HERMES_INFERENCE_PROVIDER`` semantics are deliberately unchanged: the new env var is added above it. Users on the legacy path keep their behavior.

Deployment recipe
Container boot now pins provider=devagentic-local at session start; subsequent ``/model mistral-large`` resolves via the current-provider short-circuit in ``detect_provider_for_model``.

Test plan
- ``tests/hermes_cli/test_resolve_requested_provider.py``: pass
- ``test_runtime_provider_resolution.py`` + ``test_custom_provider_model_switch.py``: pass (no regression)
- ``hermes`` selects ``devagentic-local`` at boot without ``--provider`` flag

Composition
``/v1/models`` bearer-only fix) is canonically deployed. Will file as follow-up once that endpoint stabilizes.