doctor: validate HERMES_DEFAULT_PROVIDER + HERMES_INFERENCE_PROVIDER env vars by PowerCreek · Pull Request #95 · TechDevGroup/hermes-agent

PowerCreek · 2026-05-25T03:51:55Z

Summary

Synergy / self-audit ship: closes a validation gap I left in PR #91 (the HERMES_DEFAULT_PROVIDER env var). A typo like devagentic-locol previously fell through to the auto floor silently — operators discovered the problem mid-session by debugging confusing downstream errors. This adds a boot-time check.

Behavior

Silent when neither env var is set (silent-when-irrelevant pattern from the #88 / #53 / #54 doctor probes).
check_ok when the env var matches a known provider name — surfaces deployment pins so operators can confirm they're live and correct.
check_warn with a sample of valid names when the env var doesn't match. Diagnosed-not-blocked: a custom provider name set outside the registry's view is still legal; just noisy.

Known-provider set is the union of:

providers.list_providers() — plugin-registered (devagentic-local, etc.)
hermes_cli.auth.PROVIDER_REGISTRY — built-in
Standard aliases (openrouter / custom / auto / anthropic / openai)

Both env vars are checked independently — a partial mismatch (one valid, one typo) surfaces precisely.
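
The three outcomes above (silent, check_ok, check_warn) can be sketched roughly as follows. This is a minimal illustration, not the merged probe: `PLUGIN_PROVIDERS` and `BUILTIN_REGISTRY` are hypothetical stand-ins for `providers.list_providers()` and `hermes_cli.auth.PROVIDER_REGISTRY`, and the `(status, message)` row shape is assumed in place of the real `check_ok` / `check_warn` helpers.

```python
import os

# Assumption: placeholders for providers.list_providers() and
# hermes_cli.auth.PROVIDER_REGISTRY; the real objects live inside hermes.
PLUGIN_PROVIDERS = ["devagentic-local"]
BUILTIN_REGISTRY = {"example-builtin": None}
STANDARD_ALIASES = {"openrouter", "custom", "auto", "anthropic", "openai"}
ENV_VARS = ("HERMES_DEFAULT_PROVIDER", "HERMES_INFERENCE_PROVIDER")


def known_provider_names():
    """Union of plugin-registered names, built-in names, and standard aliases."""
    names = set(STANDARD_ALIASES)
    names.update(name.lower() for name in PLUGIN_PROVIDERS)
    names.update(name.lower() for name in BUILTIN_REGISTRY)
    return names


def check_provider_env(environ=None, sample_size=5):
    """Return one (status, message) row per env var that is set and non-empty."""
    environ = os.environ if environ is None else environ
    known = known_provider_names()
    rows = []
    for var in ENV_VARS:
        value = (environ.get(var) or "").strip()
        if not value:
            continue  # silent when irrelevant: unset or empty emits no row
        if value.lower() in known:
            rows.append(("ok", f"{var}={value}"))
        else:
            sample = ", ".join(sorted(known)[:sample_size])
            rows.append(
                ("warn", f"{var}={value!r} is not a known provider (e.g. {sample})")
            )
    return rows
```

Because each variable is evaluated in its own loop iteration, one valid value and one typo yield one ok row and one warn row rather than a single combined verdict.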

Test plan

8 new tests in tests/hermes_cli/test_doctor_provider_env_probe.py: silent-when-unset / silent-when-empty / known-name-ok / typo-warn-with-sample / both-checked-independently / case-insensitive / plugin-registered-name-known / providers-import-failure-doesnt-crash
96 total green across affected suites (probe + doctor + acp probe + resolver). No regression.
After merge: HERMES_DEFAULT_PROVIDER=typo hermes doctor shows the warn row

Composition

Pairs with PR #91 (HERMES_DEFAULT_PROVIDER env var beats config, closes #70): closes the validation gap I left in that ship.
Follows the silent-when-irrelevant probe pattern from PR #53 (_check_acp_installation) and the diagnostic-loudness wave that established it.

…env vars

Surfaces typos in either env var at ``hermes doctor`` time instead of letting the worker silently fall through to ``auto`` mid-session. Closes a follow-up gap from #70: the env var I shipped in PR #91 had no boot-time validation, so a typo like ``devagentic-locol`` would silently fail downstream in resolve_requested_provider's auto-detect fallthrough.

## Behavior

- **Silent** when neither env var is set: most operators don't pin the provider via env, so no row each run (silent-when-irrelevant pattern from the #88/#53 doctor probes).
- **check_ok** when the env var matches a known provider name; surfaces deployment pins so operators can confirm they're live.
- **check_warn** with a sample of valid names when the env var doesn't match. Diagnosed-not-blocked: a custom provider name set outside the registry's view is still legal, just noisy.

Known-provider set is the union of:

- ``providers.list_providers()`` (plugin-registered, e.g. devagentic-local)
- ``hermes_cli.auth.PROVIDER_REGISTRY`` (built-in)
- standard aliases (``openrouter`` / ``custom`` / ``auto`` / ``anthropic`` / ``openai``)

Both env vars are checked independently: a partial mismatch (one valid, one typo) surfaces precisely.

## Tests

- 8 new tests in tests/hermes_cli/test_doctor_provider_env_probe.py: silent-when-unset / silent-when-empty / known-name-ok / typo-warn-with-sample / both-env-vars-checked-independently / case-insens / plugin-registered-name-known / providers-import-failure-doesnt-crash.
- 96 total green across affected suites (probe + doctor + acp probe + resolver). No regression.

## Composition

- Pairs with PR #91 (HERMES_DEFAULT_PROVIDER env var): closes the validation gap I left in that ship.
- Follows the silent-when-irrelevant probe pattern from PR #53 (_check_acp_installation) and the diagnostic-loudness wave that established it.

…vances #89 Direction A) (#98)

Operator-supplied intent override (Option A3 from #97). When ``HERMES_INTENT_OVERRIDE=code`` is set, the system prompt's ``stable`` layer narrows for tool-call-heavy traffic, which addresses #89's prompt-saturation symptom on mid-tier coding models.

## What narrows under code intent

| Block | Action | Why |
|---|---|---|
| SOUL.md | Skip | Largest single contributor; falls back to short DEFAULT_AGENT_IDENTITY floor |
| HERMES_AGENT_HELP_GUIDANCE | Skip | Off-topic for tool-call traffic |
| SKILLS_GUIDANCE | Skip | Per-tool block, off-topic for code |
| KANBAN_GUIDANCE | Skip | Worker-lifecycle, off-topic for code |
| SESSION_SEARCH_GUIDANCE | Skip | Off-topic for code |
| skills_prompt (the big one) | Skip | Biggest contributor when many skills loaded |
| MEMORY_GUIDANCE | **Keep** | Small + sometimes useful even for code |
| TOOL_USE_ENFORCEMENT_GUIDANCE | **Keep** | Critical for tool emission |
| Per-model operational guidance | **Keep** | Model-quality-specific |
| Env / platform hints | **Keep** | Execution-environment essentials |
| nous-subscription + computer-use + alibaba | **Keep** | Operational invariants |
| ``context`` + ``volatile`` layers | **Untouched** | Out of scope per #97 |

Other intents (``confer`` / ``planning`` / ``exploration`` / ``refinement`` / ``generic``) are recognized as valid but pass through without narrowing in v1 (keeps the door open for per-intent shape later).

## Intent vocabulary

Matches devagentic#240's ``intent_classifier`` 6-key enum exactly, so the same operator-side classifier that's wired into devagentic's R5 dispatch hook can also drive hermes-side prompt narrowing without a second vocabulary.

## Doctor probe

New ``_check_intent_override_env`` probe surfaces the active override at ``hermes doctor`` time: silent when unset, check_ok when valid (with a narrowing-active note for ``code``), check_warn with the full valid-keys list when typo'd. Mirrors the silent-when-irrelevant pattern from PR #95 / #96.

## Tests

- 22 new prompt-narrowing tests in ``tests/agent/test_system_prompt_intent_override.py``: resolver enum + normalization (5), per-section drops under code (7), pass-through for non-code intents (5), typo falls back (1), byte-count regression (1), default-still-includes counter-case (1), case-insensitive (1), runtime-vs-doctor-config sanity (1).
- 6 new doctor-probe tests in ``tests/hermes_cli/test_doctor_intent_override_probe.py``: silent-when-unset / silent-when-empty / code-ok-with-narrowing-note / non-code-valid-pass-through / typo-warn-with-valid-sample / case-insensitive.
- 258 total green across affected suites (system-prompt + prompt-builder + restore + doctor + provider-env + tools-subset). No regression in the existing prompt-shape pins.

## Composition note

Option A1 (port classifier) + A2 (devagentic GraphQL surface) are deferred per the #97 sequencing: A3 unblocks deployment-specific narrowing immediately; A1/A2 only matter when dynamic per-turn classification is needed on the hermes side. The classifier output on the devagentic side (NousResearch#240) drives R5 dispatch decisions there.
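
The code-intent narrowing described in this commit can be pictured with a small sketch. It assumes the stable layer is modelled as an ordered list of block names, which is a simplification for illustration; the function names are hypothetical, and the SOUL.md fallback to DEFAULT_AGENT_IDENTITY is left out.

```python
import os

# The six intent keys named in the commit message.
VALID_INTENTS = ("code", "confer", "planning", "exploration", "refinement", "generic")

# Blocks the commit message marks as "Skip" under code intent.
SKIPPED_UNDER_CODE = {
    "SOUL.md",
    "HERMES_AGENT_HELP_GUIDANCE",
    "SKILLS_GUIDANCE",
    "KANBAN_GUIDANCE",
    "SESSION_SEARCH_GUIDANCE",
    "skills_prompt",
}


def resolve_intent_override(environ=None):
    """Return the normalized intent, or None when unset or not in the enum."""
    environ = os.environ if environ is None else environ
    value = (environ.get("HERMES_INTENT_OVERRIDE") or "").strip().lower()
    return value if value in VALID_INTENTS else None


def narrow_stable_blocks(blocks, intent):
    """Drop the skipped blocks for code intent; pass every other intent through."""
    if intent != "code":
        return list(blocks)
    return [name for name in blocks if name not in SKIPPED_UNDER_CODE]
```

A typo resolves to `None`, so it takes the same pass-through path as an unset variable, which matches the "typo falls back" test case listed above.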

#115) (#116)

Companion to devagentic#315 (initiative preamble). When operator sets ``HERMES_TOOL_USE_ENFORCEMENT=required``, the chat_completions transport injects ``tool_choice: "required"`` on every dispatch where tools are attached. This is the model-layer enforcement that closes the gap devagentic#315's soft-signal preamble leaves open.

## Behavior

- Unset / empty / unknown value → default behavior unchanged (no ``tool_choice`` injected by hermes)
- ``HERMES_TOOL_USE_ENFORCEMENT=required`` + tools attached → ``tool_choice: "required"`` set on the API kwargs
- Tools NOT attached → no injection (sending ``tool_choice=required`` with empty tools is a 400 on most providers)
- Caller-supplied ``tool_choice`` already on kwargs → no override (the dispatcher-tier signal wins; env is a session-tier default)

Per devagentic#203 §1.3, hermes owns model-call-shape decisions (per-call enforcement). Devagentic's models.json ``default_tool_choice`` is the dispatcher-tier default; this env is the session-tier override.

## Where it fires

Both build_kwargs paths in ``chat_completions.py``:

- Legacy fallback path (unregistered providers)
- Provider-profile path (known providers via providers/ registry)

Shared helper ``_maybe_inject_required_tool_choice(api_kwargs, tools)`` keeps the two sites in sync.

## Doctor probe

New ``_check_tool_use_enforcement_env`` surfaces the active setting: silent when unset, ``check_ok`` on ``required``, ``check_warn`` with valid-values hint on typos. Mirrors the silent-when-irrelevant pattern from #95 / #96 / persona-deferred.

## Tests

- 18 new tests in tests/agent/test_tool_use_enforcement.py: resolver returns None/required/case-insensitive/unknown (8 parametrized), injection happy path (1), no-inject-when-unset (1), no-inject-when-no-tools (1 covering both None and empty list), does-not-clobber-existing-tool_choice (1), no-inject-on-unknown (1), doctor silent-when-unset (1), doctor check_ok on required (1), doctor check_warn on unknown (1).
- 128 total green across affected suites (new + doctor + provider/intent/persona/tools-subset probes). No regression.

## Sequencing per #115 body

The issue says "Land after devagentic#315 Phase 1 has deployed + been observed. If the preamble alone closes the reliability gap to operator satisfaction, this issue may not need to ship."

This PR ships the env-knob in opt-in OFF-by-default mode, so:

- Operators can enable it the moment they observe NousResearch#315's preamble is insufficient (no further hermes-side dev cycle needed)
- Default behavior unchanged → zero risk to non-client-tier sessions
- Doctor probe surfaces the active state so operators can confirm enablement at boot

Saves the round-trip of waiting + then dev'ing once the signal arrives.
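
The four injection rules in this commit reduce to three guards in front of a single assignment. The sketch below illustrates that ordering only; it is not the body of the shipped ``_maybe_inject_required_tool_choice`` helper. The `environ` parameter and the boolean return value are assumptions added so the example can run without touching the process environment.

```python
import os


def resolve_tool_use_enforcement(environ=None):
    """Return "required" when the env var says so (case-insensitive), else None."""
    environ = os.environ if environ is None else environ
    value = (environ.get("HERMES_TOOL_USE_ENFORCEMENT") or "").strip().lower()
    return "required" if value == "required" else None


def maybe_inject_required_tool_choice(api_kwargs, tools, environ=None):
    """Mutate api_kwargs in place; return True only when tool_choice was injected."""
    if resolve_tool_use_enforcement(environ) != "required":
        return False  # unset / empty / unknown value: default behavior unchanged
    if not tools:
        return False  # None or empty list: nothing to require
    if "tool_choice" in api_kwargs:
        return False  # caller-supplied tool_choice wins over the env default
    api_kwargs["tool_choice"] = "required"
    return True
```

Checking the caller-supplied key last keeps the env var a default rather than an override, matching the session-tier versus dispatcher-tier split described above.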

After v0.18.4's tool_call recovery (#124) landed, the next-level bug surfaced in sandbox field-test: model calls a tool name hermes' worker didn't register, the invalid_tool_call retry path fires, but its verbose-only print is invisible in default runs. Combined with model hallucination ("the file has been created..." narration on the NEXT turn), the mismatch becomes invisible: operators see model narration, not the underlying tool-name mismatch.

## Fix

Upgrade conversation_loop.py:3219's verbose-only print to:

1. ``logger.warning`` with the invented name + count + first 10 registered names + model + provider for cross-system log correlation
2. ``agent._emit_status`` surfacing the mismatch in the user-facing stream

Operator immediately sees:

- WHICH name the model invented
- HOW MANY tools the worker has registered
- WHICH tools (sample) ARE registered
- Across which retry of 3

No behavior change: existing invalid_tool_call retry semantics unchanged. Pure observability boost.

## Tests

- 3 new source-level tests in tests/agent/test_loud_invalid_tool_call.py: patch-landed, emit_status template includes name + count, warning includes model + provider for correlation.
- 20 total green across affected suites, no regression.

## Composition

Same observability family as the #95 / #96 doctor probes. Helps operators distinguish "hermes ate the tool_call" from "sandbox toolset doesn't expose what the model is calling".
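
As a rough picture of the two-channel report this commit describes, the sketch below builds one message and sends it to both a logger warning and a status callback. The function names, the message wording, and the `emit_status` callable are hypothetical; the real change lives in conversation_loop.py and uses ``agent._emit_status``.

```python
import logging

logger = logging.getLogger("hermes.sketch")  # hypothetical logger name


def format_invalid_tool_call(invented_name, registered_names, attempt, max_attempts=3):
    """Build the operator-facing message: name, retry, count, and a 10-name sample."""
    sample = ", ".join(list(registered_names)[:10])
    return (
        f"Model called unregistered tool {invented_name!r} "
        f"(retry {attempt}/{max_attempts}); "
        f"{len(registered_names)} tools registered, e.g. {sample}"
    )


def report_invalid_tool_call(
    invented_name, registered_names, attempt, model, provider, emit_status
):
    """Log a warning with model + provider, then surface the same text to the user."""
    message = format_invalid_tool_call(invented_name, registered_names, attempt)
    logger.warning("%s [model=%s provider=%s]", message, model, provider)
    emit_status(message)
    return message
```

Calling `report_invalid_tool_call("write_fil", ["read_file", "write_file"], 1, "some-model", "some-provider", print)` would print the mismatch to the user-facing stream while the log line carries the model and provider for correlation.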

PowerCreek merged commit 9a9c25d into main May 25, 2026

PowerCreek deleted the doctor-provider-env-probe branch May 25, 2026 03:52

This was referenced May 26, 2026

provider: 'Unknown provider devagentic-local' despite bundled plugin on disk #103

Closed

fix(provider): accept plugin-registered profile names (closes #103) #104

Merged

PowerCreek mentioned this pull request May 27, 2026

diagnose: model calls write_file but tool not registered — narration covers up the mismatch #127

Closed

PowerCreek mentioned this pull request May 27, 2026

diagnose: tool_calls dispatched but side-effect doesn't fire — need entry/return WARNs in tool_executor #130

Closed

PowerCreek merged 1 commit into main from doctor-provider-env-probe
