feat(env): HERMES_TOOLS_SUBSET — operator-side tool surface narrowing (closes #74)#75
Merged
This was referenced May 24, 2026
PowerCreek added a commit that referenced this pull request on May 24, 2026
…loses #86) (#87)

Post-#82/#83/#85 the autowire surfaces ~36 MCP tools from hermes-internal, pushing fresh workers to 52 total tools and into the tool-paralysis ceiling (text responses with affirmation pattern, no tool_call emission for verticals that should be one-shot).

#75 (HERMES_TOOLS_SUBSET) was supposed to let operators narrow the surface per worker, but it had three gaps for MCP tools:

1. Subset filtering only ran in agent_init.py:838 (built-in tool path)
2. cli.py:9790 (/reload-mcp + auto-reload on config change) re-assigns agent.tools without re-applying the filter: a regression risk where any post-init MCP server reload nukes the subset
3. Even when filtering at agent.tools, the registry still carried the unwanted MCP tools, costing schema-conversion + collision-check work per tool per boot

Fix: apply the allow-list at MCP tool registration in tools/mcp_tool.py::_register_server_tools. Excluded tools never enter the registry, so both initial discovery AND /reload-mcp paths honor the subset uniformly + the registry stays clean.

Three edits:

1. hermes_cli/tool_subset.py (new): shared helpers `get_subset_allow()` + `is_tool_allowed()`. Single source of truth for env parsing + allow-list semantics so the two call sites (agent_init + mcp_tool) can't drift in casing/whitespace/empty-vs-missing handling.
2. tools/mcp_tool.py::_register_server_tools: read the subset once per server registration (O(1) check per tool, not O(N) env parse), apply inside both the per-tool loop AND the utility-tools loop (list_resources/read_resource/list_prompts/get_prompt).
3. agent/agent_init.py: refactored the inline #75 filter to use the shared helper. Behavior unchanged; dedupes parsing logic.

Subset compares against the **prefixed** MCP name (`mcp_<server>_<tool>`): exact match, predictable behavior. Fuzzy / unprefixed matching is a separate feature request.

Tests (tests/test_mcp_subset_filter.py, 14 cases, all pass):

- get_subset_allow: unset/empty/whitespace-only/single/multi/padded/MCP-prefixed parse cases
- is_tool_allowed: None passes through, exact match, exclusion, no substring matching, MCP prefixed vs unprefixed (contract test)
- Source-level: _register_server_tools imports the shared helper + calls it inside main loop
- Source-level: same call inside the utility-tools loop (regression catcher for "fix the main loop, forget utility tools")
- Source-level: agent_init.py uses shared helper, no stale inline parser

37 total green (14 new + 20 from #83 + 3 from #85).

Final layer of the post-#67 cascade. After this lands + container rebuild, workers can be configured with HERMES_TOOLS_SUBSET to a 5-tool surface that includes their specific MCP needs (e.g. "doc_view,mcp_hermes-internal_grafted_context_fetch,..."), out of tool-paralysis range.
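The shared-helper contract described in this commit can be sketched as follows. This is a minimal illustration, not the code in hermes_cli/tool_subset.py: only the names `get_subset_allow` and `is_tool_allowed` come from the commit message, while the `environ` parameter, the case-sensitive comparison, and the `filter_mcp_tool_names` stand-in for the registration loop are assumptions made for the example.

```python
import os
from typing import FrozenSet, Iterable, List, Mapping, Optional

ENV_VAR = "HERMES_TOOLS_SUBSET"


def get_subset_allow(
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[FrozenSet[str]]:
    """Parse the allow-list once; None means "no filtering".

    Unset, empty, and whitespace-only values all map to None.
    The `environ` parameter is an assumption added for testability.
    """
    env = os.environ if environ is None else environ
    raw = env.get(ENV_VAR, "")
    names = frozenset(part.strip() for part in raw.split(",") if part.strip())
    return names or None


def is_tool_allowed(name: str, allow: Optional[FrozenSet[str]]) -> bool:
    """Exact match only (assumed case-sensitive); None passes everything."""
    return allow is None or name in allow


def filter_mcp_tool_names(
    server: str, tool_names: Iterable[str], allow: Optional[FrozenSet[str]]
) -> List[str]:
    """Hypothetical stand-in for the per-tool loop at registration time.

    The comparison uses the prefixed name mcp_<server>_<tool>, so an
    unprefixed entry in the allow-list never matches an MCP tool.
    """
    kept = []
    for tool in tool_names:
        prefixed = f"mcp_{server}_{tool}"
        if is_tool_allowed(prefixed, allow):
            kept.append(prefixed)
    return kept
```

Parsing once per server registration and passing `allow` into the loop keeps the per-tool check a set lookup, which matches the O(1)-per-tool intent stated above.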
This was referenced May 24, 2026
PowerCreek added a commit that referenced this pull request on May 25, 2026
#96)

Operators who narrow the tool surface via HERMES_TOOLS_SUBSET can now confirm at ``hermes doctor`` time exactly which tools the filter parsed to. Catches two failure modes that previously required a separate ``hermes mcp list`` diff:

1. Operator typoed a tool name → still in the parsed list (no cross-check), but the diff against ``hermes mcp list`` is now trivial.
2. Operator forgot the ``mcp_<server>_<tool>`` prefix for MCP tools → no entry uses ``mcp_`` prefix but entries look structured → info reminder fires.

Silent when env var is unset/empty (silent-when-irrelevant pattern from the #88/#53/#54 doctor probes). When set, surfaces:

* check_ok with the count + a sample of names (first 6, then ``+N more`` suffix to keep the row readable);
* check_info reminder when zero entries use the mcp_ prefix but some look structured (the most common parse-correctly-but-filter-nothing failure mode).

Cross-check against the live MCP registry was considered + rejected for this PR because it would require spinning up ``create_mcp_server()`` at probe time. Operators can ``hermes mcp list`` separately if they want the full diff. Filing as opportunistic follow-up if demand shows up.

## Tests

- 8 new tests in tests/hermes_cli/test_doctor_tools_subset_probe.py: silent-when-unset / silent-when-empty / silent-when-whitespace / count-and-sample-shown / long-list-truncated / mcp-prefix-reminder / no-reminder-when-mcp-present / no-reminder-when-only-simple-bare-names.
- 30 total green across affected suites (probe + provider-env-probe + mcp_subset_filter). No regression.
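The probe behavior described above (silent when unset, count plus a six-name sample, and a prefix reminder) could look roughly like the sketch below. Everything here is illustrative: `summarize_subset`, the `(kind, message)` return shape, the message wording, and in particular the `looks_structured` heuristic are assumptions, since the commit message does not say how "structured" is detected or what signature `check_ok` / `check_info` have.

```python
from typing import List, Optional, Tuple

SAMPLE_SIZE = 6  # "first 6, then +N more" per the commit message


def looks_structured(name: str) -> bool:
    """Placeholder heuristic (assumption): a hyphen or 2+ underscores
    suggests a <server>_<tool> shape rather than a simple bare name."""
    return "-" in name or name.count("_") >= 2


def summarize_subset(raw: Optional[str]) -> List[Tuple[str, str]]:
    """Return (kind, message) rows; an empty list means the probe is silent."""
    names = [part.strip() for part in (raw or "").split(",") if part.strip()]
    if not names:
        return []  # unset / empty / whitespace-only: nothing to report

    sample = ", ".join(names[:SAMPLE_SIZE])
    extra = len(names) - SAMPLE_SIZE
    if extra > 0:
        sample += f", +{extra} more"
    rows = [("ok", f"HERMES_TOOLS_SUBSET parsed to {len(names)} tool(s): {sample}")]

    # Reminder only when no entry is MCP-prefixed but some look like MCP names.
    has_mcp_prefix = any(name.startswith("mcp_") for name in names)
    if not has_mcp_prefix and any(looks_structured(name) for name in names):
        rows.append(("info", "MCP tools must be listed as mcp_<server>_<tool>"))
    return rows
```

Returning rows instead of printing keeps the sketch easy to test; a real probe would presumably hand each row to the doctor's own reporting helpers.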
PowerCreek added a commit that referenced this pull request on May 28, 2026
…143) (#145)

T3 of the #143 thin-client refactor scope. When the active provider is ``devagentic-local``, augment ``disabled_toolsets`` with ``"clarify"`` before the ``get_tool_definitions`` call.

## Rationale

Per devagentic#203 §1.3 + the #143 scope: devagentic-side intent classifier knows when clarification is actually needed and can surface it via an OpenAI-shaped assistant message. Hermes' modal TUI clarify-tool was a layered opinion fighting devagentic-side classification. Both rotation's debug evidence + sandbox UX showed the dual-source as confusing (clarify modal popped even with --yolo, ignoring devagentic-side intent signals).

This is the smallest of the T1-T3 sequence and the cleanest revert path: purely a tool-registry adjustment for one provider.

## Behavior

| Setting | Before | After |
|---|---|---|
| provider=devagentic-local, no --enable-toolset | clarify enabled | clarify implicitly disabled |
| provider=devagentic-local, --enable-toolset clarify | clarify enabled | clarify enabled (explicit override) |
| provider=other | unchanged | unchanged |

A boot-line print informs operators of the implicit disable + how to re-enable for legacy workflows.

Composes naturally with the existing ``HERMES_TOOLS_SUBSET`` narrowing (#75/#87): disable happens first, then subset narrows further if set.

## Tests

- 4 source-level tests in ``tests/agent/test_t3_clarify_default_out.py``: patch-landed, explicit-enable-overrides-implicit-disable, re-enable hint visible in print message, strict-equality on provider name (no prefix/alias matching to avoid surprise on related providers).
- 21 total green across affected suites (T3 + diag-env-gate).

## Composition

Per #143 sequencing (T3 → T1 → T2-gated → T2-default-flip):

- This PR: T3 (clarify default-out)
- Next: T1 (HERMES_DEFER_PERSONA default-flip for devagentic-local)
- Then: T2 (empty-content recovery removal, env-gated then default)
- Later: T4-T6 (tool list / iteration cap / summary fallback)

## Preserved through the refactor

- PR #119 (cascade_exhausted short-circuit): hermes correctly deferring to devagentic; NOT recovery
- PR #122/#125 (raw tool_calls fallback): pre-recovery wire parsing; belt-and-suspenders against future streaming-chunker regressions
- PR #131/#136/#138/#141 diagnostics: env-gated via HERMES_DIAG_RAW_CAPTURE; no-op when off
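The implicit clarify disable from the Behavior table in this commit can be illustrated with a small function. This is a sketch under assumptions: `resolve_disabled_toolsets` and its parameters are hypothetical names, and the real change is described only as augmenting ``disabled_toolsets`` before the ``get_tool_definitions`` call.

```python
from typing import Iterable, List

IMPLICIT_DISABLE_PROVIDER = "devagentic-local"


def resolve_disabled_toolsets(
    provider: str,
    enabled_toolsets: Iterable[str],
    disabled_toolsets: Iterable[str],
) -> List[str]:
    """Return the disabled list to use when building tool definitions.

    Strict equality on the provider name: no prefix or alias matching,
    so related providers are left untouched. An explicit enable of
    "clarify" overrides the implicit disable.
    """
    disabled = list(disabled_toolsets)
    explicitly_enabled = "clarify" in set(enabled_toolsets)
    if (
        provider == IMPLICIT_DISABLE_PROVIDER
        and not explicitly_enabled
        and "clarify" not in disabled
    ):
        disabled.append("clarify")
    return disabled
```

Because this runs before tool definitions are built, any HERMES_TOOLS_SUBSET narrowing would then apply to a list that already lacks clarify, which is the ordering the commit describes.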
Closes #74. Quick unblock for tool-paralysis at the integration ceiling surfaced on polynomial-explorer (gist: https://gist.github.com/PowerCreek/833cda14a6528f031fcc334305e56c63). Bridges to NousResearch#210 R1/R2 dynamic per-turn routing: same hook point in `agent.tools` construction; classifier replaces the env var when R1/R2 ship.

## Summary

- `HERMES_TOOLS_SUBSET`: comma-separated allow-list of tool names. When set, `agent.tools` is filtered to only those tools at session boot.
- The filter runs in `agent/agent_init.py` immediately after `get_tool_definitions(...)` returns, before `valid_tool_names` recomputation.
- Composes with the `--enable-toolset`/`--disable-toolset` flags (toolset filter runs first; this further narrows what would otherwise be enabled).

## Operator usage

When narrowing is active and `quiet_mode=False`, one INFO line is logged at session start.

## Test plan

Tests in `tests/agent/test_tools_subset.py` cover:

- env unset / empty / whitespace-only → no filter
- comma-separated narrowing
- whitespace-in-names stripping
- unknown names silently ignored
- zero-match → empty tools
- quiet mode suppresses log
- non-quiet mode logs narrowing + numbers + `<empty>` marker
- degenerate (no tools to begin with) → noop
- patch-landed-correctly smoke (asserts production code path contains the filter + the issue ref + the ordering invariant that filter runs before `valid_tool_names` recomputation)

## Why now

## Notes

- `agent.valid_tool_names` (computed immediately after the filter block) automatically reflects the narrowed set, so no extra change is needed for downstream validation.
- `HERMES_TOOLS_SUBSET=xyz,abc` where neither name matches yields an empty `agent.tools`. This is intentional: an operator may want a no-tools session via subset (rare but valid).

## Deploy

Hermes-side change → needs the same container rebuild as #68/#69 to land on devbox (G6/#66 deploy gap). Companion bash cliff-probe script (gist comment 6165358) helps operators tune per-model subset sizes once deployed.
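For readers who want a concrete picture of the boot-time filter, here is a minimal sketch of the semantics this PR describes. It is not the code in `agent/agent_init.py`: the function name `apply_tools_subset`, the logger name, the log message format, and the assumption that each tool definition is an OpenAI-style dict with a `function.name` field are all illustrative.

```python
import logging
from typing import Any, Dict, List, Optional, Set, Tuple

logger = logging.getLogger("hermes.tools_subset")  # hypothetical logger name

ToolDef = Dict[str, Any]


def _tool_name(tool: ToolDef) -> str:
    # Assumes OpenAI-style tool definitions: {"function": {"name": ...}}
    return tool["function"]["name"]


def apply_tools_subset(
    tools: List[ToolDef],
    raw_subset: Optional[str],
    quiet_mode: bool = False,
) -> Tuple[List[ToolDef], Set[str]]:
    """Narrow `tools` to the allow-list and recompute the valid-name set.

    Unset / empty / whitespace-only values, or an empty tool list, are a no-op.
    Unknown names are ignored; zero matches intentionally yields no tools.
    """
    allow = {part.strip() for part in (raw_subset or "").split(",") if part.strip()}
    if not allow or not tools:
        return tools, {_tool_name(t) for t in tools}

    narrowed = [t for t in tools if _tool_name(t) in allow]
    if not quiet_mode:
        kept = ", ".join(_tool_name(t) for t in narrowed) or "<empty>"
        logger.info(
            "HERMES_TOOLS_SUBSET narrowed tools %d -> %d: %s",
            len(tools), len(narrowed), kept,
        )
    # Valid names are derived after filtering, so they reflect the narrowed set.
    return narrowed, {_tool_name(t) for t in narrowed}
```

A call such as `apply_tools_subset(tools, os.environ.get("HERMES_TOOLS_SUBSET"))` would sit right after the tool definitions are built; deriving the valid-name set from the returned list is what keeps downstream validation consistent without further changes.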