feat(mcp): extend HERMES_TOOLS_SUBSET to filter MCP tools (closes #86) #87
Merged
…loses #86)

Post-#82/#83/#85 the autowire surfaces ~36 MCP tools from hermes-internal, pushing fresh workers to 52 total tools and into the tool-paralysis ceiling (text responses with affirmation pattern, no tool_call emission for verticals that should be one-shot).

HERMES_TOOLS_SUBSET (#75) was supposed to let operators narrow the surface per worker, but it had three gaps for MCP tools:

1. Subset filtering only ran in agent_init.py:838 (built-in tool path).
2. cli.py:9790 (/reload-mcp + auto-reload on config change) re-assigns agent.tools without re-applying the filter: a regression risk where any post-init MCP server reload nukes the subset.
3. Even when filtering at agent.tools, the registry still carried the unwanted MCP tools, costing schema-conversion + collision-check work per tool per boot.

Fix: apply the allow-list at MCP tool registration in tools/mcp_tool.py::_register_server_tools. Excluded tools never enter the registry, so both initial discovery AND /reload-mcp paths honor the subset uniformly + the registry stays clean.

Three edits:

1. hermes_cli/tool_subset.py (new): shared helpers `get_subset_allow()` + `is_tool_allowed()`. Single source of truth for env parsing + allow-list semantics so the two call sites (agent_init + mcp_tool) can't drift in casing/whitespace/empty-vs-missing handling.
2. tools/mcp_tool.py::_register_server_tools: read the subset once per server registration (O(1) check per tool, not O(N) env parse), apply inside both the per-tool loop AND the utility-tools loop (list_resources/read_resource/list_prompts/get_prompt).
3. agent/agent_init.py: refactored the inline #75 filter to use the shared helper. Behavior unchanged; dedupes parsing logic.

Subset compares against the **prefixed** MCP name (`mcp_<server>_<tool>`): exact match, predictable behavior. Fuzzy / unprefixed matching is a separate feature request.

Tests (tests/test_mcp_subset_filter.py, 14 cases, all pass):

- get_subset_allow: unset/empty/whitespace-only/single/multi/padded/MCP-prefixed parse cases
- is_tool_allowed: None passes through, exact match, exclusion, no substring matching, MCP prefixed vs unprefixed (contract test)
- Source-level: _register_server_tools imports the shared helper + calls it inside main loop
- Source-level: same call inside the utility-tools loop (regression catcher for "fix the main loop, forget utility tools")
- Source-level: agent_init.py uses shared helper, no stale inline parser

37 total green (14 new + 20 from #83 + 3 from #85).

Final layer of the post-#67 cascade. After this lands + container rebuild, workers can be configured with HERMES_TOOLS_SUBSET to a 5-tool surface that includes their specific MCP needs (e.g. "doc_view,mcp_hermes-internal_grafted_context_fetch,..."), out of tool-paralysis range.
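The allow-list semantics described in this commit message can be sketched roughly as follows. This is an illustration, not the contents of `hermes_cli/tool_subset.py`: the two function names come from the message, but the signatures, the optional `env` parameter, the comma separator (inferred from the example value), the case-sensitive comparison, and the `frozenset` return type are all assumptions.

```python
import os
from typing import FrozenSet, Mapping, Optional

ENV_VAR = "HERMES_TOOLS_SUBSET"


def get_subset_allow(
    env: Optional[Mapping[str, str]] = None,
) -> Optional[FrozenSet[str]]:
    """Parse the env var into an allow-list, or None when no filter applies.

    ASSUMPTION: entries are comma-separated and compared case-sensitively.
    Unset, empty, and whitespace-only values all mean "no filter".
    """
    source = os.environ if env is None else env
    raw = source.get(ENV_VAR, "")
    # Strip padding around each entry and drop empty fragments ("a,,b", "a,").
    names = frozenset(part.strip() for part in raw.split(",") if part.strip())
    return names or None


def is_tool_allowed(name: str, allow: Optional[FrozenSet[str]]) -> bool:
    """Exact-match membership test; a None allow-list lets every tool through."""
    if allow is None:
        return True
    # Set membership is exact: no substring or unprefixed matching.
    return name in allow


# Usage: the MCP tool has to be listed under its prefixed name.
allow = get_subset_allow(
    {ENV_VAR: " doc_view , mcp_hermes-internal_grafted_context_fetch "}
)
prefixed_ok = is_tool_allowed("mcp_hermes-internal_grafted_context_fetch", allow)
unprefixed_ok = is_tool_allowed("grafted_context_fetch", allow)
```

Under these assumptions `prefixed_ok` is true and `unprefixed_ok` is false, which mirrors the "MCP prefixed vs unprefixed" contract test listed above.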
This was referenced May 24, 2026
Closed
PowerCreek added a commit that referenced this pull request on May 25, 2026
#96)

Operators who narrow the tool surface via HERMES_TOOLS_SUBSET can now confirm at ``hermes doctor`` time exactly which tools the filter parsed to. Catches two failure modes that previously required a separate ``hermes mcp list`` diff:

1. Operator typoed a tool name → still in the parsed list (no cross-check), but the diff against ``hermes mcp list`` is now trivial.
2. Operator forgot the ``mcp_<server>_<tool>`` prefix for MCP tools → no entry uses ``mcp_`` prefix but entries look structured → info reminder fires.

Silent when env var is unset/empty (silent-when-irrelevant pattern from the #88/#53/#54 doctor probes). When set, surfaces:

* check_ok with the count + a sample of names (first 6, then ``+N more`` suffix to keep the row readable);
* check_info reminder when zero entries use the mcp_ prefix but some look structured (the most common parse-correctly-but-filter-nothing failure mode).

Cross-check against the live MCP registry was considered + rejected for this PR: it would require spinning up ``create_mcp_server()`` at probe time. Operators can ``hermes mcp list`` separately if they want the full diff. Filing as opportunistic follow-up if demand shows up.

## Tests

- 8 new tests in tests/hermes_cli/test_doctor_tools_subset_probe.py: silent-when-unset / silent-when-empty / silent-when-whitespace / count-and-sample-shown / long-list-truncated / mcp-prefix-reminder / no-reminder-when-mcp-present / no-reminder-when-only-simple-bare-names.
- 30 total green across affected suites (probe + provider-env-probe + mcp_subset_filter). No regression.
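The probe behavior described in this commit can be sketched as below. It is an approximation under stated assumptions, not the actual `hermes doctor` code: `probe_tools_subset`, `looks_structured`, the `(level, message)` row shape, and the message wording are hypothetical. In particular, the commit does not say how "looks structured" is decided, so the hyphen-or-two-underscores heuristic is a guess.

```python
from typing import List, Tuple

SAMPLE_SIZE = 6  # "first 6" per the commit message


def looks_structured(name: str) -> bool:
    # ASSUMPTION: the real heuristic is not given in the commit message.
    # Here a name counts as structured if it has a hyphen or 2+ underscores.
    return "-" in name or name.count("_") >= 2


def probe_tools_subset(raw: str) -> List[Tuple[str, str]]:
    """Return (level, message) rows; an empty list means the probe stays silent."""
    names = [part.strip() for part in raw.split(",") if part.strip()]
    if not names:
        return []  # unset, empty, or whitespace-only: nothing to report

    # Show a bounded sample so the row stays readable for long lists.
    sample = ", ".join(names[:SAMPLE_SIZE])
    extra = len(names) - SAMPLE_SIZE
    if extra > 0:
        sample += f", +{extra} more"
    rows = [("ok", f"HERMES_TOOLS_SUBSET parsed to {len(names)} tool(s): {sample}")]

    # Reminder fires only when no entry is MCP-prefixed but some look structured.
    has_mcp_prefix = any(name.startswith("mcp_") for name in names)
    if not has_mcp_prefix and any(looks_structured(name) for name in names):
        rows.append(
            ("info", "no entry uses the mcp_<server>_<tool> prefix; "
                     "MCP tools must be listed by their prefixed name")
        )
    return rows
```

With this heuristic, bare names such as `doc_view` or `kanban_done` do not trigger the reminder, while an entry like `hermes-internal_grafted_context_fetch` (prefix forgotten) does.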
PowerCreek added a commit that referenced this pull request on May 28, 2026
…143) (#145)

T3 of the #143 thin-client refactor scope. When the active provider is ``devagentic-local``, augment ``disabled_toolsets`` with ``"clarify"`` before the ``get_tool_definitions`` call.

## Rationale

Per devagentic#203 §1.3 + the #143 scope: devagentic-side intent classifier knows when clarification is actually needed and can surface it via an OpenAI-shaped assistant message. Hermes' modal TUI clarify-tool was a layered opinion fighting devagentic-side classification: both rotation's debug evidence + sandbox UX showed the dual-source as confusing (clarify modal popped even with --yolo, ignoring devagentic-side intent signals).

This is the smallest of the T1-T3 sequence and the cleanest revert path: purely a tool-registry adjustment for one provider.

## Behavior

| Setting | Before | After |
|---|---|---|
| provider=devagentic-local, no --enable-toolset | clarify enabled | clarify implicitly disabled |
| provider=devagentic-local, --enable-toolset clarify | clarify enabled | clarify enabled (explicit override) |
| provider=other | unchanged | unchanged |

A boot-line print informs operators of the implicit disable + how to re-enable for legacy workflows.

Composes naturally with the existing ``HERMES_TOOLS_SUBSET`` narrowing (#75/#87): disable happens first, then subset narrows further if set.

## Tests

- 4 source-level tests in ``tests/agent/test_t3_clarify_default_out.py``: patch-landed, explicit-enable-overrides-implicit-disable, re-enable hint visible in print message, strict-equality on provider name (no prefix/alias matching to avoid surprise on related providers).
- 21 total green across affected suites (T3 + diag-env-gate).
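The behavior table above can be read as a small pure function. The sketch below is illustrative only: `effective_disabled_toolsets` and its parameters are hypothetical names, and the real change augments ``disabled_toolsets`` inline before the ``get_tool_definitions`` call rather than through a helper like this.

```python
from typing import Iterable, Set

# Provider name from the commit message; compared with strict equality.
IMPLICIT_CLARIFY_PROVIDER = "devagentic-local"


def effective_disabled_toolsets(
    provider: str,
    disabled: Iterable[str],
    explicitly_enabled: Iterable[str],
) -> Set[str]:
    """Add "clarify" to the disabled set for one provider unless re-enabled.

    ASSUMPTION: `explicitly_enabled` stands in for whatever the CLI collects
    from --enable-toolset; the real plumbing is not shown in the commit.
    """
    result = set(disabled)
    override = "clarify" in set(explicitly_enabled)
    # Strict equality: related provider names (prefixes, aliases) are untouched.
    if provider == IMPLICIT_CLARIFY_PROVIDER and not override:
        result.add("clarify")
    return result


# Rows of the behavior table, in order.
row_default = effective_disabled_toolsets("devagentic-local", [], [])
row_override = effective_disabled_toolsets("devagentic-local", [], ["clarify"])
row_other = effective_disabled_toolsets("other", [], [])
```

Any ``HERMES_TOOLS_SUBSET`` narrowing would then apply to whatever toolsets remain enabled, matching the "disable happens first, then subset narrows further" ordering stated above.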
## Composition

Per #143 sequencing (T3 → T1 → T2-gated → T2-default-flip):

- This PR: T3 (clarify default-out)
- Next: T1 (HERMES_DEFER_PERSONA default-flip for devagentic-local)
- Then: T2 (empty-content recovery removal, env-gated then default)
- Later: T4-T6 (tool list / iteration cap / summary fallback)

## Preserved through the refactor

- PR #119 (cascade_exhausted short-circuit): hermes correctly deferring to devagentic; NOT recovery
- PR #122/#125 (raw tool_calls fallback): pre-recovery wire parsing; belt-and-suspenders against future streaming-chunker regressions
- PR #131/#136/#138/#141 diagnostics: env-gated via HERMES_DIAG_RAW_CAPTURE; no-op when off
Closes #86.
## Bug

Post-#82/#83/#85, autowire surfaces ~36 MCP tools from `hermes-internal`, pushing fresh workers to 52 total tools, over the tool-paralysis ceiling. Workers produce text-with-affirmation responses instead of emitting `tool_call` for verticals that should be one-shot.

`HERMES_TOOLS_SUBSET` (#75) was supposed to let operators narrow the surface per worker, but had three gaps for MCP tools:

1. Subset filtering only ran in `agent_init.py:838` (built-in tool path)
2. `cli.py:9790` (`/reload-mcp` + auto-reload on config change) re-assigns `agent.tools` without re-applying the filter: regression risk
3. Even when filtering at `agent.tools`, the registry still carried unwanted MCP tools, paying schema-conversion + collision-check cost per tool per boot

## Fix

Apply the allow-list at MCP tool registration in `tools/mcp_tool.py::_register_server_tools`. Excluded tools never enter the registry, so both initial discovery AND `/reload-mcp` paths honor the subset uniformly + registry stays clean.

## Three edits

1. `hermes_cli/tool_subset.py` (new): shared helpers `get_subset_allow()` + `is_tool_allowed()`. Single source of truth for env parsing + allow-list semantics so the two call sites can't drift.
2. `tools/mcp_tool.py::_register_server_tools`: reads subset once per server registration (O(1) per-tool check, not O(N) env parse), applied inside both the per-tool loop AND the utility-tools loop (`list_resources`/`read_resource`/`list_prompts`/`get_prompt`).
3. `agent/agent_init.py`: refactored the inline #75 filter to use the shared helper. Behavior unchanged; dedupes parsing logic.

## Contract

Subset compares against the prefixed MCP name (`mcp_<server>_<tool>`): exact match, predictable behavior. Fuzzy/unprefixed matching is a separate feature request.

Example: `HERMES_TOOLS_SUBSET=doc_view,mcp_hermes-internal_grafted_context_fetch` → registry contains exactly those 2 tools.

## Tests (`tests/test_mcp_subset_filter.py`, 14 cases, all pass)

- `get_subset_allow`: unset/empty/whitespace-only/single/multi/padded/MCP-prefixed parse cases
- `is_tool_allowed`: None passes through, exact match, exclusion, no substring matching, MCP prefixed vs unprefixed contract test
- Source-level: `_register_server_tools` imports the shared helper + calls it inside main loop
- Source-level: same call inside the utility-tools loop
- Source-level: `agent_init.py` uses shared helper, no stale inline parser

37 total green in MCP-cascade suite (14 new + 20 from #83 + 3 from #85).
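To illustrate why filtering at registration time covers both initial discovery and `/reload-mcp`, here is a toy model of the approach. It is a sketch under assumptions, not the code in `tools/mcp_tool.py`: the dict-based registry, the `register_server_tools` and `reload_server` signatures, and the idea that utility tools share the `mcp_<server>_<tool>` naming are all hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, Optional

# Utility tool names as listed in the PR description.
UTILITY_TOOLS = ("list_resources", "read_resource", "list_prompts", "get_prompt")


def register_server_tools(
    registry: Dict[str, str],
    server: str,
    tool_names: Iterable[str],
    allow: Optional[FrozenSet[str]],
) -> None:
    """Register one server's tools; excluded names never enter the registry."""
    # Per-tool loop: compare the prefixed name against the allow-list.
    for tool in tool_names:
        prefixed = f"mcp_{server}_{tool}"
        if allow is None or prefixed in allow:
            registry[prefixed] = server
    # Utility-tools loop: same check, so these cannot bypass the subset.
    # ASSUMPTION: utility tools use the same prefixed naming scheme.
    for util in UTILITY_TOOLS:
        prefixed = f"mcp_{server}_{util}"
        if allow is None or prefixed in allow:
            registry[prefixed] = server


def reload_server(
    registry: Dict[str, str],
    server: str,
    tool_names: Iterable[str],
    allow: Optional[FrozenSet[str]],
) -> None:
    """Toy stand-in for /reload-mcp: drop the server's entries, re-register."""
    for name in [n for n, owner in registry.items() if owner == server]:
        del registry[name]
    # Re-registration goes through the same filter, so the subset survives.
    register_server_tools(registry, server, tool_names, allow)


# Usage: only the allowed MCP tool is registered, before and after a reload.
allow = frozenset({"doc_view", "mcp_hermes-internal_grafted_context_fetch"})
registry: Dict[str, str] = {}
discovered = ["grafted_context_fetch", "lane_h_list", "silo_query"]
register_server_tools(registry, "hermes-internal", discovered, allow)
after_boot = sorted(registry)
reload_server(registry, "hermes-internal", discovered, allow)
after_reload = sorted(registry)
```

In this model `doc_view` never appears in the MCP registry because it is a built-in tool; it would be handled by the separate built-in path in `agent_init.py`.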
## Final layer of the post-#67 cascade

After merge + container rebuild, workers can be spawned with e.g. `HERMES_TOOLS_SUBSET="doc_view,mcp_hermes-internal_grafted_context_fetch,mcp_hermes-internal_lane_h_list,mcp_hermes-internal_silo_query,kanban_done"`: a narrow surface that includes the specific MCP tools the vertical needs, out of tool-paralysis range.

## Test plan

- … `HERMES_TOOLS_SUBSET="kanban_done,mcp_hermes-internal_grafted_context_fetch"`
- … `/tools` output or worker boot banner)
- … `tool_call` against `grafted_context_fetch` rather than text affirmation
- … `/reload-mcp` post-boot → subset still respected
- … `HERMES_TOOLS_SUBSET` → all tools register (baseline preserved)

🤖 Generated with Claude Code