feat(env): HERMES_TOOLS_SUBSET — operator-side tool surface narrowing (closes #74) by PowerCreek · Pull Request #75 · TechDevGroup/hermes-agent

PowerCreek · 2026-05-24T06:20:40Z

Closes #74. Quick unblock for tool-paralysis at the integration ceiling surfaced on polynomial-explorer (gist: https://gist.github.com/PowerCreek/833cda14a6528f031fcc334305e56c63). Bridges to NousResearch#210 R1/R2 dynamic per-turn routing — same hook point in agent.tools construction; classifier replaces the env var when R1/R2 ship.

Summary

New env var HERMES_TOOLS_SUBSET — comma-separated allow-list of tool names. When set, agent.tools is filtered to only those tools at session boot.
Filter sits in agent/agent_init.py immediately after get_tool_definitions(...) returns, before valid_tool_names recomputation.
Composes with existing --enable-toolset / --disable-toolset flags (toolset filter runs first; this further narrows what would otherwise be enabled).
Empty / unset env preserves current behavior (no filtering, all loaded tools attached).
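
The summary above can be sketched roughly as follows. This is a minimal illustration, not the code in agent/agent_init.py: the function name `filter_tools_by_subset` is hypothetical, and the OpenAI-style tool dict shape (`{"type": "function", "function": {"name": ...}}`) is an assumption about what get_tool_definitions(...) returns.

```python
import os


def filter_tools_by_subset(tools, env=None):
    """Narrow tool definitions to the HERMES_TOOLS_SUBSET allow-list.

    Hypothetical helper: the name and the tool dict shape are assumptions.
    """
    env = os.environ if env is None else env
    raw = env.get("HERMES_TOOLS_SUBSET", "")
    # Whitespace is stripped per name; empty entries are dropped.
    allow = {name.strip() for name in raw.split(",") if name.strip()}
    if not allow:
        # Unset, empty, or whitespace-only: preserve current behavior.
        return list(tools)
    # Exact, case-sensitive match; names not in `tools` simply never match.
    return [t for t in tools if t["function"]["name"] in allow]


tools = [
    {"type": "function", "function": {"name": n}}
    for n in ("doc_search", "doc_write", "silo_query")
]
narrowed = filter_tools_by_subset(
    tools, {"HERMES_TOOLS_SUBSET": " doc_search , silo_query,not_a_tool"}
)
# Recomputed after the filter, so it reflects the narrowed set.
valid_tool_names = {t["function"]["name"] for t in narrowed}
```

Because the valid-name set is derived after filtering, downstream validation sees only the narrowed surface without further changes.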

Operator usage

# Polynomial-explorer in observation mode (read-only):
HERMES_TOOLS_SUBSET=grafted_context_fetch,lane_h_list,lane_h_fetch,doc_search,silo_query hermes

# Polynomial-explorer in execution mode (read + confer + write):
HERMES_TOOLS_SUBSET=grafted_context_fetch,doc_search,doc_write,silo_query,confer_run,file_issue hermes

# Default profile dev work (no narrowing):
hermes  # all loaded tools attached as today

When narrowing is active and quiet_mode=False, one INFO line is logged at session start:

🎯 HERMES_TOOLS_SUBSET narrowed tool surface: 33 → 5 (confer_run, doc_search, doc_write, grafted_context_fetch, silo_query)
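
One way such a line could be assembled is sketched below. The function name `format_narrowing_line` is hypothetical; sorting the names alphabetically is inferred from the example line above, and the placement of the `<empty>` marker (mentioned in the test plan) is an assumption.

```python
def format_narrowing_line(before, kept_names):
    """Build the session-start narrowing message (illustrative sketch only)."""
    # Assumption: names are sorted, and "<empty>" stands in for a zero-match subset.
    names = ", ".join(sorted(kept_names)) or "<empty>"
    return (
        "🎯 HERMES_TOOLS_SUBSET narrowed tool surface: "
        f"{before} → {len(kept_names)} ({names})"
    )


line = format_narrowing_line(
    33,
    ["silo_query", "doc_write", "confer_run", "grafted_context_fetch", "doc_search"],
)
print(line)
```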

Test plan

12 tests in tests/agent/test_tools_subset.py cover: env unset / empty / whitespace-only → no filter, comma-separated narrowing, whitespace-in-names stripping, unknown-names silently ignored, zero-match → empty tools, quiet mode suppresses log, non-quiet mode logs narrowing + numbers + <empty> marker, degenerate (no tools to begin with) → noop, patch-landed-correctly smoke (asserts production code path contains the filter + the issue ref + the ordering invariant that filter runs before valid_tool_names recomputation).
All 12 pass.

Why now

Unblocks polynomial-explorer immediately. A direct-API probe with N=1 tool produced a perfect tool_call; hermes attaching 33 tools per turn produced empty responses across mistral-large + coding-groq. With this change the operator narrows to 5-7 essential tools per session, and the worker actually makes progress on its mission.
Composes with already-shipped fixes:
- #71 lazy-load preamble (PR NousResearch/hermes-agent#220/#72/#73, in Review): cuts graft-content from per-turn request.
- This PR: cuts tool-schema from per-turn request.
- #69 structural-empty recovery: catches any residual empties as last-resort.
- NousResearch/hermes-agent#210 R1/R2 (parent, future): replaces this env var with dynamic per-turn classifier.

Notes

Names not present in the registry are silently ignored. Plugins can add/remove tools at runtime; pre-validation against a stale registry would over-warn.
Filter is intentionally case-sensitive — tool names are exact strings; matching is whitespace-stripped per name only.
agent.valid_tool_names (computed immediately after the filter block) automatically reflects the narrowed set — no extra change needed for downstream validation.
Setting HERMES_TOOLS_SUBSET=xyz,abc where neither matches yields an empty agent.tools — intentional. Operator may want a no-tools session via subset (rare but valid).

Deploy

Hermes-side change → needs the same container rebuild as #68/#69 to land on devbox (G6/#66 deploy gap). Companion bash cliff-probe script (gist comment 6165358) helps operators tune per-model subset sizes once deployed.


…loses #86) (#87)

Post-#82/#83/#85 the autowire surfaces ~36 MCP tools from hermes-internal, pushing fresh workers to 52 total tools and into the tool-paralysis ceiling (text responses with affirmation pattern, no tool_call emission for verticals that should be one-shot).

#75 (HERMES_TOOLS_SUBSET) was supposed to let operators narrow the surface per worker, but it had three gaps for MCP tools:

1. Subset filtering only ran in agent_init.py:838 (built-in tool path)
2. cli.py:9790 (/reload-mcp + auto-reload on config change) re-assigns agent.tools without re-applying the filter — regression risk where any post-init MCP server reload nukes the subset
3. Even when filtering at agent.tools, the registry still carried the unwanted MCP tools, costing schema-conversion + collision-check work per tool per boot

Fix: apply the allow-list at MCP tool registration in tools/mcp_tool.py::_register_server_tools. Excluded tools never enter the registry, so both initial discovery AND /reload-mcp paths honor the subset uniformly + the registry stays clean.

Three edits:

1. hermes_cli/tool_subset.py (new) — shared helpers `get_subset_allow()` + `is_tool_allowed()`. Single source of truth for env parsing + allow-list semantics so the two call sites (agent_init + mcp_tool) can't drift in casing/whitespace/empty-vs-missing handling.
2. tools/mcp_tool.py::_register_server_tools — read the subset once per server registration (O(1) check per tool, not O(N) env parse), apply inside both the per-tool loop AND the utility-tools loop (list_resources/read_resource/list_prompts/get_prompt).
3. agent/agent_init.py — refactored the inline #75 filter to use the shared helper. Behavior unchanged; dedupes parsing logic.

Subset compares against the **prefixed** MCP name (`mcp_<server>_<tool>`) — exact match, predictable behavior. Fuzzy / unprefixed matching is a separate feature request.

Tests (tests/test_mcp_subset_filter.py — 14 cases, all pass):

- get_subset_allow: unset/empty/whitespace-only/single/multi/padded/MCP-prefixed parse cases
- is_tool_allowed: None passes through, exact match, exclusion, no substring matching, MCP prefixed vs unprefixed (contract test)
- Source-level: _register_server_tools imports the shared helper + calls it inside main loop
- Source-level: same call inside the utility-tools loop (regression catcher for "fix the main loop, forget utility tools")
- Source-level: agent_init.py uses shared helper, no stale inline parser

37 total green (14 new + 20 from #83 + 3 from #85).

Final layer of the post-#67 cascade. After this lands + container rebuild, workers can be configured with HERMES_TOOLS_SUBSET to a 5-tool surface that includes their specific MCP needs (e.g. "doc_view,mcp_hermes-internal_grafted_context_fetch,..."), out of tool-paralysis range.
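
A rough sketch of how the shared helpers and the registration-time check described in this commit message might fit together. The names `get_subset_allow()` and `is_tool_allowed()` come from the message itself; their signatures, the `None` return for an unset variable, and the `register_server_tools` stand-in with a plain dict registry are assumptions made for illustration.

```python
import os
from typing import FrozenSet, Mapping, Optional


def get_subset_allow(env: Optional[Mapping[str, str]] = None) -> Optional[FrozenSet[str]]:
    """Parse HERMES_TOOLS_SUBSET; None means no narrowing (assumed contract)."""
    env = os.environ if env is None else env
    raw = env.get("HERMES_TOOLS_SUBSET", "")
    names = frozenset(p.strip() for p in raw.split(",") if p.strip())
    return names or None


def is_tool_allowed(name: str, allow: Optional[FrozenSet[str]]) -> bool:
    """None passes everything through; otherwise exact match, no substrings."""
    return allow is None or name in allow


def register_server_tools(server, tool_names, registry, env=None):
    """Hypothetical stand-in for the MCP registration loop."""
    allow = get_subset_allow(env)  # parsed once per server, not once per tool
    for tool in tool_names:
        prefixed = f"mcp_{server}_{tool}"  # the subset compares the prefixed name
        if is_tool_allowed(prefixed, allow):
            registry[prefixed] = {"server": server, "tool": tool}
    return registry


env = {"HERMES_TOOLS_SUBSET": "doc_view,mcp_hermes-internal_grafted_context_fetch"}
registry = register_server_tools(
    "hermes-internal", ["grafted_context_fetch", "lane_h_list"], {}, env
)
```

Filtering before insertion means excluded tools never reach the registry, so any later reload that rebuilds the tool list from the registry inherits the same subset.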

#96)

Operators who narrow the tool surface via HERMES_TOOLS_SUBSET can now confirm at ``hermes doctor`` time exactly which tools the filter parsed to. Catches two failure modes that previously required a separate ``hermes mcp list`` diff:

1. Operator typoed a tool name → still in the parsed list (no cross-check), but the diff against ``hermes mcp list`` is now trivial.
2. Operator forgot the ``mcp_<server>_<tool>`` prefix for MCP tools → no entry uses ``mcp_`` prefix but entries look structured → info reminder fires.

Silent when env var is unset/empty (silent-when-irrelevant pattern from the #88/#53/#54 doctor probes). When set, surfaces:

* check_ok with the count + a sample of names (first 6, then ``+N more`` suffix to keep the row readable);
* check_info reminder when zero entries use the mcp_ prefix but some look structured (the most common parse-correctly-but-filter-nothing failure mode).

Cross-check against the live MCP registry was considered + rejected for this PR — it would require spinning up ``create_mcp_server()`` at probe time. Operators can ``hermes mcp list`` separately if they want the full diff. Filing as opportunistic follow-up if demand shows up.

## Tests

- 8 new tests in tests/hermes_cli/test_doctor_tools_subset_probe.py: silent-when-unset / silent-when-empty / silent-when-whitespace / count-and-sample-shown / long-list-truncated / mcp-prefix-reminder / no-reminder-when-mcp-present / no-reminder-when-only-simple-bare-names.
- 30 total green across affected suites (probe + provider-env-probe + mcp_subset_filter). No regression.
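
The count-plus-sample row described in this commit message could be produced along these lines. This is only a sketch: `summarize_subset` is a hypothetical name and the exact row wording is an assumption; only the "first 6, then +N more" rule comes from the message. The mcp_ prefix reminder is left out because the message does not say what makes an entry "look structured".

```python
def summarize_subset(names, sample_size=6):
    """Return a short, readable summary of the parsed subset (illustrative)."""
    shown = ", ".join(names[:sample_size])
    extra = len(names) - sample_size
    # Only add the suffix when names were actually truncated.
    suffix = f", +{extra} more" if extra > 0 else ""
    return f"{len(names)} tool(s): {shown}{suffix}"


short_row = summarize_subset(["doc_search", "silo_query"])
long_row = summarize_subset(["a", "b", "c", "d", "e", "f", "g", "h"])
```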

…143) (#145)

T3 of the #143 thin-client refactor scope. When the active provider is ``devagentic-local``, augment ``disabled_toolsets`` with ``"clarify"`` before the ``get_tool_definitions`` call.

## Rationale

Per devagentic#203 §1.3 + the #143 scope: devagentic-side intent classifier knows when clarification is actually needed and can surface it via an OpenAI-shaped assistant message. Hermes' modal TUI clarify-tool was a layered opinion fighting devagentic-side classification — both rotation's debug evidence + sandbox UX showed the dual-source as confusing (clarify modal popped even with --yolo, ignoring devagentic-side intent signals).

This is the smallest of the T1-T3 sequence and the cleanest revert path — purely a tool-registry adjustment for one provider.

## Behavior

| Setting | Before | After |
|---|---|---|
| provider=devagentic-local, no --enable-toolset | clarify enabled | clarify implicitly disabled |
| provider=devagentic-local, --enable-toolset clarify | clarify enabled | clarify enabled (explicit override) |
| provider=other | unchanged | unchanged |

A boot-line print informs operators of the implicit disable + how to re-enable for legacy workflows.

Composes naturally with the existing ``HERMES_TOOLS_SUBSET`` narrowing (#75/#87) — disable happens first, then subset narrows further if set.

## Tests

- 4 source-level tests in ``tests/agent/test_t3_clarify_default_out.py``: patch-landed, explicit-enable-overrides-implicit-disable, re-enable hint visible in print message, strict-equality on provider name (no prefix/alias matching to avoid surprise on related providers).
- 21 total green across affected suites (T3 + diag-env-gate).

## Composition

Per #143 sequencing (T3 → T1 → T2-gated → T2-default-flip):

- This PR: T3 (clarify default-out)
- Next: T1 (HERMES_DEFER_PERSONA default-flip for devagentic-local)
- Then: T2 (empty-content recovery removal, env-gated then default)
- Later: T4-T6 (tool list / iteration cap / summary fallback)

## Preserved through the refactor

- PR #119 (cascade_exhausted short-circuit) — hermes correctly deferring to devagentic; NOT recovery
- PR #122/#125 (raw tool_calls fallback) — pre-recovery wire parsing; belt-and-suspenders against future streaming-chunker regressions
- PR #131/#136/#138/#141 diagnostics — env-gated via HERMES_DIAG_RAW_CAPTURE; no-op when off
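
The behavior table in the T3 commit message above can be read as the following decision rule. This is a sketch under stated assumptions: `effective_disabled_toolsets` is a hypothetical function, and passing the enabled and disabled toolsets as plain lists is an illustration rather than the actual agent code.

```python
def effective_disabled_toolsets(provider, enabled_toolsets, disabled_toolsets):
    """Add "clarify" to the disabled list for devagentic-local (illustrative)."""
    disabled = list(disabled_toolsets)
    # Strict equality on the provider name: no prefix or alias matching.
    implicit = provider == "devagentic-local"
    # An explicit --enable-toolset clarify overrides the implicit disable.
    if implicit and "clarify" not in enabled_toolsets and "clarify" not in disabled:
        disabled.append("clarify")
    return disabled


default_case = effective_disabled_toolsets("devagentic-local", [], [])
override_case = effective_disabled_toolsets("devagentic-local", ["clarify"], [])
other_case = effective_disabled_toolsets("other", [], ["web"])
```

Under this reading, the result would feed the toolset filter first, and HERMES_TOOLS_SUBSET would then narrow whatever remains.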

feat(env): HERMES_TOOLS_SUBSET — operator-side tool surface narrowing (closes #74)

1a5615d

PowerCreek merged commit ab5d81f into main May 24, 2026

PowerCreek deleted the issue-74-tools-subset branch May 24, 2026 06:24

This was referenced May 24, 2026

HERMES_TOOLS_SUBSET doesn't filter MCP-server-provided tools #86

Closed

feat(mcp): extend HERMES_TOOLS_SUBSET to filter MCP tools (closes #86) #87

Merged

PowerCreek mentioned this pull request May 25, 2026

Direction A scoping: per-intent system-prompt narrowing (#89 follow-up; composes with devagentic#237) #97

Closed

PowerCreek merged 1 commit into main from issue-74-tools-subset
