
feat(env): HERMES_TOOLS_SUBSET — operator-side tool surface narrowing (closes #74)#75

Merged: PowerCreek merged 1 commit into main from issue-74-tools-subset on May 24, 2026


@PowerCreek


Closes #74. Quick unblock for tool-paralysis at the integration ceiling surfaced on polynomial-explorer (gist: https://gist.github.com/PowerCreek/833cda14a6528f031fcc334305e56c63). Bridges to NousResearch#210 R1/R2 dynamic per-turn routing — same hook point in agent.tools construction; classifier replaces the env var when R1/R2 ship.

Summary

  • New env var HERMES_TOOLS_SUBSET — comma-separated allow-list of tool names. When set, agent.tools is filtered to only those tools at session boot.
  • Filter sits in agent/agent_init.py immediately after get_tool_definitions(...) returns, before valid_tool_names recomputation.
  • Composes with existing --enable-toolset / --disable-toolset flags (toolset filter runs first; this further narrows what would otherwise be enabled).
  • Empty / unset env preserves current behavior (no filtering, all loaded tools attached).
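As a rough illustration of the filter described above, here is a minimal sketch. It assumes tool definitions are OpenAI-style dicts with a `function.name` field; `make_tool` and `narrow_tools` are hypothetical names for this example, not the code in agent/agent_init.py.

```python
import os
from typing import Optional


def make_tool(name: str) -> dict:
    """Build a stand-in tool definition (the OpenAI-style shape is an assumption)."""
    return {"type": "function", "function": {"name": name}}


def narrow_tools(tools: list, raw: Optional[str]) -> list:
    """Return only the tools named in a comma-separated allow-list.

    Unset, empty, or whitespace-only input leaves the list untouched.
    """
    allow = {part.strip() for part in (raw or "").split(",") if part.strip()}
    if not allow:
        return tools  # no narrowing requested: current behavior
    return [t for t in tools if t["function"]["name"] in allow]


if __name__ == "__main__":
    tools = [make_tool(n) for n in ("doc_search", "doc_write", "silo_query")]
    narrowed = narrow_tools(tools, os.environ.get("HERMES_TOOLS_SUBSET"))
    # Recompute valid names only after filtering, mirroring the stated ordering.
    valid_tool_names = {t["function"]["name"] for t in narrowed}
    print(sorted(valid_tool_names))
```

Running the script with `HERMES_TOOLS_SUBSET=doc_search,silo_query` set would keep two of the three stand-in tools; with the variable unset, all three survive.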

Operator usage

# Polynomial-explorer in observation mode (read-only):
HERMES_TOOLS_SUBSET=grafted_context_fetch,lane_h_list,lane_h_fetch,doc_search,silo_query hermes

# Polynomial-explorer in execution mode (read + confer + write):
HERMES_TOOLS_SUBSET=grafted_context_fetch,doc_search,doc_write,silo_query,confer_run,file_issue hermes

# Default profile dev work (no narrowing):
hermes  # all loaded tools attached as today

When narrowing is active and quiet_mode=False, one INFO line is logged at session start:

🎯 HERMES_TOOLS_SUBSET narrowed tool surface: 33 → 5 (confer_run, doc_search, doc_write, grafted_context_fetch, silo_query)
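For reference, a line of that shape could be produced along these lines. This is a sketch only: the sorted name order is inferred from the sample above, the `<empty>` marker placement is assumed from the test plan, and the logger name and function names are hypothetical.

```python
import logging

logger = logging.getLogger("hermes.tools_subset")  # logger name is hypothetical


def format_narrowing_line(before: int, kept_names: list) -> str:
    """Format the narrowing summary; sorted order is inferred from the sample line."""
    shown = ", ".join(sorted(kept_names)) if kept_names else "<empty>"
    return (
        "🎯 HERMES_TOOLS_SUBSET narrowed tool surface: "
        f"{before} → {len(kept_names)} ({shown})"
    )


def log_narrowing(before: int, kept_names: list, quiet_mode: bool) -> bool:
    """Emit the INFO line unless quiet; return True when a line was logged."""
    if quiet_mode:
        return False
    logger.info(format_narrowing_line(before, kept_names))
    return True
```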

Test plan

  • 12 tests in tests/agent/test_tools_subset.py cover: env unset / empty / whitespace-only → no filter, comma-separated narrowing, whitespace-in-names stripping, unknown-names silently ignored, zero-match → empty tools, quiet mode suppresses log, non-quiet mode logs narrowing + numbers + <empty> marker, degenerate (no tools to begin with) → noop, patch-landed-correctly smoke (asserts production code path contains the filter + the issue ref + the ordering invariant that filter runs before valid_tool_names recomputation).
  • All 12 pass.

Why now

Notes

  • Names not present in the registry are silently ignored. Plugins can add/remove tools at runtime; pre-validation against a stale registry would over-warn.
  • Filter is intentionally case-sensitive — tool names are exact strings; matching is whitespace-stripped per name only.
  • agent.valid_tool_names (computed immediately after the filter block) automatically reflects the narrowed set — no extra change needed for downstream validation.
  • Setting HERMES_TOOLS_SUBSET=xyz,abc where neither matches yields an empty agent.tools — intentional. Operator may want a no-tools session via subset (rare but valid).
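The notes above can be exercised with a small stand-in agent. This is a sketch under assumptions: `apply_subset` and `make_agent` are hypothetical helpers, and the real agent object carries far more state than the two attributes used here.

```python
from types import SimpleNamespace


def make_agent(*names):
    """Build a stand-in agent with OpenAI-style tool dicts (shape is an assumption)."""
    return SimpleNamespace(
        tools=[{"type": "function", "function": {"name": n}} for n in names],
        valid_tool_names=set(),
    )


def apply_subset(agent, allow):
    """Filter agent.tools by exact, case-sensitive name, then recompute valid names."""
    if allow is not None:
        agent.tools = [t for t in agent.tools if t["function"]["name"] in allow]
    # Computed after the filter, so downstream validation sees the narrowed set.
    agent.valid_tool_names = {t["function"]["name"] for t in agent.tools}
    return agent


if __name__ == "__main__":
    # "Doc_Search" differs in case and "not_registered" is unknown: both are ignored.
    agent = apply_subset(
        make_agent("doc_search", "doc_write"),
        {"Doc_Search", "doc_write", "not_registered"},
    )
    print(agent.valid_tool_names)
```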

Deploy

Hermes-side change → needs the same container rebuild as #68/#69 to land on devbox (G6/#66 deploy gap). Companion bash cliff-probe script (gist comment 6165358) helps operators tune per-model subset sizes once deployed.

PowerCreek merged commit ab5d81f into main May 24, 2026
PowerCreek deleted the issue-74-tools-subset branch May 24, 2026 06:24
PowerCreek added a commit that referenced this pull request May 24, 2026
…loses #86) (#87)

Post-#82/#83/#85 the autowire surfaces ~36 MCP tools from
hermes-internal, pushing fresh workers to 52 total tools and into the
tool-paralysis ceiling (text responses with affirmation pattern, no
tool_call emission for verticals that should be one-shot).

#75 (HERMES_TOOLS_SUBSET) was supposed to let operators narrow the
surface per worker, but it had three gaps for MCP tools:

1. Subset filtering only ran in agent_init.py:838 (built-in tool path)
2. cli.py:9790 (/reload-mcp + auto-reload on config change) re-assigns
   agent.tools without re-applying the filter — regression risk where
   any post-init MCP server reload nukes the subset
3. Even when filtering at agent.tools, the registry still carried the
   unwanted MCP tools, costing schema-conversion + collision-check
   work per tool per boot

Fix: apply the allow-list at MCP tool registration in
tools/mcp_tool.py::_register_server_tools. Excluded tools never enter
the registry, so both initial discovery AND /reload-mcp paths honor
the subset uniformly + the registry stays clean.

Three edits:

1. hermes_cli/tool_subset.py (new) — shared helpers `get_subset_allow()`
   + `is_tool_allowed()`. Single source of truth for env parsing +
   allow-list semantics so the two call sites (agent_init + mcp_tool)
   can't drift in casing/whitespace/empty-vs-missing handling.

2. tools/mcp_tool.py::_register_server_tools — read the subset once
   per server registration (O(1) check per tool, not O(N) env parse),
   apply inside both the per-tool loop AND the utility-tools loop
   (list_resources/read_resource/list_prompts/get_prompt).

3. agent/agent_init.py — refactored the inline #75 filter to use the
   shared helper. Behavior unchanged; dedupes parsing logic.
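A minimal sketch of what the two shared helpers might look like, based only on the semantics listed in this message. The function names come from the commit; the signatures, the `environ` injection parameter, and the frozenset return type are assumptions for illustration.

```python
import os
from typing import FrozenSet, Mapping, Optional

ENV_VAR = "HERMES_TOOLS_SUBSET"


def get_subset_allow(
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[FrozenSet[str]]:
    """Parse the env var into an allow-list, or None when no narrowing applies.

    The `environ` parameter is a hypothetical hook to make the sketch testable.
    """
    env = os.environ if environ is None else environ
    raw = env.get(ENV_VAR, "")
    names = frozenset(part.strip() for part in raw.split(",") if part.strip())
    return names or None  # unset / empty / whitespace-only all mean "no filter"


def is_tool_allowed(name: str, allow: Optional[FrozenSet[str]]) -> bool:
    """Exact, case-sensitive membership; a None allow-list passes everything."""
    return allow is None or name in allow
```

Keeping "no filter" as `None` rather than an empty set lets call sites distinguish "narrowing disabled" from "narrowed to nothing matching".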

Subset compares against the **prefixed** MCP name
(`mcp_<server>_<tool>`) — exact match, predictable behavior. Fuzzy /
unprefixed matching is a separate feature request.
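To make the prefixed-name contract concrete, here is a sketch of registration-time filtering. It is not the body of `_register_server_tools`: the dict registry, the `register_server_tools` signature, and the assumption that utility tools share the `mcp_<server>_<tool>` naming are all illustrative.

```python
from typing import Dict, FrozenSet, Iterable, List, Optional

UTILITY_TOOLS = ("list_resources", "read_resource", "list_prompts", "get_prompt")


def register_server_tools(
    server: str,
    tool_names: Iterable[str],
    registry: Dict[str, dict],
    allow: Optional[FrozenSet[str]],
) -> List[str]:
    """Register only tools whose prefixed name is in the allow-list."""
    registered: List[str] = []

    def maybe_register(tool: str) -> None:
        prefixed = f"mcp_{server}_{tool}"
        if allow is not None and prefixed not in allow:
            return  # excluded tools never enter the registry
        registry[prefixed] = {"server": server, "tool": tool}
        registered.append(prefixed)

    for tool in tool_names:  # per-tool loop
        maybe_register(tool)
    for tool in UTILITY_TOOLS:  # utility-tools loop gets the same check
        maybe_register(tool)
    return registered
```

With an allow-list containing only the unprefixed `grafted_context_fetch`, nothing from the server is registered, which is the contract the tests below pin down.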

Tests (tests/test_mcp_subset_filter.py — 14 cases, all pass):
- get_subset_allow: unset/empty/whitespace-only/single/multi/padded/
  MCP-prefixed parse cases
- is_tool_allowed: None passes through, exact match, exclusion, no
  substring matching, MCP prefixed vs unprefixed (contract test)
- Source-level: _register_server_tools imports the shared helper +
  calls it inside main loop
- Source-level: same call inside the utility-tools loop (regression
  catcher for "fix the main loop, forget utility tools")
- Source-level: agent_init.py uses shared helper, no stale inline
  parser

37 total green (14 new + 20 from #83 + 3 from #85).

Final layer of the post-#67 cascade. After this lands + container
rebuild, workers can be configured with HERMES_TOOLS_SUBSET to a 5-tool
surface that includes their specific MCP needs (e.g.
"doc_view,mcp_hermes-internal_grafted_context_fetch,..."), out of
tool-paralysis range.
PowerCreek added a commit that referenced this pull request May 25, 2026
#96)

Operators who narrow the tool surface via HERMES_TOOLS_SUBSET can
now confirm at ``hermes doctor`` time exactly which tools the
filter parsed to. Catches two failure modes that previously
required a separate ``hermes mcp list`` diff:

1. Operator typoed a tool name → still in the parsed list (no
   cross-check), but the diff against ``hermes mcp list`` is now
   trivial.
2. Operator forgot the ``mcp_<server>_<tool>`` prefix for MCP
   tools → no entry uses ``mcp_`` prefix but entries look
   structured → info reminder fires.

Silent when env var is unset/empty (silent-when-irrelevant pattern
from the #88/#53/#54 doctor probes). When set, surfaces:

  * check_ok with the count + a sample of names (first 6, then
    ``+N more`` suffix to keep the row readable);
  * check_info reminder when zero entries use the mcp_ prefix but
    some look structured (the most common parse-correctly-but-
    filter-nothing failure mode).
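The two outputs described above could be computed roughly as follows. This is a sketch with loud assumptions: the summary string layout is invented for the example, and the "looks structured" heuristic (a hyphen or two or more underscores) is a guess, since the commit message does not define it. The real probe reports through check_ok / check_info, which are not called here.

```python
from typing import List


def summarize_subset(names: List[str], sample_size: int = 6) -> str:
    """Return a count plus the first few names, with a '+N more' suffix if truncated."""
    sample = ", ".join(names[:sample_size])
    extra = len(names) - sample_size
    if extra > 0:
        sample += f", +{extra} more"
    return f"{len(names)} tool(s): {sample}"


def needs_mcp_prefix_reminder(names: List[str]) -> bool:
    """True when no entry uses the mcp_ prefix but some entry looks structured.

    'Structured' here means a hyphen or at least two underscores (hypothetical).
    """
    has_mcp = any(n.startswith("mcp_") for n in names)
    looks_structured = any("-" in n or n.count("_") >= 2 for n in names)
    return not has_mcp and looks_structured
```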

Cross-check against the live MCP registry was considered + rejected
for this PR — it would require spinning up ``create_mcp_server()``
at probe time. Operators can ``hermes mcp list`` separately if they
want the full diff. Filing as opportunistic follow-up if demand
shows up.

## Tests

- 8 new tests in tests/hermes_cli/test_doctor_tools_subset_probe.py:
  silent-when-unset / silent-when-empty / silent-when-whitespace /
  count-and-sample-shown / long-list-truncated / mcp-prefix-reminder
  / no-reminder-when-mcp-present / no-reminder-when-only-simple-bare-
  names.
- 30 total green across affected suites (probe + provider-env-probe
  + mcp_subset_filter). No regression.
PowerCreek added a commit that referenced this pull request May 28, 2026
…143) (#145)

T3 of the #143 thin-client refactor scope. When the active provider
is ``devagentic-local``, augment ``disabled_toolsets`` with
``"clarify"`` before the ``get_tool_definitions`` call.

## Rationale

Per devagentic#203 §1.3 + the #143 scope: devagentic-side intent
classifier knows when clarification is actually needed and can
surface it via an OpenAI-shaped assistant message. Hermes' modal
TUI clarify-tool was a layered opinion fighting devagentic-side
classification: both the rotation's debug evidence and the sandbox UX
showed the dual source to be confusing (the clarify modal popped even
with --yolo, ignoring devagentic-side intent signals).

This is the smallest of the T1-T3 sequence and the cleanest
revert path — purely a tool-registry adjustment for one provider.

## Behavior

| Setting | Before | After |
|---|---|---|
| provider=devagentic-local, no --enable-toolset | clarify enabled | clarify implicitly disabled |
| provider=devagentic-local, --enable-toolset clarify | clarify enabled | clarify enabled (explicit override) |
| provider=other | unchanged | unchanged |

A boot-line print informs operators of the implicit disable + how
to re-enable for legacy workflows. Composes naturally with the
existing ``HERMES_TOOLS_SUBSET`` narrowing (#75/#87) — disable
happens first, then subset narrows further if set.
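The behavior table can be read as a small pure function. The sketch below is illustrative only: `resolve_disabled_toolsets` is a hypothetical name, and passing the toolsets as plain lists is an assumption about how the flags reach the init code.

```python
from typing import List

IMPLICIT_PROVIDER = "devagentic-local"


def resolve_disabled_toolsets(
    provider: str,
    enabled_toolsets: List[str],
    disabled_toolsets: List[str],
) -> List[str]:
    """Add "clarify" to the disabled list for devagentic-local unless explicitly enabled."""
    disabled = list(disabled_toolsets)  # do not mutate the caller's list
    if (
        provider == IMPLICIT_PROVIDER  # strict equality: no prefix or alias matching
        and "clarify" not in enabled_toolsets  # explicit enable overrides the default
        and "clarify" not in disabled
    ):
        disabled.append("clarify")
    return disabled
```

The result would then feed the `get_tool_definitions` call, with any `HERMES_TOOLS_SUBSET` narrowing applied afterwards.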

## Tests

- 4 source-level tests in
  ``tests/agent/test_t3_clarify_default_out.py``: patch-landed,
  explicit-enable-overrides-implicit-disable, re-enable hint
  visible in print message, strict-equality on provider name
  (no prefix/alias matching to avoid surprise on related
  providers).
- 21 total green across affected suites (T3 + diag-env-gate).

## Composition

Per #143 sequencing (T3 → T1 → T2-gated → T2-default-flip):
- This PR: T3 (clarify default-out)
- Next: T1 (HERMES_DEFER_PERSONA default-flip for devagentic-local)
- Then: T2 (empty-content recovery removal, env-gated then default)
- Later: T4-T6 (tool list / iteration cap / summary fallback)

## Preserved through the refactor

- PR #119 (cascade_exhausted short-circuit) — hermes correctly
  deferring to devagentic; NOT recovery
- PR #122/#125 (raw tool_calls fallback) — pre-recovery wire
  parsing; belt-and-suspenders against future streaming-chunker
  regressions
- PR #131/#136/#138/#141 diagnostics — env-gated via
  HERMES_DIAG_RAW_CAPTURE; no-op when off
