Skip to content

fix(tools): coerce stringified booleans/integers in MCP tool args#37101

Open
davidgut1982 wants to merge 14 commits into
NousResearch:mainfrom
davidgut1982:feat/mcp-boolean-coercion
Open

fix(tools): coerce stringified booleans/integers in MCP tool args#37101
davidgut1982 wants to merge 14 commits into
NousResearch:mainfrom
davidgut1982:feat/mcp-boolean-coercion

Conversation

@davidgut1982

@davidgut1982 davidgut1982 commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

What

Adds _coerce_args_to_schema() function that schema-walks a tool call's arguments and coerces stringified booleans ("true"/"false" → Python bool) and numeric strings ("0", "1"int/float) before dispatch. Fixes MCP servers (e.g. sequentialthinking) that reject string values where their JSON Schema declares "type": "boolean" or "type": "integer". No behavior change for arguments that already have correct types.

Files modified: tools/mcp_tool.py, tools/delegate_tool.py.

Why

MCP transport sends JSON over the wire, but some models emit stringified booleans and integers instead of native JSON types. When the schema says a parameter must be boolean, a string "true" causes the tool to reject the call. This coercion layer bridges the gap transparently.

Tests

scripts/run_tests.sh tests/tools/

All existing tests pass; coercion is transparent for already-correct types.

Platforms tested

Linux (CT/LXC environment, Python 3.13)

davidgut1982 and others added 14 commits May 30, 2026 17:05
…isclosure

Adds an optional, opt-in embedding reranker to the tool_search BM25 bridge
(PR NousResearch#34493). Default OFF — when disabled the BM25 path is byte-for-byte
identical to upstream. urllib-only (no new deps), task-prefixed, md5-cached
tool embeddings, full-catalog retrieve, rerank/RRF(k=10) modes, graceful
BM25 fallback on any endpoint failure. Backend is any OpenAI-compatible
/v1/embeddings endpoint (cloud, local CPU, or GPU).

Live-validated (194 tools / 98 labeled queries, nomic-embed-text-v2-moe):
overall Recall@5 0.617 -> 0.810, SEMANTIC 0.500 -> 0.849, LEXICAL preserved
at 1.000; warm per-query ~146ms, dead-endpoint fallback ~8ms.

Fulfills NousResearch#13332.
When delegate_task is called with a named agent_profile, MCP toolsets
declared in the profile now resolve from global mcp_servers config
rather than being filtered against the parent agent's loaded tools.

Previously, if the orchestrator restricted its own MCP context (via
no_mcp or simply not loading domain servers), child agents spawned with
profile toolsets like ["mcp-fastmail"] received empty tool lists
silently. The intersection logic in _build_child_agent() treated
parent-loaded tools as the upper bound for all children.

This fix adds a profile_name parameter to _build_child_agent(). When
set, _is_mcp_toolset_name() gates MCP toolsets through unconditionally;
non-MCP toolsets still require parent membership (security boundary
preserved). delegate_task() passes the resolved per-task profile name
through to _build_child_agent() at every call site.

Fixes the child-tool-loss failure mode described in issue NousResearch#32668.
Three regression tests added to test_delegate_toolset_scope.py.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…string

Add AGENTS.md note to the delegation section explaining that named
agent_profile toolsets bypass the parent intersection for MCP servers
(fix introduced in NousResearch#32668). Expand the _build_child_agent() docstring
to describe the profile_name parameter's semantics and the rationale
for the security-boundary split.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When the active platform includes the no_mcp sentinel in its toolsets,
skip eager MCP server discovery at gateway/CLI startup. Discovery is
deferred until the first delegate_task() call that targets an MCP toolset,
using a thread-safe one-shot Event/Lock pattern.

This eliminates unnecessary MCP connection overhead for api_server platform
(orchestrator) while preserving full MCP access for child agents via the
Phase 1 profile_name bypass. cli/cron/telegram platforms are unaffected:
their toolsets lack no_mcp, so the gate evaluates False and eager discovery
runs exactly as before.

Changes:
- tools/mcp_tool.py: add mark_eager_discovery_skipped() + ensure_mcp_discovered()
  (one-shot, thread-safe, failure-tolerant lazy discovery trigger)
- gateway/run.py: add _active_platform_uses_no_mcp() helper; gate the eager
  discover_mcp_tools() call in start_gateway() on the platform no_mcp check
- hermes_cli/main.py: gate the inline CLI-startup discover_mcp_tools() call in
  _prepare_agent_startup() with the same no_mcp check (covers `gateway run`)
- tools/delegate_tool.py: call ensure_mcp_discovered() before building any
  child agent that requests MCP toolsets
- tests/tools/test_mcp_lazy_discovery.py: 12 tests covering the skip flag,
  no-op/once/idempotent/thread-safe/failure paths, and platform resolution

Part of fix/profile-mcp-toolset-bypass branch (stacked on Phase 1).
Resolves: NousResearch#32668

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pre-declare _cfg as Dict[str, Any] = {} so it is always bound and
Pyright knows the type is dict throughout the config bridge block.

Use an intermediate _expanded variable after _expand_env_vars() and
guard with isinstance(_, dict) so the re-assignment stays within the
declared dict type — _expand_env_vars has no return annotation and
Pyright infers a broad str | list | dict union.

Simplify the IPv4 network_cfg line: now that _cfg is always bound and
typed as dict, the old ('_cfg' in dir() else {}) guard is unnecessary.

Fixes Pyright errors at lines 863, 901, 917, 926, 930, 936, 978 that
were introduced by the Phase 2 lazy MCP discovery work.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…set_scope.py

Cherry-pick of 1342418 onto current main silently dropped the
`from unittest.mock import MagicMock, patch` line because both the
cherry-pick source and the HEAD file lacked it relative to the diff's
three-way base. The TestProfileMcpToolsetBypass class uses @patch and
MagicMock, so without this import collection fails with NameError.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ding churn (NousResearch#13332)

Replace the single-slot module-level reranker singleton (_reranker /
_reranker_catalog_key) with a bounded scope-keyed dict (_reranker_cache,
max 8 entries, FIFO eviction). Each distinct toolset-scope (keyed by
md5(endpoint + model + tool_names)) now retains its own EmbeddingReranker
instance and its own per-tool embedding cache independently of concurrent
agents operating on different toolsets.

Old behaviour: agent A (toolset X) and agent B (toolset Y) racing through
_get_reranker() caused the second call to rebuild the singleton and discard
the first agent's cached embeddings, forcing repeated endpoint calls.

New behaviour: both scopes coexist in the dict; re-requesting scope A after
scope B is created returns the original scope-A instance with its embedding
cache intact. Thread-safety is preserved via double-checked locking on the
dict + order list, guarded by the existing _GLOBAL_LOCK.

New tests (TestEmbedCacheInvalidation):
- test_concurrent_scopes_do_not_share_reranker: proves scope B creation does
  NOT evict scope A's instance or its embedding cache (mocks _embed and
  asserts zero extra calls on scope-A re-access after scope B is created).
- test_reranker_cache_evicts_oldest_scope_when_full: fills cache to
  _RERANKER_CACHE_MAX_SIZE (8), adds an overflow scope, and asserts FIFO
  eviction dropped the oldest key from _reranker_cache.

Existing tests updated to reset _reranker_cache / _reranker_cache_order
instead of the retired _reranker / _reranker_catalog_key globals.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…G reaches the log

_read_logging_config now returns the loggers dict (4th element).
setup_logging iterates logging.loggers.<name> entries after attaching
handlers and calls logging.getLogger(name).setLevel() for each valid
entry.  Both bare-string ("DEBUG") and dict ({level: "DEBUG"}) shapes
are accepted; invalid levels are silently skipped.

When a per-logger override is finer-grained than the configured
top-level (e.g. DEBUG while agent.log runs at INFO), the root logger
and the agent.log handler are also lowered to the minimum per-logger
level so that propagated records actually reach the file.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… toolsets bypass no_mcp parent intersection (NousResearch#32668/NousResearch#32727)

Under a no_mcp orchestrator (platform_toolsets: [no_mcp, delegation,
knowledge]), expanded_parent has zero MCP entries. The previous call site
hardcoded profile_name=None into _build_child_agent, so the MCP-bypass
branch (lines 972-976 of delegate_tool.py) never activated and fat
sub-agents received an empty toolset.

Fix:
- Add _load_agent_profiles() helper to read agent_profiles from the
  top-level config (not the delegation sub-key that _load_config() returns).
- Add profile: Optional[str] param to delegate_task(); when set, resolve
  the named profile's toolsets from agent_profiles and store as the
  effective toolsets for the child.
- Change the _build_child_agent call site from profile_name=None to
  profile_name=resolved_profile_name so the MCP-bypass branch activates
  for named profiles.
- Add "profile" to DELEGATE_TASK_SCHEMA so the model reliably emits the
  field; without a formal schema entry the LLM strips it at the provider
  API boundary.
- Update registry lambda and _dispatch_delegate_task in run_agent.py to
  forward profile= through both invocation paths.

Security: the bypass is scoped to MCP toolsets only, and only when a
named profile explicitly declares them in config. Non-MCP toolsets still
go through the parent intersection. Unknown profile names fall back
gracefully (warning + no bypass).

Tests added (test_delegate_toolset_scope.py):
- TestDelegateTaskProfileWiring: verifies delegate_task() calls
  _build_child_agent with profile_name='documents' and the profile's
  resolved toolsets. Key regression guard: FAILS against pre-fix code
  (profile_name=None hardcoded) and PASSES after the fix.
- TestDelegateTaskSchemaProfile: asserts 'profile' is a declared string
  property in DELEGATE_TASK_SCHEMA and is not in required[].
- TestProfileMcpBypassEndToEnd: direct _build_child_agent tests covering
  the post-fix bypass (profile_name set → mcp-nextcloud-files retained)
  and the security baseline (profile_name=None → MCP stripped).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…toolset injection (security)

Two privilege-escalation vectors closed:

1. Batch-mode injection (primary fix): the batch loop previously used
   `t.get("toolsets") or toolsets` for each task, so a model could name
   a valid profile (activating the mcp-* bypass in _build_child_agent)
   while supplying per-task toolsets that the profile never declared.
   Fix: when resolved_profile_name is set, all tasks in the batch receive
   profile_resolved_toolsets exclusively — per-task model-supplied toolsets
   are ignored.

2. Empty-profile bypass (secondary fix): a profile with no/empty toolsets
   still set resolved_profile_name, activating the mcp-* bypass with
   whatever caller-supplied toolsets were present. Fix: resolved_profile_name
   is only set when the profile declares a non-empty toolsets list; otherwise
   a warning is logged and the normal intersection path is used.

Tests added (tests/tools/test_delegate_toolset_scope.py):
- test_batch_injection_blocked_model_cannot_inject_evil_mcp_toolset: FAILS
  pre-fix (mcp-injected-evil present), PASSES post-fix
- test_single_task_profile_toolsets_unchanged: regression guard, PASSES both
- test_empty_profile_toolsets_bypass_not_activated: FAILS pre-fix
  (profile_name set), PASSES post-fix

Total: 16 → 19 tests, all green. tool_search: 63 pass, no collateral.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…l security hardening)

Assign list(profile_toolsets) instead of the bare reference at line ~2082
so in-place mutations of the working `toolsets` variable cannot corrupt
agent_profiles config for the process lifetime.  Also take an independent
list(toolsets) copy for profile_resolved_toolsets at line ~2110 (defence-
in-depth against future code inserted between the two assignments).

Adds TestProfileToolsetsAliasing::test_profile_toolsets_copy_prevents_config_corruption:
resolves a profile, mutates the child toolsets list returned to _build_child_agent,
and asserts the original profiles["documents"]["toolsets"] config list is unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rent MCP bleed (security hardening)

When a named agent_profile is used, the profile's declared toolsets are
authoritative. Previously _preserve_parent_mcp_toolsets ran unconditionally
when inherit_mcp_toolsets=True, silently appending every parent MCP toolset
absent from the child's list — including ones the profile never declared.

Guard the bleed path with `and not profile_name` so inheritance is skipped
whenever delegation resolves through a profile. Non-profile (ad-hoc)
delegation continues to inherit parent MCP toolsets unchanged.

The variable `profile_name` is already in scope at the call site (function
parameter, also used at line 972 for the existing MCP bypass guard).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds _coerce_args_to_schema() which walks the JSON Schema for a tool
call and converts "true"/"false" strings to booleans and numeric strings
to integers/numbers before dispatch. Fixes sequentialthinking and other
MCP tools that receive string values where the schema expects bool/int.
@alt-glitch alt-glitch added type/bug Something isn't working P3 Low — cosmetic, nice to have comp/tools Tool registry, model_tools, toolsets tool/mcp MCP client and OAuth labels Jun 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/tools Tool registry, model_tools, toolsets P3 Low — cosmetic, nice to have tool/mcp MCP client and OAuth type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants