Skip to content

feat(tools): MCP profile bypass, lazy discovery, and delegate security hardening#44279

Open
davidgut1982 wants to merge 9 commits into
NousResearch:mainfrom
davidgut1982:split/mcp-profile-lazy-delegate
Open

feat(tools): MCP profile bypass, lazy discovery, and delegate security hardening#44279
davidgut1982 wants to merge 9 commits into
NousResearch:mainfrom
davidgut1982:split/mcp-profile-lazy-delegate

Conversation

@davidgut1982

Copy link
Copy Markdown
Contributor

What does this PR do?

This PR delivers three tightly-related improvements to how delegated subagents and no_mcp platforms handle MCP toolsets, plus the security hardening that makes the new bypass safe:

  1. Profile MCP bypass — When a parent delegates with a named agent_profile, the MCP toolsets declared by that profile bypass the parent's enabled_toolsets intersection. This fixes the case (feat: per-agent toolset restriction for orchestrator pattern (parent-scoped no_mcp) #32668 / fix(delegate): lazy MCP discovery + profile toolset bypass for no_mcp platforms #32727) where an orchestrator that restricts its own MCP servers could never hand a domain MCP toolset (e.g. mcp-fastmail, mcp-knowledge) to a child. Non-MCP toolsets the parent lacks are still dropped — the privilege-escalation boundary is preserved.

  2. Lazy MCP discovery for no_mcp platforms (Phase 2) — Platforms that start with no MCP context can discover and attach MCP toolsets on demand instead of eagerly loading every server at startup.

  3. Delegate security hardening — A series of fixes ensuring profile toolsets are authoritative and cannot be tampered with: batch-mode toolset injection is blocked, profile toolsets are deep-copied to prevent config aliasing/corruption across calls, and profile toolsets override inherit_mcp_toolsets so no parent MCP context bleeds into a child.

The approach is correct because the bypass is scoped to MCP toolsets resolved from a trusted, named profile — the LLM cannot request an arbitrary toolset and have it bypass intersection. Every non-profile and non-MCP path keeps the original intersection security boundary.

Note on scope: The related stringified-bool/int coercion fix for MCP tool args (coerce stringified booleans/integers in MCP tool args) is deliberately excluded from this PR — it is tracked separately in #37101.

Related Issue

Fixes #32668
Relates to #32727

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • tools/delegate_tool.py — profile MCP toolsets bypass parent intersection; profile toolsets are authoritative (block batch-mode injection); deep-copy profile toolsets to prevent config aliasing; profile toolsets override inherit_mcp_toolsets.
  • run_agent.py — wire profile_name through the delegate call chain so the bypass is applied at _build_child_agent.
  • tools/mcp_tool.py — lazy MCP discovery support for no_mcp platforms.
  • gateway/run.py, hermes_cli/main.py — lazy MCP discovery wiring (Phase 2) + Pyright type-error fixes in the _cfg config bridge block.
  • AGENTS.md — document profile_name MCP bypass behavior.
  • tests/tools/test_delegate_toolset_scope.py — regression coverage for the profile bypass and security hardening (intersection still enforced without a profile; non-MCP toolsets still dropped; config aliasing protected).
  • tests/tools/test_mcp_lazy_discovery.py — coverage for lazy MCP discovery.

How to Test

  1. From a checkout of this branch, run the scoped suites:
    pytest tests/tools/test_delegate_toolset_scope.py tests/tools/test_mcp_lazy_discovery.py -q
    → expect 34 passed, 0 failures (verified locally on Ubuntu, Python 3.13).
  2. Verify the security boundary: test_no_profile_mcp_toolsets_still_intersected proves an MCP toolset request without a profile is dropped; test_profile_non_mcp_toolsets_still_intersected proves a non-MCP toolset the parent lacks is still dropped even with a profile.
  3. Verify the bypass: test_profile_mcp_toolsets_bypass_parent_intersection proves profile-declared MCP toolsets pass through to a no_mcp parent's child.

Checklist

Code

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — AGENTS.md + docstrings updated
  • I've updated cli-config.yaml.example if I added/changed config keys — N/A (no new config keys)
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — AGENTS.md updated
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — no platform-specific APIs used
  • I've updated tool descriptions/schemas if I changed tool behavior — delegate docstring updated

Screenshots / Logs

$ pytest tests/tools/test_delegate_toolset_scope.py tests/tools/test_mcp_lazy_discovery.py -q
..................................                                       [100%]
34 passed in 5.95s

davidgut1982 and others added 9 commits June 11, 2026 12:52
When delegate_task is called with a named agent_profile, MCP toolsets
declared in the profile now resolve from global mcp_servers config
rather than being filtered against the parent agent's loaded tools.

Previously, if the orchestrator restricted its own MCP context (via
no_mcp or simply not loading domain servers), child agents spawned with
profile toolsets like ["mcp-fastmail"] received empty tool lists
silently. The intersection logic in _build_child_agent() treated
parent-loaded tools as the upper bound for all children.

This fix adds a profile_name parameter to _build_child_agent(). When
set, _is_mcp_toolset_name() gates MCP toolsets through unconditionally;
non-MCP toolsets still require parent membership (security boundary
preserved). delegate_task() passes the resolved per-task profile name
through to _build_child_agent() at every call site.

Fixes the child-tool-loss failure mode described in issue NousResearch#32668.
Three regression tests added to test_delegate_toolset_scope.py.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…string

Add AGENTS.md note to the delegation section explaining that named
agent_profile toolsets bypass the parent intersection for MCP servers
(fix introduced in NousResearch#32668). Expand the _build_child_agent() docstring
to describe the profile_name parameter's semantics and the rationale
for the security-boundary split.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… toolsets bypass no_mcp parent intersection (NousResearch#32668/NousResearch#32727)

Under a no_mcp orchestrator (platform_toolsets: [no_mcp, delegation,
knowledge]), expanded_parent has zero MCP entries. The previous call site
hardcoded profile_name=None into _build_child_agent, so the MCP-bypass
branch (lines 972-976 of delegate_tool.py) never activated and fat
sub-agents received an empty toolset.

Fix:
- Add _load_agent_profiles() helper to read agent_profiles from the
  top-level config (not the delegation sub-key that _load_config() returns).
- Add profile: Optional[str] param to delegate_task(); when set, resolve
  the named profile's toolsets from agent_profiles and store as the
  effective toolsets for the child.
- Change the _build_child_agent call site from profile_name=None to
  profile_name=resolved_profile_name so the MCP-bypass branch activates
  for named profiles.
- Add "profile" to DELEGATE_TASK_SCHEMA so the model reliably emits the
  field; without a formal schema entry the LLM strips it at the provider
  API boundary.
- Update registry lambda and _dispatch_delegate_task in run_agent.py to
  forward profile= through both invocation paths.

Security: the bypass is scoped to MCP toolsets only, and only when a
named profile explicitly declares them in config. Non-MCP toolsets still
go through the parent intersection. Unknown profile names fall back
gracefully (warning + no bypass).

Tests added (test_delegate_toolset_scope.py):
- TestDelegateTaskProfileWiring: verifies delegate_task() calls
  _build_child_agent with profile_name='documents' and the profile's
  resolved toolsets. Key regression guard: FAILS against pre-fix code
  (profile_name=None hardcoded) and PASSES after the fix.
- TestDelegateTaskSchemaProfile: asserts 'profile' is a declared string
  property in DELEGATE_TASK_SCHEMA and is not in required[].
- TestProfileMcpBypassEndToEnd: direct _build_child_agent tests covering
  the post-fix bypass (profile_name set → mcp-nextcloud-files retained)
  and the security baseline (profile_name=None → MCP stripped).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…toolset injection (security)

Two privilege-escalation vectors closed:

1. Batch-mode injection (primary fix): the batch loop previously used
   `t.get("toolsets") or toolsets` for each task, so a model could name
   a valid profile (activating the mcp-* bypass in _build_child_agent)
   while supplying per-task toolsets that the profile never declared.
   Fix: when resolved_profile_name is set, all tasks in the batch receive
   profile_resolved_toolsets exclusively — per-task model-supplied toolsets
   are ignored.

2. Empty-profile bypass (secondary fix): a profile with no/empty toolsets
   still set resolved_profile_name, activating the mcp-* bypass with
   whatever caller-supplied toolsets were present. Fix: resolved_profile_name
   is only set when the profile declares a non-empty toolsets list; otherwise
   a warning is logged and the normal intersection path is used.

Tests added (tests/tools/test_delegate_toolset_scope.py):
- test_batch_injection_blocked_model_cannot_inject_evil_mcp_toolset: FAILS
  pre-fix (mcp-injected-evil present), PASSES post-fix
- test_single_task_profile_toolsets_unchanged: regression guard, PASSES both
- test_empty_profile_toolsets_bypass_not_activated: FAILS pre-fix
  (profile_name set), PASSES post-fix

Total: 16 → 19 tests, all green. tool_search: 63 pass, no collateral.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…l security hardening)

Assign list(profile_toolsets) instead of the bare reference at line ~2082
so in-place mutations of the working `toolsets` variable cannot corrupt
agent_profiles config for the process lifetime.  Also take an independent
list(toolsets) copy for profile_resolved_toolsets at line ~2110 (defence-
in-depth against future code inserted between the two assignments).

Adds TestProfileToolsetsAliasing::test_profile_toolsets_copy_prevents_config_corruption:
resolves a profile, mutates the child toolsets list returned to _build_child_agent,
and asserts the original profiles["documents"]["toolsets"] config list is unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rent MCP bleed (security hardening)

When a named agent_profile is used, the profile's declared toolsets are
authoritative. Previously _preserve_parent_mcp_toolsets ran unconditionally
when inherit_mcp_toolsets=True, silently appending every parent MCP toolset
absent from the child's list — including ones the profile never declared.

Guard the bleed path with `and not profile_name` so inheritance is skipped
whenever delegation resolves through a profile. Non-profile (ad-hoc)
delegation continues to inherit parent MCP toolsets unchanged.

The variable `profile_name` is already in scope at the call site (function
parameter, also used at line 972 for the existing MCP bypass guard).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When the active platform includes the no_mcp sentinel in its toolsets,
skip eager MCP server discovery at gateway/CLI startup. Discovery is
deferred until the first delegate_task() call that targets an MCP toolset,
using a thread-safe one-shot Event/Lock pattern.

This eliminates unnecessary MCP connection overhead for api_server platform
(orchestrator) while preserving full MCP access for child agents via the
Phase 1 profile_name bypass. cli/cron/telegram platforms are unaffected:
their toolsets lack no_mcp, so the gate evaluates False and eager discovery
runs exactly as before.

Changes:
- tools/mcp_tool.py: add mark_eager_discovery_skipped() + ensure_mcp_discovered()
  (one-shot, thread-safe, failure-tolerant lazy discovery trigger)
- gateway/run.py: add _active_platform_uses_no_mcp() helper; gate the eager
  discover_mcp_tools() call in start_gateway() on the platform no_mcp check
- hermes_cli/main.py: gate the inline CLI-startup discover_mcp_tools() call in
  _prepare_agent_startup() with the same no_mcp check (covers `gateway run`)
- tools/delegate_tool.py: call ensure_mcp_discovered() before building any
  child agent that requests MCP toolsets
- tests/tools/test_mcp_lazy_discovery.py: 12 tests covering the skip flag,
  no-op/once/idempotent/thread-safe/failure paths, and platform resolution

Part of fix/profile-mcp-toolset-bypass branch (stacked on Phase 1).
Resolves: NousResearch#32668

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pre-declare _cfg as Dict[str, Any] = {} so it is always bound and
Pyright knows the type is dict throughout the config bridge block.

Use an intermediate _expanded variable after _expand_env_vars() and
guard with isinstance(_, dict) so the re-assignment stays within the
declared dict type — _expand_env_vars has no return annotation and
Pyright infers a broad str | list | dict union.

Simplify the IPv4 network_cfg line: now that _cfg is always bound and
typed as dict, the old ('_cfg' in dir() else {}) guard is unnecessary.

Fixes Pyright errors at lines 863, 901, 917, 926, 930, 936, 978 that
were introduced by the Phase 2 lazy MCP discovery work.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…set_scope.py

Cherry-pick of 1342418 onto current main silently dropped the
`from unittest.mock import MagicMock, patch` line because both the
cherry-pick source and the HEAD file lacked it relative to the diff's
three-way base. The TestProfileMcpToolsetBypass class uses @patch and
MagicMock, so without this import collection fails with NameError.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@liuhao1024

Copy link
Copy Markdown
Contributor

Verification: security review — clean ✅

Reviewed the full diff (8 files, +1346/-8). Checked:

  1. MCP toolset bypass scope — profile-named MCP toolsets bypass parent intersection; non-MCP toolsets still intersected. Correct guard: if profile_name: [... if _is_mcp_toolset_name(t) or t in expanded_parent] vs else path.
  2. Privilege escalation prevention — profiles with empty/missing toolsets refuse to activate the bypass (fall back to intersection). Profile-resolved toolsets are enforced as authoritative for the entire batch, blocking model-supplied per-task toolset injection.
  3. Lazy discovery thread safety_lazy_discovery_done (threading.Event) + _lazy_discovery_lock (threading.Lock) ensure one-shot discovery. ensure_mcp_discovered() called unconditionally for any MCP-needing child build.
  4. _expand_env_vars type guard_expanded if isinstance(_expanded, dict) else {} prevents crash when config.yaml returns non-dict (e.g. bare string).
  5. _cfg scope fix'_cfg' in dir() replaced with module-level _cfg: Dict[str, Any] = {} initialization; eliminates the fragile scope check.

No issues found. Test coverage is comprehensive (856-line test file for profile bypass + 185-line test file for lazy discovery).

@alt-glitch alt-glitch added type/feature New feature or request P3 Low — cosmetic, nice to have tool/mcp MCP client and OAuth tool/delegate Subagent delegation labels Jun 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

P3 Low — cosmetic, nice to have tool/delegate Subagent delegation tool/mcp MCP client and OAuth type/feature New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: per-agent toolset restriction for orchestrator pattern (parent-scoped no_mcp)

3 participants