feat: Orchestrator Subagents #3
… params Both delegate_task call sites in run_agent.py hardcoded a subset of the schema params (goal, context, toolsets, tasks, max_iterations) and silently dropped acp_command and acp_args. Any future schema additions would hit the same drift. Replace both call sites with _dispatch_delegate_task() which forwards the entire validated schema from function_args. Also threads the conversation messages reference through for future context inheritance plumbing (M1).
Replace informal progress event strings (_thinking, tool.started, etc.) with a DelegateEvent enum. The callback normalises incoming legacy strings through _LEGACY_EVENT_MAP. External consumers (gateway SSE, ACP adapter, CLI) continue to receive legacy string names during the deprecation window — no consumer changes required.
Raise _DEFAULT_MAX_CONCURRENT_CHILDREN from 3 to 5. Add an absolute cap of 8 (aligned with OpenClaw's DEFAULT_SUBAGENT_MAX_CONCURRENT) — values above 8 from config or env are clamped with a warning log. Update schema description, cli-config.yaml.example, and delegation docs to reflect the new defaults.
- TestDispatchDelegateTask: verifies acp_command/acp_args forwarding and that _dispatch_delegate_task threads messages through
- TestDelegateEventEnum: enum values, legacy map coverage, normalisation in the progress callback, unknown event rejection
- TestConcurrencyDefaults: default=5, cap at 8, warning log on clamp, env var cap, within-range passthrough
- Fix pre-existing test bug: test_task_index_prefix_in_batch_mode was passing tool_name as event_type (wrong signature)
- Update test_constants assertion from 3 to 5
Extract duplicated cap-check logic from _get_max_concurrent_children into _clamp_concurrency helper. Fix stale "default 3" comment in the schema. Simplify test_acp_args_forwarded assertion.
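The clamp described in these commits (default 5, absolute cap 8, warning on clamp) can be sketched as follows. This is a minimal illustration, not the code in tools/delegate_tool.py: the function name, constant names, and log wording are hypothetical stand-ins.

```python
import logging

logger = logging.getLogger("delegate_sketch")

# Values taken from the commit message; the names are illustrative.
DEFAULT_MAX_CONCURRENT_CHILDREN = 5
MAX_CONCURRENT_CHILDREN_CAP = 8


def clamp_concurrency(value, source="config"):
    """Return value limited to the absolute cap, warning when it is clamped."""
    if value > MAX_CONCURRENT_CHILDREN_CAP:
        logger.warning(
            "%s requested max_concurrent_children=%d; clamping to %d",
            source, value, MAX_CONCURRENT_CHILDREN_CAP,
        )
        return MAX_CONCURRENT_CHILDREN_CAP
    return value
```

Sharing one helper between the config path and the env-var path keeps the cap and the warning text in a single place, which is the point of the extraction.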
… enum members Add test_progress_callback_normalises_thinking (both _thinking and reasoning.available), test_progress_callback_tool_completed_is_noop. Document that TASK_SPAWNED/COMPLETED/FAILED are reserved for M3. Rename test_progress_callback_normalises_legacy_events for clarity.
Grep found 4 more places still saying "up to 3" — schema description, tips.py, delegation-patterns.md, overview.md. Updated to match new default of 5 (max 8).
…tring The `messages` kwarg was threaded through `_dispatch_delegate_task` into `delegate_task` but never referenced inside the function body. Readers (including reviewers) kept assuming parent conversation history was being forwarded to child agents, which it is not. Remove the dead parameter and the test that asserted forwarding so the code matches the behavior. Also rephrase `_dispatch_delegate_task`'s docstring: the consolidation gives us a single call site, not automatic param forwarding — new schema fields still need to be added in one place.
delegation.default_toolsets was declared in cli.py's CLI_CONFIG default dict and documented in cli-config.yaml.example, but never read: none of tools/delegate_tool.py, _load_config(), or any call site ever looked it up. The live fallback is the DEFAULT_TOOLSETS module constant at tools/delegate_tool.py:101, which stays as-is. hermes_cli/config.py's DEFAULT_CONFIG["delegation"] already omits the key — this commit aligns cli.py with that. Adds a regression test in tests/hermes_cli/test_config_drift.py so a future refactor that re-adds the key without wiring it up to _load_config() fails loudly. Part of Initiative 2 / M0.5.
Matches the default-config removal in the preceding commit. default_toolsets was documented for users to set but was never actually read at runtime, so showing it in the example config and the delegation user guide was misleading. No deprecation note is added: the key was always a no-op, so users who copied it from the example continue to see no behavior change. Their config.yaml still parses; the key is just silently unused, same as before. Part of Initiative 2 / M0.5.
…config
The prior form of this test asserted on CLI_CONFIG["delegation"] after
importing cli, which only passed by accident of pytest-xdist worker
scheduling. cli._hermes_home is frozen at module import time (cli.py:76),
before the tests/conftest.py autouse HERMES_HOME-isolation fixture can
fire, so CLI_CONFIG ends up populated by deep-merging the contributor's
actual ~/.hermes/config.yaml over the defaults (cli.py:359-366). Any
contributor (like me) who still has the legacy key set in their own
config causes a false failure the moment another test file in the same
xdist worker imports cli at module level.
Asserting on the source of load_cli_config() instead sidesteps all of
that: the test now checks the defaults literal directly and is
independent of user config, HERMES_HOME, import order, and worker
scheduling.
Demonstrated failure mode before this fix:
pytest tests/hermes_cli/test_config_drift.py \
tests/hermes_cli/test_skills_hub.py -o addopts=""
-> FAILED (CLI_CONFIG["delegation"] contained "default_toolsets"
from the user's ~/.hermes/config.yaml)
Part of Initiative 2 / M0.5.
Introduces the configurable depth cap and global kill switch for the M3 orchestrator-role feature. No behavior change on defaults: max_spawn_depth=2 matches the legacy MAX_DEPTH=2 hard-coded value; orchestrator_enabled=True is a no-op until M3 commit 3 wires up role.
Changes:
- tools/delegate_tool.py: _MIN_SPAWN_DEPTH, _MAX_SPAWN_DEPTH_CAP, _get_max_spawn_depth() (clamps to [1, 3] with warning log, mirrors existing _clamp_concurrency pattern), _get_orchestrator_enabled() with bool/string YAML coercion. Depth guard at delegate_task now reads _get_max_spawn_depth() instead of MAX_DEPTH directly. MAX_DEPTH stays as the hardcoded default fallback and test import.
- hermes_cli/config.py: DEFAULT_CONFIG["delegation"] seeds the two new keys. Not seeded in cli.py:CLI_CONFIG; this follows the delegation.reasoning_effort precedent, and cli.py's deep-merge picks up user overrides regardless.
- tests/tools/test_delegate.py: TestMaxSpawnDepth (4 cases: default, clamp-low, clamp-high, invalid-falls-back).
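A rough sketch of the two getters described above, assuming the raw value has already been read from config. The helper names and the set of strings treated as truthy are assumptions made for illustration; the commit only states that the depth is clamped to [1, 3] with a warning, that invalid input falls back to the default, and that the kill switch accepts bool or string YAML values.

```python
import logging

logger = logging.getLogger("delegate_sketch")

MIN_SPAWN_DEPTH = 1
MAX_SPAWN_DEPTH_CAP = 3
DEFAULT_MAX_DEPTH = 2  # matches the legacy MAX_DEPTH value


def get_max_spawn_depth(raw):
    """Parse raw as an int and clamp it into [1, 3]; fall back to 2 if invalid."""
    try:
        depth = int(raw)
    except (TypeError, ValueError):
        logger.warning("invalid max_spawn_depth %r; using %d", raw, DEFAULT_MAX_DEPTH)
        return DEFAULT_MAX_DEPTH
    clamped = max(MIN_SPAWN_DEPTH, min(MAX_SPAWN_DEPTH_CAP, depth))
    if clamped != depth:
        logger.warning("max_spawn_depth %d out of range; clamping to %d", depth, clamped)
    return clamped


def coerce_enabled(raw, default=True):
    """Accept real booleans or YAML-ish strings; anything else yields the default."""
    if isinstance(raw, bool):
        return raw
    if isinstance(raw, str):
        # Assumed truthy spellings; the real coercion may accept a different set.
        return raw.strip().lower() in {"1", "true", "yes", "on"}
    return default
```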
Wires the 'role' param through schema -> delegate_task() -> dispatch ->
_build_child_agent -> stashed on child. No behavior change yet: Commit 3
adds the toolset re-add + role-aware prompt. Commit 2 verifies the
plumbing reaches the child and the schema description signals the
feature to the parent LLM.
Changes:
- tools/delegate_tool.py:
- Module-level _normalize_role(r) (near _clamp_concurrency), returns
'leaf' or 'orchestrator'; unknown strings warn and coerce to 'leaf'.
- DELEGATE_TASK_SCHEMA: new 'role' property at top level AND per-task
under tasks[].items. Top-level description text split into leaf vs
orchestrator capability statements so the parent LLM discovers that
role='orchestrator' unlocks nested delegation.
- delegate_task(): accepts role=Optional[str]; normalises top_role;
single-task dict at :738 now includes 'role' for batch/single
uniformity; child-build loop resolves effective_role = normalise(
t.get('role') or top_role) and forwards to _build_child_agent.
- _build_child_agent(): accepts role='leaf' kwarg; stashes
child._delegate_role for introspection (commit 3 will overwrite
with effective_role post-degrade).
- Registry handler lambda: forwards role=args.get('role') for the
Atropos dispatch path (dead for run_agent.py which short-circuits
to _dispatch_delegate_task).
- run_agent.py:_dispatch_delegate_task: forwards role through to
tools.delegate_tool.delegate_task.
- tests/tools/test_delegate.py:TestOrchestratorRoleSchema (4 cases —
default→leaf, explicit orchestrator stashed, nonsense→leaf+warning,
schema shape assertions for top-level and per-task 'role' properties).
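The role plumbing in this commit amounts to a normaliser plus a per-task override. A minimal sketch, assuming tasks are plain dicts; the function names are hypothetical, and the strip/lowercase step is an assumption the commit message does not confirm.

```python
import logging

logger = logging.getLogger("delegate_sketch")

VALID_ROLES = ("leaf", "orchestrator")


def normalize_role(role):
    """Return 'leaf' or 'orchestrator'; unknown values warn and coerce to 'leaf'."""
    if role is None:
        return "leaf"
    cleaned = str(role).strip().lower()  # assumed leniency, not confirmed by the source
    if cleaned in VALID_ROLES:
        return cleaned
    logger.warning("unknown delegate role %r; treating as 'leaf'", role)
    return "leaf"


def effective_roles(tasks, top_role=None):
    """Per-task role wins; otherwise the top-level role applies."""
    top = normalize_role(top_role)
    return [normalize_role(task.get("role") or top) for task in tasks]
```

For example, `effective_roles([{"goal": "a"}, {"goal": "b", "role": "orchestrator"}])` yields one leaf and one orchestrator, which is the batch/single uniformity the commit describes.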
The behavior change. Orchestrator children (role='orchestrator',
allowed by delegation.orchestrator_enabled and child_depth <
max_spawn_depth) retain the 'delegation' toolset and receive a
role-aware system prompt derived from OpenClaw's buildSubagentSystemPrompt
canSpawn branch. Leaf children are unchanged from pre-M3 behavior.
Changes:
- tools/delegate_tool.py:
- _build_child_agent: role resolution block at the top — computes
child_depth, max_spawn, orchestrator_ok (kill switch AND depth),
effective_role (single degrade point). Toolset re-add appends
"delegation" when effective_role == 'orchestrator' (runs after the
existing _strip_blocked_tools branches — unconditional on parent
toolset membership since orchestrator capability is granted by role
not inheritance; documented in test_intersection_preserves_delegation_bound).
child._delegate_role now stashes effective_role (post-degrade).
- _build_child_system_prompt: new role/max_spawn_depth/child_depth
kwargs; leaf prompt unchanged; orchestrator appends a spawning
block with WHEN/WHEN NOT to delegate guidance + literal depth
note that branches between "children MUST be leaves" (at the
floor) and "children can themselves be orchestrators" (below it).
Per-call role model means "can be", not "will be" — orchestrators
explicitly pass role='orchestrator' for nested delegation.
- _EXCLUDED_TOOLSET_NAMES: comment explaining the "delegation"
entry is an advertising exclusion, not a runtime block; the
role-driven re-add in _build_child_agent overrides it.
- tests/tools/test_delegate.py: TestOrchestratorRoleBehavior (9 cases)
- Role resolution: _keeps_delegation_at_depth_1,
_blocked_at_max_spawn_depth, _enabled_false_forces_leaf
- Prompt content: _leaf_does_not_mention_delegation,
_orchestrator_mentions_delegation_capability,
_at_depth_floor_says_children_are_leaves,
_below_floor_allows_more_nesting
- Batch + intersection: _batch_mode_per_task_role_override,
_intersection_preserves_delegation_bound (documents design choice)
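The single degrade point and the toolset re-add described above can be sketched like this. The function names and the `child_depth = parent_depth + 1` relation are assumptions for illustration; the condition `child_depth < max_spawn_depth` and the kill switch come from the commit message.

```python
def resolve_child_role(requested_role, parent_depth, max_spawn_depth, orchestrator_enabled):
    """Degrade an orchestrator request to 'leaf' when the kill switch or depth forbids it."""
    child_depth = parent_depth + 1  # assumed relation between parent and child depth
    orchestrator_ok = orchestrator_enabled and child_depth < max_spawn_depth
    if requested_role == "orchestrator" and orchestrator_ok:
        return "orchestrator"
    return "leaf"


def child_toolsets(stripped_toolsets, effective_role):
    """Re-add 'delegation' after blocked tools were stripped, for orchestrators only."""
    toolsets = list(stripped_toolsets)
    if effective_role == "orchestrator" and "delegation" not in toolsets:
        toolsets.append("delegation")
    return toolsets
```

With the default `max_spawn_depth=2`, a depth-1 child may be an orchestrator, while a depth-2 child is always a leaf, matching the `_keeps_delegation_at_depth_1` and `_blocked_at_max_spawn_depth` test names.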
Satisfies parent plan §7 item 3 acceptance: parent delegates to an
orchestrator child, which delegates to two leaf grandchildren; results
bubble up correctly.
Mocking strategy (plan §3.6 G3 sketch): single run_agent.AIAgent patch
with a side_effect factory that keys on the child's
ephemeral_system_prompt — orchestrator prompts contain the string
"Orchestrator Role" (see _build_child_system_prompt), leaves don't.
The orchestrator mock's run_conversation recursively calls
delegate_task with tasks=[{goal:...},{goal:...}] to spawn two leaves.
This keeps the whole test in one patch context and avoids depth-indexed
nesting patterns that are fragile.
Also updates test_constants to cover the two new config getters
(_get_max_spawn_depth, _get_orchestrator_enabled) and the two new
bound constants (_MIN_SPAWN_DEPTH=1, _MAX_SPAWN_DEPTH_CAP=3), as
called for by plan §4 Commit 4.
Assertions: MockAgent called exactly 3 times (1 orchestrator + 2
leaves); orchestrator got 'delegation' in its toolset and an
orchestrator prompt; both grandchildren did NOT get 'delegation' and
received leaf prompts.
- cli-config.yaml.example: two commented-out lines in the delegation
block advertising max_spawn_depth and orchestrator_enabled.
- website/docs/user-guide/features/delegation.md:
- Replace "Depth Limit" section with "Depth Limit and Nested
Orchestration": role='leaf' vs 'orchestrator' usage example,
max_spawn_depth bounds, orchestrator_enabled kill switch, and the
125-leaf cost warning for max_spawn_depth=3.
- "Key Properties" bullets updated to reflect opt-in nested
delegation and to split leaf/orchestrator capability statements.
- Configuration YAML example: two commented-out lines for the new
keys, matching the cli-config.yaml.example style.
Pure text; independently revertable.
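One way to arrive at the 125-leaf figure, assuming every agent fans out to the default 5 concurrent children at each of 3 delegation levels. This is back-of-envelope arithmetic under that assumption, not a statement of how the docs derive the number.

```python
def worst_case_leaves(children_per_agent, depth):
    """Agents at the deepest level when every agent spawns the maximum at each level."""
    return children_per_agent ** depth


def worst_case_total_agents(children_per_agent, depth):
    """All spawned agents across levels 1..depth (the root parent is not counted)."""
    return sum(children_per_agent ** level for level in range(1, depth + 1))


print(worst_case_leaves(5, 3))        # 5 * 5 * 5
print(worst_case_total_agents(5, 3))  # 5 + 25 + 125
```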
…child callback
Addresses two bugs surfaced by codex in the M3 PR review
(.multi-agent/review-20260415-173416/reviews/codex.md):
1. HIGH — TASK_PROGRESS fall-through to TASK_TOOL_STARTED rendering.
_LEGACY_EVENT_MAP maps "subagent_progress" to DelegateEvent.TASK_PROGRESS,
but _build_child_progress_callback had no TASK_PROGRESS branch. Any
TASK_PROGRESS event fell through to the TASK_TOOL_STARTED display/batch
block, which treated the pre-batched summary string (in the tool_name
positional slot) as if it were a tool name — rendering
'├─ ⚡ 🔀 [1] terminal, file' and re-batching accumulated emoji prefixes
on each upward hop. This path is newly reachable in M3: nested
orchestrators relay subagent_progress from grandchildren upward via
this callback. Before M3 the toolset strip blocked the nesting that
produces this traffic.
Fix: explicit TASK_PROGRESS branch. Renders with a distinct 🔀 prefix
(no get_tool_emoji lookup) and relays upward as-is without re-batching
(the payload is already a batched summary).
2. MEDIUM — DelegateEvent enum values silently dropped. The callback
only did _LEGACY_EVENT_MAP.get(event_type), so cb(DelegateEvent.TASK_THINKING,
...) or cb('delegate.task_thinking', ...) produced no output. The enum
was added in M0 as the "new normalized event type" but no call path
accepted it.
Fix: normalize enum instances directly, then fall back to the legacy
map, then to DelegateEvent(str) construction for new-style "delegate.*"
strings.
Tests added (tests/tools/test_delegate.py, in TestDelegateEventEnum):
- test_progress_callback_accepts_enum_value_directly
- test_progress_callback_accepts_new_style_string
- test_progress_callback_task_progress_not_misrendered
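The three-step normalisation order from fix 2 (enum instance, then legacy map, then construction from a new-style string) might look like the sketch below. Only the `"delegate.task_thinking"` value and the legacy names `_thinking`, `reasoning.available`, `tool.started`, and `subagent_progress` appear in the commits; the other enum values and the function name are assumed for illustration.

```python
from enum import Enum


class DelegateEvent(str, Enum):
    TASK_THINKING = "delegate.task_thinking"
    TASK_TOOL_STARTED = "delegate.task_tool_started"  # assumed value
    TASK_PROGRESS = "delegate.task_progress"          # assumed value


LEGACY_EVENT_MAP = {
    "_thinking": DelegateEvent.TASK_THINKING,
    "reasoning.available": DelegateEvent.TASK_THINKING,
    "tool.started": DelegateEvent.TASK_TOOL_STARTED,
    "subagent_progress": DelegateEvent.TASK_PROGRESS,
}


def normalize_event(event_type):
    """Return a DelegateEvent, or None when the event is unknown."""
    if isinstance(event_type, DelegateEvent):      # 1. enum instance passes through
        return event_type
    legacy = LEGACY_EVENT_MAP.get(event_type)      # 2. legacy string name
    if legacy is not None:
        return legacy
    try:
        return DelegateEvent(event_type)           # 3. new-style "delegate.*" string
    except ValueError:
        return None
```

Checking the enum instance first matters because the original bug was a bare legacy-map lookup, which returned nothing for enum values and new-style strings.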
Addresses the hermes reviewer finding in the M3 PR review (.multi-agent/review-20260415-173416/reviews/hermes.md §warnings #3): the Constraints section's "No nesting" bullet was stale against M3. Replaces with a "Nested delegation is opt-in" bullet that mirrors the wording already landed in website/docs/user-guide/features/delegation.md in the M3 docs commit (55aecde). Covers role='leaf' vs 'orchestrator', the max_spawn_depth bound, and the orchestrator_enabled kill switch — matching the feature doc's leaf-vs-orchestrator capability distinction.
Removes internal milestone references (M0, M0.5, M3) from code, tests, and docs in the delegation PR surface. Milestone tags were useful for tracking the rollout but carry no meaning for upstream readers or future maintainers — the feature and its rationale should stand on its own. Mechanical substitutions only — no behavior change, no docstring rewrites, no test renames. 102 tests still pass.
Two small follow-ups after the orchestrator-role PR review:
1. hermes_cli/config.py:DEFAULT_CONFIG["delegation"] now seeds max_concurrent_children=5 alongside max_spawn_depth and orchestrator_enabled. cli-config.yaml.example, docs, and the schema all advertise this key; the canonical default dict was the only surface that still omitted it.
2. website/docs/user-guide/features/delegation.md's "always blocked for subagents" bullet list was stale against the new orchestrator role: delegation is retained for role="orchestrator" children (see _build_child_agent re-add). Softened the heading from "always blocked" to "blocked", and rewrote the delegation bullet to point at the Depth Limit and Nested Orchestration section.
Tests still pass (123).
…lls/ - Rename skill to touchdesigner-mcp (matches blender-mcp convention) - Move from skills/creative/ to optional-skills/creative/ - Fix duplicate pitfall numbering (#3 appeared twice) - Update SKILL.md cross-references for renumbered pitfalls - Update setup.sh path for new directory location
…ousResearch#13148) * feat(security): URL query param + userinfo + form body redaction Port from nearai/ironclaw#2529. Hermes already has broad value-shape coverage in agent/redact.py (30+ vendor prefixes, JWTs, DB connstrs, etc.) but missed three key-name-based patterns that catch opaque tokens without recognizable prefixes: 1. URL query params - OAuth callback codes (?code=...), access_token, refresh_token, signature, etc. These are opaque and won't match any prefix regex. Now redacted by parameter NAME. 2. URL userinfo (https://user:pass@host) - for non-DB schemes. DB schemes were already handled by _DB_CONNSTR_RE. 3. Form-urlencoded body (k=v pairs joined by ampersands) - conservative, only triggers on clean pure-form inputs with no other text. Sensitive key allowlist matches ironclaw's (exact case-insensitive, NOT substring - so token_count and session_id pass through). Tests: +20 new test cases across 3 test classes. All 75 redact tests pass; gateway/test_pii_redaction and tools/test_browser_secret_exfil also green. Known pre-existing limitation: _ENV_ASSIGN_RE greedy match swallows whole all-caps ENV-style names + trailing text when followed by another assignment. Left untouched here (out of scope); URL query redaction handles the lowercase case. * feat: replace kimi-k2.5 with kimi-k2.6 on OpenRouter and Nous Portal Update model catalogs for OpenRouter (fallback snapshot), Nous Portal, and NVIDIA NIM to reference moonshotai/kimi-k2.6. Add kimi-k2.6 to the fixed-temperature frozenset in auxiliary_client.py so the 0.6 contract is enforced on aggregator routings. Native Moonshot provider lists (kimi-coding, kimi-coding-cn, moonshot, opencode-zen, opencode-go) are unchanged — those use Moonshot's own model IDs which are unaffected.
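For the URL query-parameter redaction in the first commit above, matching by parameter name with exact, case-insensitive comparison could be sketched as follows. The key set is only the subset named in the commit message, and the function name and `REDACTED` placeholder are hypothetical; the actual logic in agent/redact.py may differ.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Subset of sensitive names mentioned in the commit; the real allowlist is longer.
SENSITIVE_QUERY_KEYS = {"code", "access_token", "refresh_token", "signature"}


def redact_url_query(url):
    """Replace values of sensitive query parameters, matching names exactly."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    redacted = [
        (key, "REDACTED" if key.lower() in SENSITIVE_QUERY_KEYS else value)
        for key, value in pairs
    ]
    return urlunsplit(parts._replace(query=urlencode(redacted)))
```

Because the match is exact rather than substring, names such as `token_count` and `session_id` pass through unchanged.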
…NousResearch#13021) Replaces the serial for-loop in tick() with ThreadPoolExecutor so all jobs due in a single tick run concurrently. A slow job no longer blocks others from executing, fixing silent job skipping (issue NousResearch#9086). Thread safety: - Session/delivery env vars migrated from os.environ to ContextVars (gateway/session_context.py) so parallel jobs can't clobber each other's delivery targets. Each thread gets its own copied context. - jobs.json read-modify-write cycles (advance_next_run, mark_job_run) protected by threading.Lock to prevent concurrent save clobber. - send_message_tool reads delivery vars via get_session_env() for ContextVar-aware resolution with os.environ fallback. Configuration: - cron.max_parallel_jobs in config.yaml (null = unbounded, 1 = serial) - HERMES_CRON_MAX_PARALLEL env var override Based on PR NousResearch#9169 by @VenomMoth1. Fixes NousResearch#9086
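The combination of a thread pool with per-job copied contexts can be illustrated with the standard library alone. This is a simplified sketch: `delivery_target`, `run_job`, and the job dict shape are hypothetical stand-ins for the session and delivery variables in gateway/session_context.py.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a per-job delivery variable.
delivery_target = contextvars.ContextVar("delivery_target", default=None)


def run_job(job):
    """Set the delivery target for this job only, then read it back."""
    delivery_target.set(job["target"])
    return job["name"], delivery_target.get()


def tick(due_jobs, max_parallel=None):
    """Run all due jobs concurrently, each inside its own copied context."""
    # max_workers=None lets the executor choose its default pool size.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = [
            pool.submit(contextvars.copy_context().run, run_job, job)
            for job in due_jobs
        ]
        return [future.result() for future in futures]
```

Each job mutates only its own context copy, so one job's delivery target cannot leak into another job or back into the scheduler thread.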
Extract 12 Codex Responses API format-conversion and normalization functions from run_agent.py into agent/codex_responses_adapter.py, following the existing pattern of anthropic_adapter.py and bedrock_adapter.py. run_agent.py: 12,550 → 11,865 lines (-685 lines) Functions moved: - _chat_content_to_responses_parts (multimodal content conversion) - _summarize_user_message_for_log (multimodal message logging) - _deterministic_call_id (cache-safe fallback IDs) - _split_responses_tool_id (composite ID splitting) - _derive_responses_function_call_id (fc_ prefix conversion) - _responses_tools (schema format conversion) - _chat_messages_to_responses_input (message format conversion) - _preflight_codex_input_items (input validation) - _preflight_codex_api_kwargs (API kwargs validation) - _extract_responses_message_text (response text extraction) - _extract_responses_reasoning_text (reasoning extraction) - _normalize_codex_response (full response normalization) All functions are stateless module-level functions. AIAgent methods remain as thin one-line wrappers. Both module-level helpers are re-exported from run_agent.py for backward compatibility with existing test imports. Includes multimodal inline image support (PR NousResearch#12969) that the original PR was missing. Based on PR NousResearch#12975 by @kshitijk4poor.
…in/QQ adapters Add dm_policy and group_policy to the WhatsApp adapter, bringing parity with WeCom/Weixin/QQ. Allows independent control of DM and group access: disable DMs entirely, allowlist specific senders/groups, or keep open. - dm_policy: open (default) | allowlist | disabled - group_policy: open (default) | allowlist | disabled - Config bridging for YAML → env vars - 22 tests covering all policy combinations Backward compatible — defaults preserve existing behavior. Cherry-picked from PR NousResearch#11597 by @MassiveMassimo. Dropped the run.py group auth bypass (would have skipped user auth for ALL platforms, not just WhatsApp).
…iders (NousResearch#13152) Add kimi-k2.6 as the top model in kimi-coding, kimi-coding-cn, and moonshot static provider lists (models.py, setup.py, main.py). kimi-k2.5 retained alongside it.
…providers Section 3 (user-defined endpoints) added the plain ep_name to seen_slugs but not the custom:-prefixed slug. Section 4 generates custom:<name> via custom_provider_slug() and checks seen_slugs — since the prefixed slug was missing, the same provider appeared twice in /model. Register custom_provider_slug(display_name).lower() in seen_slugs after Section 3 emits a provider, so Section 4's dedup correctly suppresses the duplicate. Closes NousResearch#12293. Co-authored-by: bennytimz <bennytimz@users.noreply.github.com>
…search#13157) Kimi's gateway selects the correct temperature server-side based on the active mode (thinking -> 1.0, non-thinking -> 0.6). Sending any temperature value — even the previously "correct" one — conflicts with gateway-managed defaults. Replaces the old approach of forcing specific temperature values (0.6 for non-thinking, 1.0 for thinking) with an OMIT_TEMPERATURE sentinel that tells all call sites to strip the temperature key from API kwargs entirely. Changes: - agent/auxiliary_client.py: OMIT_TEMPERATURE sentinel, _is_kimi_model() prefix check (covers all kimi-* models), _fixed_temperature_for_model() returns sentinel for kimi models. _build_call_kwargs() strips temp. - run_agent.py: _build_api_kwargs, flush_memories, and summary generation paths all handle the sentinel by popping/omitting temperature. - trajectory_compressor.py: _effective_temperature_for_model returns None for kimi (sentinel mapped), direct client calls use kwargs dict to conditionally include temperature. - mini_swe_runner.py: same sentinel handling via wrapper function. - 6 test files updated: all 'forces temperature X' assertions replaced with 'temperature not in kwargs' assertions. Net: -76 lines (171 added, 247 removed). Inspired by PR NousResearch#13137 (@kshitijk4poor).
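A sentinel that means "send no temperature at all" can be sketched as below. The names mirror those in the commit message, but the bodies are illustrative guesses; in particular, stripping a vendor prefix before the `kimi-` check is an assumption.

```python
OMIT_TEMPERATURE = object()  # sentinel: strip the temperature key entirely


def is_kimi_model(model):
    """Prefix check on the model name, ignoring any 'vendor/' prefix (assumed)."""
    name = model.lower().rsplit("/", 1)[-1]
    return name.startswith("kimi-")


def fixed_temperature_for_model(model):
    """Return the sentinel for kimi models, or None when nothing is forced."""
    return OMIT_TEMPERATURE if is_kimi_model(model) else None


def build_call_kwargs(model, messages, temperature=0.7):
    """Build API kwargs, dropping temperature when the sentinel applies."""
    kwargs = {"model": model, "messages": messages, "temperature": temperature}
    if fixed_temperature_for_model(model) is OMIT_TEMPERATURE:
        kwargs.pop("temperature", None)
    return kwargs
```

A dedicated sentinel object is used instead of `None` because `None` already means "no forced temperature" here, so the two cases stay distinguishable at every call site.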
…abled (NousResearch#13162) When createForumTopic fails with 'not a forum' in a private chat, the error now tells the user exactly what to do: enable Topics in the DM chat settings from the Telegram app. Also adds a Prerequisites callout to the docs explaining this client-side requirement before the config section.
…tree isolation Adds a _resolve_path() helper that reads TERMINAL_CWD and uses it as the base for relative path resolution. Applied to _check_sensitive_path, read_file_tool, _update_read_timestamp, and _check_file_staleness. Absolute paths and non-worktree sessions (no TERMINAL_CWD) are unaffected — falls back to os.getcwd(). Fixes NousResearch#12689.
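A minimal sketch of the resolution rule described above: absolute paths are left alone, relative paths resolve against TERMINAL_CWD when it is set, and the process working directory is the fallback. The function name and the `normpath` call are illustrative, not the actual helper.

```python
import os


def resolve_path(path):
    """Resolve relative paths against TERMINAL_CWD, falling back to os.getcwd()."""
    if os.path.isabs(path):
        return os.path.normpath(path)
    base = os.environ.get("TERMINAL_CWD") or os.getcwd()
    return os.path.normpath(os.path.join(base, path))
```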
# Conflicts: # tests/agent/test_subagent_progress.py # tools/delegate_tool.py
🚨 CRITICAL Supply Chain Risk Detected
This PR contains a pattern that has been used in real supply chain attacks. A maintainer must review the flagged code carefully before merging.
🚨 CRITICAL: Install-hook file added or modified
These files can execute code during package installation or interpreter startup.
Files:
Scanner only fires on high-signal indicators: .pth files, base64+exec/eval combos, subprocess with encoded commands, or install-hook files. Low-signal warnings were removed intentionally; if you're seeing this comment, the finding is worth inspecting.
Three independent reviews surfaced a handful of real bugs. Fixing all of them here:

* **SIGTERM orphans hook subprocesses (codex #1).** The CLI only installed a SIGINT handler. SIGTERM (from ``kill``, ``timeout``, systemd stop, CI harnesses) skips atexit entirely and leaves every in-flight hook subprocess running as an orphan owned by init. Adds ``_async_pool_sigterm_handler`` which terminates tracked subprocess groups inline, then routes to ``sys.exit(128 + SIGTERM)``. Inline termination is required because ``ThreadPoolExecutor`` uses non-daemon threads: Python waits for every worker to return before running atexit, and workers block inside ``proc.communicate(timeout=spec.timeout)`` until the subprocess dies. Renamed ``_maybe_install_sigint_handler`` → ``_maybe_install_signal_handlers`` (with back-compat alias). Verified: ``kill -TERM`` on a hermes CLI running a 4 s ``sleep`` hook now exits in ~0.7 s with no orphan, was 4 s + orphan.
* **Subprocess groups for reliable termination.** Hooks are now spawned with ``start_new_session=True`` so the subprocess is its own PGID leader. Shutdown / SIGINT / SIGTERM paths call ``os.killpg`` on the group instead of ``proc.terminate()``. Without this, a bash script's orphaned ``sleep`` child kept the parent stdout FD open and blocked ``proc.communicate`` for the full sleep duration. ``_terminate_group`` / ``_kill_group`` helpers fall back to plain ``terminate`` / ``kill`` on edge cases where ``getpgid`` fails (already-exited proc, non-POSIX).
* **``hermes hooks test --no-wait`` blocks for full hook runtime (codex #2).** The flag advertised fire-and-forget but the CLI's ``ThreadPoolExecutor`` atexit ``pool.shutdown(wait=True)`` joined the worker anyway, which in turn waited for the subprocess. ``_cmd_test`` now polls briefly for ``_live_procs`` to fill (so the subprocess definitely spawned), then ``os._exit(0)``, skipping atexit entirely. The subprocess keeps running under init because of ``start_new_session=True``. Verified: CLI exit dropped from 2.3 s to 76 ms for a 2-second hook, and the hook still writes its audit log 3 s later after the CLI is gone.
* **Stale ``_child_role_for_batch`` test (claude #1 / hermes #2).** The test from commit 76d3ffd4 asserted the *old* helper field name. No code path sets it post-refactor (455c136f), so the test passed trivially without verifying anything. Fixed to assert ``_child_role`` (the real field) is stripped, and added an explanatory message so a future failure is easier to diagnose. Module-header docstring updated too.
* **``submit()`` RuntimeError branch: stale-semaphore parity fix (claude #3).** Same pattern I already fixed in ``_on_async_future_done``, missed here: a concurrent ``_reset_async_pool`` between ``acquire`` and ``release`` would cause ``_async_sem_get()`` to lazy-create a fresh sem and over-release on it. Snapshot ``_async_sem_inst`` + swallow ``ValueError`` like the symmetric path.
* **Shutdown race: proc registered after the snapshot (claude #4 / hermes #1).** Worker that got between ``subprocess.Popen()`` and ``_register_live_proc(proc)`` would miss the shutdown-sweep snapshot and block for the full ``spec.timeout``. After registering, the worker now checks ``_async_shutting_down`` and self-terminates its subprocess group.
* **WARN log noise on SIGTERM'd children (claude #5).** Shutdown-induced exits (rc = -15 / -9) no longer spam a per-proc ``WARNING``; they are demoted to ``DEBUG`` when ``_async_shutting_down`` is set. Both the atexit path and the signal handlers now set the flag before terminating, so a Ctrl-C or a ``kill -TERM`` with 10 running hooks emits zero warn lines instead of 10.

Still outstanding (documented trade-offs, not fixed here):

* Gateway shutdown blocks the event loop for up to ``grace_seconds`` (claude #2). Acknowledged as a follow-up candidate via ``loop.run_in_executor``.
* ``_maybe_install_signal_handlers`` is still leading-underscore (claude #6). Cosmetic; kept consistent with the rest of the module's private-by-convention API.

All 101 hook tests still pass.
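The process-group approach can be illustrated with a small POSIX-only sketch. `spawn_hook` and `terminate_group` are hypothetical names, not the helpers in this commit; the calls used (`subprocess.Popen(..., start_new_session=True)`, `os.getpgid`, `os.killpg`) are standard library.

```python
import os
import signal
import subprocess
import sys


def spawn_hook(argv):
    """Start the hook in its own session so it leads a fresh process group."""
    return subprocess.Popen(argv, start_new_session=True)


def terminate_group(proc):
    """SIGTERM the whole group; fall back to plain terminate() if that fails."""
    try:
        os.killpg(os.getpgid(proc.pid), signal.SIGTERM)
    except (ProcessLookupError, PermissionError, AttributeError):
        # Already-exited process or a platform without process groups.
        proc.terminate()


if __name__ == "__main__":
    hook = spawn_hook([sys.executable, "-c", "import time; time.sleep(30)"])
    terminate_group(hook)
    print("exit code:", hook.wait(timeout=10))
```

Signalling the group rather than the single PID is what reaches grandchildren such as a shell script's `sleep`, which otherwise keep the inherited stdout pipe open.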
Summary
- `role="orchestrator"` on `delegate_task`: the child retains the `delegation` toolset and can parallelize its own workers. Bounded by `delegation.max_spawn_depth` (1–3, default 2); flip `delegation.orchestrator_enabled: false` to disable globally.
- `delegation.max_concurrent_children`.
- `delegation.default_toolsets` was documented but never read; removed from the example config and docs. No behavior change: existing configs still parse.

Three delegation-related changes, each self-contained and reviewable in isolation.

Delegation plumbing cleanup
- `delegate_task` call sites in `run_agent.py` now route through a single `_dispatch_delegate_task` helper. Fixes a silent drop of `acp_command`/`acp_args` on the main agent loop; those fields were in the schema but never forwarded.
- `DelegateEvent` enum with back-compat aliases for the existing progress event strings consumed by the gateway SSE, ACP adapter, and CLI spinner.
- `max_concurrent_children`: 3 → 5, with an absolute cap of 8 (aligned with OpenClaw's `DEFAULT_SUBAGENT_MAX_CONCURRENT`). Values above the cap clamp with a warning log.

Remove dead `default_toolsets` config
- `delegation.default_toolsets` was declared in `cli.py`'s `CLI_CONFIG`, documented in `cli-config.yaml.example`, and documented in the delegation feature docs, but never consulted at runtime. `_load_config()` ignored it entirely; the live fallback is the hardcoded `DEFAULT_TOOLSETS` module constant in `tools/delegate_tool.py`.
- `tests/hermes_cli/test_config_drift.py` guards against re-introduction.

Orchestrator role
- `role: "leaf" | "orchestrator"` parameter on `delegate_task` (top-level and per-task in batch mode). Leaf children are unchanged; orchestrator children retain the `delegation` toolset and receive a role-aware system prompt telling them they can spawn their own workers.
- `delegation.max_spawn_depth` (1–3, default 2) bounds the delegation tree: orchestrator requests are silently coerced to leaf when the child would exceed the depth cap.
- `delegation.orchestrator_enabled` (default `true`) is a global kill switch that forces every child to leaf regardless of the per-call `role`.

Follow-ups from review
- `TASK_PROGRESS` events relayed upward by nested orchestrators were falling through to the `TASK_TOOL_STARTED` renderer, which treated the batched summary string as if it were a tool name. Added an explicit `TASK_PROGRESS` branch with pass-through relay and a distinct render. Reachable only once nesting is enabled.
- `_build_child_progress_callback` now accepts `DelegateEvent` enum values and new-style `"delegate.*"` strings in addition to the legacy strings.
- `website/docs/guides/delegation-patterns.md` updated to match `features/delegation.md` on nested-delegation opt-in.

Type of Change
How to Test
- `pytest tests/tools/test_delegate.py tests/hermes_cli/test_config_drift.py -v`: 102 passing.
- `python -c "from tools.delegate_tool import DELEGATE_TASK_SCHEMA as S; assert S['parameters']['properties']['role']['enum'] == ['leaf', 'orchestrator']; print('schema OK')"`
- `python -c "from hermes_cli.config import DEFAULT_CONFIG as C; assert C['delegation']['max_spawn_depth'] == 2 and C['delegation']['orchestrator_enabled'] is True; print('defaults OK')"`
- `python -c "from tools.delegate_tool import MAX_DEPTH; assert MAX_DEPTH == 2; print('back-compat OK')"`
- `default_toolsets` in `tools/`, `cli*.yaml*`, and `website/docs/**/*.md`: only audit-only references remain (class variables on Atropos environments, local var in `hermes_cli/dump.py`, the regression test itself).

Backward Compatibility
- `role` defaults to `"leaf"`, so no existing caller changes behavior.
- `MAX_DEPTH = 2` constant remains as the hardcoded fallback and is still exported for tests.
- `default_toolsets` was never functional, so removing it changes no observable behavior.

Checklist
- `website/docs/user-guide/features/delegation.md`, `website/docs/guides/delegation-patterns.md`)
- `cli-config.yaml.example` updated for new config keys
- `feat(delegate):` / `refactor(delegate):` / `fix(delegate):` / `docs(delegate):` / `test(delegate):` / `chore(delegate):`)