feat: Orchestrator Subagents by pefontana · Pull Request #3 · pefontana/hermes-agent

pefontana · 2026-04-15T20:33:49Z

Summary

Subagents can now spawn their own subagents. Opt in with role="orchestrator" on delegate_task — the child retains the delegation toolset and can parallelize its own workers. Bounded by delegation.max_spawn_depth (1–3, default 2); flip delegation.orchestrator_enabled: false to disable globally.
Higher default parallelism. Batch mode now runs up to 5 concurrent subagents (was 3), hard cap 8. Tune via delegation.max_concurrent_children.
Dead config removed. delegation.default_toolsets was documented but never read; removed from the example config and docs. No behavior change — existing configs still parse.

Three delegation-related changes, each self-contained and reviewable in isolation.

Delegation plumbing cleanup

Both delegate_task call sites in run_agent.py now route through a single _dispatch_delegate_task helper. Fixes a silent drop of acp_command / acp_args on the main agent loop — those fields were in the schema but never forwarded.
DelegateEvent enum with back-compat aliases for the existing progress event strings consumed by the gateway SSE, ACP adapter, and CLI spinner.
Default max_concurrent_children: 3 → 5, with an absolute cap of 8 (aligned with OpenClaw's DEFAULT_SUBAGENT_MAX_CONCURRENT). Values above the cap clamp with a warning log.

Remove dead `default_toolsets` config

delegation.default_toolsets was declared in cli.py's CLI_CONFIG, documented in cli-config.yaml.example, and documented in the delegation feature docs, but never consulted at runtime. _load_config() ignored it entirely; the live fallback is the hardcoded DEFAULT_TOOLSETS module constant in tools/delegate_tool.py.
Removed from all three surfaces.
Regression test in tests/hermes_cli/test_config_drift.py guards against re-introduction.

Orchestrator role

New role: "leaf" | "orchestrator" parameter on delegate_task (top-level and per-task in batch mode). Leaf children are unchanged; orchestrator children retain the delegation toolset and receive a role-aware system prompt telling them they can spawn their own workers.
delegation.max_spawn_depth (1–3, default 2) bounds the delegation tree; orchestrator requests are silently coerced to leaf when the child would exceed the depth cap.
delegation.orchestrator_enabled (default true) is a global kill switch that forces every child to leaf regardless of the per-call role.
End-to-end test covers parent → orchestrator (depth 1) → two leaves (depth 2) nesting with full role/toolset/depth invariants.

Follow-ups from review

TASK_PROGRESS events relayed upward by nested orchestrators were falling through to the TASK_TOOL_STARTED renderer, which treated the batched summary string as if it were a tool name. Added an explicit TASK_PROGRESS branch with pass-through relay and a distinct render. Reachable only once nesting is enabled.
_build_child_progress_callback now accepts DelegateEvent enum values and new-style "delegate.*" strings in addition to the legacy strings.
website/docs/guides/delegation-patterns.md updated to match features/delegation.md on nested-delegation opt-in.

Type of Change

✨ New feature (non-breaking change that adds functionality)
♻️ Refactor (no behavior change)
📝 Documentation update
✅ Tests

How to Test

pytest tests/tools/test_delegate.py tests/hermes_cli/test_config_drift.py -v — 102 passing.
Schema: python -c "from tools.delegate_tool import DELEGATE_TASK_SCHEMA as S; assert S['parameters']['properties']['role']['enum'] == ['leaf', 'orchestrator']; print('schema OK')"
Defaults: python -c "from hermes_cli.config import DEFAULT_CONFIG as C; assert C['delegation']['max_spawn_depth'] == 2 and C['delegation']['orchestrator_enabled'] is True; print('defaults OK')"
Back-compat: python -c "from tools.delegate_tool import MAX_DEPTH; assert MAX_DEPTH == 2; print('back-compat OK')"
Docs: grep default_toolsets in tools/, cli*.yaml*, and website/docs/**/*.md — only audit-only references remain (class variables on Atropos environments, local var in hermes_cli/dump.py, the regression test itself).

Backward Compatibility

role defaults to "leaf" — no existing caller changes behavior.
MAX_DEPTH = 2 constant remains as the hardcoded fallback and is still exported for tests.
Progress event consumers get both old string names AND new enum values during the deprecation window.
default_toolsets was never functional, so removing it changes no observable behavior.

Checklist

Tests added (102 passing)
Docs updated (website/docs/user-guide/features/delegation.md, website/docs/guides/delegation-patterns.md)
cli-config.yaml.example updated for new config keys
Conventional Commits (feat(delegate): / refactor(delegate): / fix(delegate): / docs(delegate): / test(delegate): / chore(delegate):)
No unrelated changes

… params Both delegate_task call sites in run_agent.py hardcoded a subset of the schema params (goal, context, toolsets, tasks, max_iterations) and silently dropped acp_command and acp_args. Any future schema additions would hit the same drift. Replace both call sites with _dispatch_delegate_task() which forwards the entire validated schema from function_args. Also threads the conversation messages reference through for future context inheritance plumbing (M1).

Replace informal progress event strings (_thinking, tool.started, etc.) with a DelegateEvent enum. The callback normalises incoming legacy strings through _LEGACY_EVENT_MAP. External consumers (gateway SSE, ACP adapter, CLI) continue to receive legacy string names during the deprecation window — no consumer changes required.

Raise _DEFAULT_MAX_CONCURRENT_CHILDREN from 3 to 5. Add an absolute cap of 8 (aligned with OpenClaw's DEFAULT_SUBAGENT_MAX_CONCURRENT) — values above 8 from config or env are clamped with a warning log. Update schema description, cli-config.yaml.example, and delegation docs to reflect the new defaults.

- TestDispatchDelegateTask: verifies acp_command/acp_args forwarding and that _dispatch_delegate_task threads messages through
- TestDelegateEventEnum: enum values, legacy map coverage, normalisation in the progress callback, unknown event rejection
- TestConcurrencyDefaults: default=5, cap at 8, warning log on clamp, env var cap, within-range passthrough
- Fix pre-existing test bug: test_task_index_prefix_in_batch_mode was passing tool_name as event_type (wrong signature)
- Update test_constants assertion from 3 to 5

Extract duplicated cap-check logic from _get_max_concurrent_children into _clamp_concurrency helper. Fix stale "default 3" comment in the schema. Simplify test_acp_args_forwarded assertion.
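The clamp described above can be sketched roughly as follows. This is an illustration only, not the helper from tools/delegate_tool.py: the function name, the constant names, and the lower bound of 1 are assumptions; only the default of 5, the cap of 8, and the warning on clamp come from the PR text.

```python
import logging

logger = logging.getLogger(__name__)

DEFAULT_MAX_CONCURRENT_CHILDREN = 5  # new default per the PR (was 3)
MAX_CONCURRENT_CHILDREN_CAP = 8      # absolute cap per the PR


def clamp_concurrency(value: int, source: str = "config") -> int:
    """Clamp a requested concurrency into [1, cap] and warn when the cap is hit.

    The lower bound of 1 is an assumption made for this sketch.
    """
    if value > MAX_CONCURRENT_CHILDREN_CAP:
        logger.warning(
            "max_concurrent_children=%d from %s exceeds cap %d; clamping",
            value, source, MAX_CONCURRENT_CHILDREN_CAP,
        )
        return MAX_CONCURRENT_CHILDREN_CAP
    return max(1, value)
```

Keeping the cap check in one helper means the config path and the env-var path cannot drift apart on what "too high" means.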

… enum members Add test_progress_callback_normalises_thinking (both _thinking and reasoning.available), test_progress_callback_tool_completed_is_noop. Document that TASK_SPAWNED/COMPLETED/FAILED are reserved for M3. Rename test_progress_callback_normalises_legacy_events for clarity.

Grep found 4 more places still saying "up to 3" — schema description, tips.py, delegation-patterns.md, overview.md. Updated to match new default of 5 (max 8).

…tring The `messages` kwarg was threaded through `_dispatch_delegate_task` into `delegate_task` but never referenced inside the function body. Readers (including reviewers) kept assuming parent conversation history was being forwarded to child agents, which it is not. Remove the dead parameter and the test that asserted forwarding so the code matches the behavior. Also rephrase `_dispatch_delegate_task`'s docstring: the consolidation gives us a single call site, not automatic param forwarding — new schema fields still need to be added in one place.

delegation.default_toolsets was declared in cli.py's CLI_CONFIG default dict and documented in cli-config.yaml.example, but never read: none of tools/delegate_tool.py, _load_config(), or any call site ever looked it up. The live fallback is the DEFAULT_TOOLSETS module constant at tools/delegate_tool.py:101, which stays as-is. hermes_cli/config.py's DEFAULT_CONFIG["delegation"] already omits the key — this commit aligns cli.py with that. Adds a regression test in tests/hermes_cli/test_config_drift.py so a future refactor that re-adds the key without wiring it up to _load_config() fails loudly. Part of Initiative 2 / M0.5.

Matches the default-config removal in the preceding commit. default_toolsets was documented for users to set but was never actually read at runtime, so showing it in the example config and the delegation user guide was misleading. No deprecation note is added: the key was always a no-op, so users who copied it from the example continue to see no behavior change. Their config.yaml still parses; the key is just silently unused, same as before. Part of Initiative 2 / M0.5.

…config

The prior form of this test asserted on CLI_CONFIG["delegation"] after importing cli, which only passed by accident of pytest-xdist worker scheduling. cli._hermes_home is frozen at module import time (cli.py:76), before the tests/conftest.py autouse HERMES_HOME-isolation fixture can fire, so CLI_CONFIG ends up populated by deep-merging the contributor's actual ~/.hermes/config.yaml over the defaults (cli.py:359-366). Any contributor (like me) who still has the legacy key set in their own config causes a false failure the moment another test file in the same xdist worker imports cli at module level.

Asserting on the source of load_cli_config() instead sidesteps all of that: the test now checks the defaults literal directly and is independent of user config, HERMES_HOME, import order, and worker scheduling.

Demonstrated failure mode before this fix:

    pytest tests/hermes_cli/test_config_drift.py \
        tests/hermes_cli/test_skills_hub.py -o addopts=""
    -> FAILED (CLI_CONFIG["delegation"] contained "default_toolsets"
       from the user's ~/.hermes/config.yaml)

Part of Initiative 2 / M0.5.

Introduces the configurable depth cap and global kill switch for the M3 orchestrator-role feature. No behavior change on defaults: max_spawn_depth=2 matches the legacy MAX_DEPTH=2 hard-coded value; orchestrator_enabled=True is a no-op until M3 commit 3 wires up role.

Changes:
- tools/delegate_tool.py: _MIN_SPAWN_DEPTH, _MAX_SPAWN_DEPTH_CAP, _get_max_spawn_depth() (clamps to [1, 3] with warning log, mirrors existing _clamp_concurrency pattern), _get_orchestrator_enabled() with bool/string YAML coercion. Depth guard at delegate_task now reads _get_max_spawn_depth() instead of MAX_DEPTH directly. MAX_DEPTH stays as the hardcoded default fallback and test import.
- hermes_cli/config.py: DEFAULT_CONFIG["delegation"] seeds the two new keys. Not seeded in cli.py:CLI_CONFIG, following the delegation.reasoning_effort precedent; cli.py's deep-merge picks up user overrides regardless.
- tests/tools/test_delegate.py: TestMaxSpawnDepth (4 cases: default, clamp-low, clamp-high, invalid-falls-back).
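A minimal sketch of the two getters, assuming they read from a plain dict (the real functions take their config from _load_config()). The function names without the leading underscore, the set of accepted truthy strings, and the fallback for non-bool, non-string values are assumptions for illustration; the [1, 3] clamp, the default of 2, and the invalid-value fallback follow the commit message.

```python
import logging

logger = logging.getLogger(__name__)

MIN_SPAWN_DEPTH = 1
MAX_SPAWN_DEPTH_CAP = 3
MAX_DEPTH = 2  # hardcoded fallback, matching the legacy constant


def get_max_spawn_depth(cfg: dict) -> int:
    """Return max_spawn_depth clamped to [1, 3]; fall back to MAX_DEPTH if invalid."""
    raw = cfg.get("max_spawn_depth", MAX_DEPTH)
    try:
        depth = int(raw)
    except (TypeError, ValueError):
        logger.warning("invalid max_spawn_depth %r; using %d", raw, MAX_DEPTH)
        return MAX_DEPTH
    clamped = min(max(depth, MIN_SPAWN_DEPTH), MAX_SPAWN_DEPTH_CAP)
    if clamped != depth:
        logger.warning("max_spawn_depth %d out of range; clamped to %d", depth, clamped)
    return clamped


def get_orchestrator_enabled(cfg: dict) -> bool:
    """Coerce a YAML bool or string into a bool; default is True."""
    raw = cfg.get("orchestrator_enabled", True)
    if isinstance(raw, bool):
        return raw
    if isinstance(raw, str):
        # Assumed truthy spellings; the real coercion table may differ.
        return raw.strip().lower() in {"true", "1", "yes", "on"}
    return True  # assumption: anything else keeps the default
```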

Wires the 'role' param through schema -> delegate_task() -> dispatch -> _build_child_agent -> stashed on child. No behavior change yet: Commit 3 adds the toolset re-add + role-aware prompt. Commit 2 verifies the plumbing reaches the child and the schema description signals the feature to the parent LLM.

Changes:
- tools/delegate_tool.py:
  - Module-level _normalize_role(r) (near _clamp_concurrency), returns 'leaf' or 'orchestrator'; unknown strings warn and coerce to 'leaf'.
  - DELEGATE_TASK_SCHEMA: new 'role' property at top level AND per-task under tasks[].items. Top-level description text split into leaf vs orchestrator capability statements so the parent LLM discovers that role='orchestrator' unlocks nested delegation.
  - delegate_task(): accepts role=Optional[str]; normalises top_role; single-task dict at :738 now includes 'role' for batch/single uniformity; child-build loop resolves effective_role = normalise(t.get('role') or top_role) and forwards to _build_child_agent.
  - _build_child_agent(): accepts role='leaf' kwarg; stashes child._delegate_role for introspection (commit 3 will overwrite with effective_role post-degrade).
  - Registry handler lambda: forwards role=args.get('role') for the Atropos dispatch path (dead for run_agent.py which short-circuits to _dispatch_delegate_task).
- run_agent.py:_dispatch_delegate_task: forwards role through to tools.delegate_tool.delegate_task.
- tests/tools/test_delegate.py:TestOrchestratorRoleSchema (4 cases: default→leaf, explicit orchestrator stashed, nonsense→leaf+warning, schema shape assertions for top-level and per-task 'role' properties).
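The normalisation and the per-task override could look something like the sketch below. It is not the code from tools/delegate_tool.py: the helper names are illustrative, and the whitespace and case folding are assumptions. What the commit message does state is that unknown strings warn and coerce to 'leaf', and that a per-task role wins over the top-level one.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)

VALID_ROLES = ("leaf", "orchestrator")


def normalize_role(role: Optional[str]) -> str:
    """Return 'leaf' or 'orchestrator'; unknown values warn and coerce to 'leaf'."""
    if role is None:
        return "leaf"
    candidate = str(role).strip().lower()  # folding is an assumption of this sketch
    if candidate in VALID_ROLES:
        return candidate
    logger.warning("unknown delegate role %r; coercing to 'leaf'", role)
    return "leaf"


def effective_role_for_task(task: dict, top_role: Optional[str]) -> str:
    """Per-task role wins; otherwise fall back to the top-level role."""
    return normalize_role(task.get("role") or top_role)
```

For example, a batch call with a top-level role of "orchestrator" and one task that sets role "leaf" would yield a mix of orchestrator and leaf children.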

The behavior change. Orchestrator children (role='orchestrator', allowed by delegation.orchestrator_enabled and child_depth < max_spawn_depth) retain the 'delegation' toolset and receive a role-aware system prompt derived from OpenClaw's buildSubagentSystemPrompt canSpawn branch. Leaf children are unchanged from pre-M3 behavior.

Changes:
- tools/delegate_tool.py:
  - _build_child_agent: role resolution block at the top: computes child_depth, max_spawn, orchestrator_ok (kill switch AND depth), effective_role (single degrade point). Toolset re-add appends "delegation" when effective_role == 'orchestrator' (runs after the existing _strip_blocked_tools branches; unconditional on parent toolset membership since orchestrator capability is granted by role not inheritance; documented in test_intersection_preserves_delegation_bound). child._delegate_role now stashes effective_role (post-degrade).
  - _build_child_system_prompt: new role/max_spawn_depth/child_depth kwargs; leaf prompt unchanged; orchestrator appends a spawning block with WHEN/WHEN NOT to delegate guidance + literal depth note that branches between "children MUST be leaves" (at the floor) and "children can themselves be orchestrators" (below it). Per-call role model means "can be", not "will be"; orchestrators explicitly pass role='orchestrator' for nested delegation.
  - _EXCLUDED_TOOLSET_NAMES: comment explaining the "delegation" entry is an advertising exclusion, not a runtime block; the role-driven re-add in _build_child_agent overrides it.
- tests/tools/test_delegate.py: TestOrchestratorRoleBehavior (9 cases)
  - Role resolution: _keeps_delegation_at_depth_1, _blocked_at_max_spawn_depth, _enabled_false_forces_leaf
  - Prompt content: _leaf_does_not_mention_delegation, _orchestrator_mentions_delegation_capability, _at_depth_floor_says_children_are_leaves, _below_floor_allows_more_nesting
  - Batch + intersection: _batch_mode_per_task_role_override, _intersection_preserves_delegation_bound (documents design choice)
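The single degrade point can be sketched as below. The function names and the assumption that child_depth is the parent's depth plus one are illustrative; the condition itself (kill switch AND child_depth < max_spawn_depth) and the role-driven re-add of "delegation" are taken from the commit message.

```python
def resolve_effective_role(
    requested_role: str,
    parent_depth: int,
    max_spawn_depth: int,
    orchestrator_enabled: bool,
) -> str:
    """Degrade 'orchestrator' to 'leaf' when the kill switch or the depth cap says so."""
    child_depth = parent_depth + 1  # assumption: the top-level agent sits at depth 0
    orchestrator_ok = orchestrator_enabled and child_depth < max_spawn_depth
    if requested_role == "orchestrator" and orchestrator_ok:
        return "orchestrator"
    return "leaf"


def child_toolsets(stripped_toolsets: list, effective_role: str) -> list:
    """Re-add 'delegation' for orchestrators regardless of what the parent had."""
    toolsets = [t for t in stripped_toolsets if t != "delegation"]
    if effective_role == "orchestrator":
        toolsets.append("delegation")
    return toolsets
```

With the default max_spawn_depth of 2, a child of the top-level agent (depth 1) may be an orchestrator, while its own children (depth 2) are always leaves, which matches the parent → orchestrator → two leaves shape covered by the end-to-end test.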

Satisfies parent plan §7 item 3 acceptance: parent delegates to an orchestrator child, which delegates to two leaf grandchildren; results bubble up correctly. Mocking strategy (plan §3.6 G3 sketch): single run_agent.AIAgent patch with a side_effect factory that keys on the child's ephemeral_system_prompt — orchestrator prompts contain the string "Orchestrator Role" (see _build_child_system_prompt), leaves don't. The orchestrator mock's run_conversation recursively calls delegate_task with tasks=[{goal:...},{goal:...}] to spawn two leaves. This keeps the whole test in one patch context and avoids depth-indexed nesting patterns that are fragile. Also updates test_constants to cover the two new config getters (_get_max_spawn_depth, _get_orchestrator_enabled) and the two new bound constants (_MIN_SPAWN_DEPTH=1, _MAX_SPAWN_DEPTH_CAP=3), as called for by plan §4 Commit 4. Assertions: MockAgent called exactly 3 times (1 orchestrator + 2 leaves); orchestrator got 'delegation' in its toolset and an orchestrator prompt; both grandchildren did NOT get 'delegation' and received leaf prompts.

- cli-config.yaml.example: two commented-out lines in the delegation block advertising max_spawn_depth and orchestrator_enabled.
- website/docs/user-guide/features/delegation.md:
  - Replace "Depth Limit" section with "Depth Limit and Nested Orchestration": role='leaf' vs 'orchestrator' usage example, max_spawn_depth bounds, orchestrator_enabled kill switch, and the 125-leaf cost warning for max_spawn_depth=3.
  - "Key Properties" bullets updated to reflect opt-in nested delegation and to split leaf/orchestrator capability statements.
  - Configuration YAML example: two commented-out lines for the new keys, matching the cli-config.yaml.example style.

Pure text; independently revertable.
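The 125-leaf figure is consistent with a simple worst-case fan-out calculation, sketched here under the assumption that every agent at every level fills one batch with the default 5 concurrent children. The function name is hypothetical, and an orchestrator that issues several batches could exceed this count.

```python
def worst_case_leaves(max_concurrent_children: int, max_spawn_depth: int) -> int:
    """Leaves at the deepest level when every level fans out fully in one batch."""
    return max_concurrent_children ** max_spawn_depth


for depth in (1, 2, 3):
    print(depth, worst_case_leaves(5, depth))
```

At the default depth of 2 the same assumption gives 25 leaves, so raising max_spawn_depth to 3 multiplies the worst-case cost by five.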

github-actions · 2026-04-15T20:34:04Z

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: Outbound network calls (POST/PUT)

Outbound POST/PUT requests in new code could be data exfiltration. Verify the destination URLs are legitimate.

Matches (first 10):

2577:+            resp = httpx.post(

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

…child callback

Addresses two bugs surfaced by codex in the M3 PR review (.multi-agent/review-20260415-173416/reviews/codex.md):

1. HIGH: TASK_PROGRESS fall-through to TASK_TOOL_STARTED rendering. _LEGACY_EVENT_MAP maps "subagent_progress" to DelegateEvent.TASK_PROGRESS, but _build_child_progress_callback had no TASK_PROGRESS branch. Any TASK_PROGRESS event fell through to the TASK_TOOL_STARTED display/batch block, which treated the pre-batched summary string (in the tool_name positional slot) as if it were a tool name, rendering '├─ ⚡ 🔀 [1] terminal, file' and re-batching accumulated emoji prefixes on each upward hop. This path is newly reachable in M3: nested orchestrators relay subagent_progress from grandchildren upward via this callback. Before M3 the toolset strip blocked the nesting that produces this traffic. Fix: explicit TASK_PROGRESS branch. Renders with a distinct 🔀 prefix (no get_tool_emoji lookup) and relays upward as-is without re-batching (the payload is already a batched summary).

2. MEDIUM: DelegateEvent enum values silently dropped. The callback only did _LEGACY_EVENT_MAP.get(event_type), so cb(DelegateEvent.TASK_THINKING, ...) or cb('delegate.task_thinking', ...) produced no output. The enum was added in M0 as the "new normalized event type" but no call path accepted it. Fix: normalize enum instances directly, then fall back to the legacy map, then to DelegateEvent(str) construction for new-style "delegate.*" strings.

Tests added (tests/tools/test_delegate.py, in TestDelegateEventEnum):
- test_progress_callback_accepts_enum_value_directly
- test_progress_callback_accepts_new_style_string
- test_progress_callback_task_progress_not_misrendered
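The three-step normalisation order from fix 2 can be sketched as follows. The enum here is a cut-down stand-in: only the 'delegate.task_thinking' value and the "_thinking", "tool.started", and "subagent_progress" legacy names appear in the PR text; the other enum values, the str mixin, and returning None for unknown events are assumptions for illustration.

```python
from enum import Enum
from typing import Optional, Union


class DelegateEvent(str, Enum):
    # Only TASK_THINKING's value is quoted in the PR; the others are assumed.
    TASK_THINKING = "delegate.task_thinking"
    TASK_TOOL_STARTED = "delegate.task_tool_started"
    TASK_PROGRESS = "delegate.task_progress"


LEGACY_EVENT_MAP = {
    "_thinking": DelegateEvent.TASK_THINKING,
    "tool.started": DelegateEvent.TASK_TOOL_STARTED,
    "subagent_progress": DelegateEvent.TASK_PROGRESS,
}


def normalize_event(event_type: Union[str, DelegateEvent]) -> Optional[DelegateEvent]:
    """Enum instance first, then the legacy map, then new-style string construction."""
    if isinstance(event_type, DelegateEvent):
        return event_type
    mapped = LEGACY_EVENT_MAP.get(event_type)
    if mapped is not None:
        return mapped
    try:
        return DelegateEvent(event_type)  # accepts "delegate.*" value strings
    except ValueError:
        return None  # unknown event: the caller decides whether to drop or log it
```

Once every input has been normalised to an enum member, the callback can branch on TASK_PROGRESS explicitly instead of letting it fall through to the tool-started renderer.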

Addresses the hermes reviewer finding in the M3 PR review (.multi-agent/review-20260415-173416/reviews/hermes.md §warnings #3): the Constraints section's "No nesting" bullet was stale against M3. Replaces with a "Nested delegation is opt-in" bullet that mirrors the wording already landed in website/docs/user-guide/features/delegation.md in the M3 docs commit (55aecde). Covers role='leaf' vs 'orchestrator', the max_spawn_depth bound, and the orchestrator_enabled kill switch — matching the feature doc's leaf-vs-orchestrator capability distinction.

Removes internal milestone references (M0, M0.5, M3) from code, tests, and docs in the delegation PR surface. Milestone tags were useful for tracking the rollout but carry no meaning for upstream readers or future maintainers — the feature and its rationale should stand on its own. Mechanical substitutions only — no behavior change, no docstring rewrites, no test renames. 102 tests still pass.

Two small follow-ups after the orchestrator-role PR review: 1. hermes_cli/config.py:DEFAULT_CONFIG["delegation"] now seeds max_concurrent_children=5 alongside max_spawn_depth and orchestrator_enabled. cli-config.yaml.example, docs, and the schema all advertise this key; the canonical default dict was the only surface that still omitted it. 2. website/docs/user-guide/features/delegation.md's "always blocked for subagents" bullet list was stale against the new orchestrator role: delegation is retained for role="orchestrator" children (see _build_child_agent re-add). Softened the heading from "always blocked" to "blocked", and rewrote the delegation bullet to point at the Depth Limit and Nested Orchestration section. Tests still pass (123).

…lls/

- Rename skill to touchdesigner-mcp (matches blender-mcp convention)
- Move from skills/creative/ to optional-skills/creative/
- Fix duplicate pitfall numbering (#3 appeared twice)
- Update SKILL.md cross-references for renumbered pitfalls
- Update setup.sh path for new directory location

…ousResearch#13148) * feat(security): URL query param + userinfo + form body redaction Port from nearai/ironclaw#2529. Hermes already has broad value-shape coverage in agent/redact.py (30+ vendor prefixes, JWTs, DB connstrs, etc.) but missed three key-name-based patterns that catch opaque tokens without recognizable prefixes: 1. URL query params - OAuth callback codes (?code=...), access_token, refresh_token, signature, etc. These are opaque and won't match any prefix regex. Now redacted by parameter NAME. 2. URL userinfo (https://user:pass@host) - for non-DB schemes. DB schemes were already handled by _DB_CONNSTR_RE. 3. Form-urlencoded body (k=v pairs joined by ampersands) - conservative, only triggers on clean pure-form inputs with no other text. Sensitive key allowlist matches ironclaw's (exact case-insensitive, NOT substring - so token_count and session_id pass through). Tests: +20 new test cases across 3 test classes. All 75 redact tests pass; gateway/test_pii_redaction and tools/test_browser_secret_exfil also green. Known pre-existing limitation: _ENV_ASSIGN_RE greedy match swallows whole all-caps ENV-style names + trailing text when followed by another assignment. Left untouched here (out of scope); URL query redaction handles the lowercase case. * feat: replace kimi-k2.5 with kimi-k2.6 on OpenRouter and Nous Portal Update model catalogs for OpenRouter (fallback snapshot), Nous Portal, and NVIDIA NIM to reference moonshotai/kimi-k2.6. Add kimi-k2.6 to the fixed-temperature frozenset in auxiliary_client.py so the 0.6 contract is enforced on aggregator routings. Native Moonshot provider lists (kimi-coding, kimi-coding-cn, moonshot, opencode-zen, opencode-go) are unchanged — those use Moonshot's own model IDs which are unaffected.

…NousResearch#13021)

Replaces the serial for-loop in tick() with ThreadPoolExecutor so all jobs due in a single tick run concurrently. A slow job no longer blocks others from executing, fixing silent job skipping (issue NousResearch#9086).

Thread safety:
- Session/delivery env vars migrated from os.environ to ContextVars (gateway/session_context.py) so parallel jobs can't clobber each other's delivery targets. Each thread gets its own copied context.
- jobs.json read-modify-write cycles (advance_next_run, mark_job_run) protected by threading.Lock to prevent concurrent save clobber.
- send_message_tool reads delivery vars via get_session_env() for ContextVar-aware resolution with os.environ fallback.

Configuration:
- cron.max_parallel_jobs in config.yaml (null = unbounded, 1 = serial)
- HERMES_CRON_MAX_PARALLEL env var override

Based on PR NousResearch#9169 by @VenomMoth1. Fixes NousResearch#9086
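The per-thread context copy described above can be illustrated with the standard library alone. This is a sketch, not the scheduler's code: the variable name delivery_target and the functions run_job and tick are stand-ins, and passing None as max_workers uses the executor's default pool size rather than a truly unbounded one.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional

# Hypothetical stand-in for a delivery variable that used to live in os.environ.
delivery_target: contextvars.ContextVar = contextvars.ContextVar(
    "delivery_target", default=None
)


def run_job(target: str) -> str:
    """Set the delivery target for this job and read it back."""
    delivery_target.set(target)
    return delivery_target.get()


def tick(targets: List[str], max_parallel: Optional[int] = None) -> List[str]:
    """Run every due job concurrently, each inside its own copied context."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = [
            # copy_context() is evaluated per job, so a set() in one job
            # never leaks into another job or into the caller's context.
            pool.submit(contextvars.copy_context().run, run_job, t)
            for t in targets
        ]
        return [f.result() for f in futures]
```

Because each job runs in a copy, the value seen by the caller stays at its default after tick() returns, which is the isolation that a shared os.environ cannot provide.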

@kshitijk4poor

Extract 12 Codex Responses API format-conversion and normalization functions from run_agent.py into agent/codex_responses_adapter.py, following the existing pattern of anthropic_adapter.py and bedrock_adapter.py. run_agent.py: 12,550 → 11,865 lines (-685 lines) Functions moved: - _chat_content_to_responses_parts (multimodal content conversion) - _summarize_user_message_for_log (multimodal message logging) - _deterministic_call_id (cache-safe fallback IDs) - _split_responses_tool_id (composite ID splitting) - _derive_responses_function_call_id (fc_ prefix conversion) - _responses_tools (schema format conversion) - _chat_messages_to_responses_input (message format conversion) - _preflight_codex_input_items (input validation) - _preflight_codex_api_kwargs (API kwargs validation) - _extract_responses_message_text (response text extraction) - _extract_responses_reasoning_text (reasoning extraction) - _normalize_codex_response (full response normalization) All functions are stateless module-level functions. AIAgent methods remain as thin one-line wrappers. Both module-level helpers are re-exported from run_agent.py for backward compatibility with existing test imports. Includes multimodal inline image support (PR NousResearch#12969) that the original PR was missing. Based on PR NousResearch#12975 by @kshitijk4poor.

@MassiveMassimo

…in/QQ adapters Add dm_policy and group_policy to the WhatsApp adapter, bringing parity with WeCom/Weixin/QQ. Allows independent control of DM and group access: disable DMs entirely, allowlist specific senders/groups, or keep open. - dm_policy: open (default) | allowlist | disabled - group_policy: open (default) | allowlist | disabled - Config bridging for YAML → env vars - 22 tests covering all policy combinations Backward compatible — defaults preserve existing behavior. Cherry-picked from PR NousResearch#11597 by @MassiveMassimo. Dropped the run.py group auth bypass (would have skipped user auth for ALL platforms, not just WhatsApp).

…iders (NousResearch#13152) Add kimi-k2.6 as the top model in kimi-coding, kimi-coding-cn, and moonshot static provider lists (models.py, setup.py, main.py). kimi-k2.5 retained alongside it.

…refix

…providers Section 3 (user-defined endpoints) added the plain ep_name to seen_slugs but not the custom:-prefixed slug. Section 4 generates custom:<name> via custom_provider_slug() and checks seen_slugs — since the prefixed slug was missing, the same provider appeared twice in /model. Register custom_provider_slug(display_name).lower() in seen_slugs after Section 3 emits a provider, so Section 4's dedup correctly suppresses the duplicate. Closes NousResearch#12293. Co-authored-by: bennytimz <bennytimz@users.noreply.github.com>

@kshitijk4poor

…search#13157) Kimi's gateway selects the correct temperature server-side based on the active mode (thinking -> 1.0, non-thinking -> 0.6). Sending any temperature value — even the previously "correct" one — conflicts with gateway-managed defaults. Replaces the old approach of forcing specific temperature values (0.6 for non-thinking, 1.0 for thinking) with an OMIT_TEMPERATURE sentinel that tells all call sites to strip the temperature key from API kwargs entirely. Changes: - agent/auxiliary_client.py: OMIT_TEMPERATURE sentinel, _is_kimi_model() prefix check (covers all kimi-* models), _fixed_temperature_for_model() returns sentinel for kimi models. _build_call_kwargs() strips temp. - run_agent.py: _build_api_kwargs, flush_memories, and summary generation paths all handle the sentinel by popping/omitting temperature. - trajectory_compressor.py: _effective_temperature_for_model returns None for kimi (sentinel mapped), direct client calls use kwargs dict to conditionally include temperature. - mini_swe_runner.py: same sentinel handling via wrapper function. - 6 test files updated: all 'forces temperature X' assertions replaced with 'temperature not in kwargs' assertions. Net: -76 lines (171 added, 247 removed). Inspired by PR NousResearch#13137 (@kshitijk4poor).
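The sentinel approach can be sketched like this. OMIT_TEMPERATURE and the kimi- prefix check are named in the commit message; the function names, the stripping of a provider prefix such as "moonshotai/", and the shape of the kwargs dict are assumptions for illustration.

```python
from typing import Any, Dict, Optional

# Unique sentinel meaning "do not send a temperature key at all".
OMIT_TEMPERATURE = object()


def is_kimi_model(model: str) -> bool:
    """Prefix check on the model name, ignoring an assumed 'provider/' prefix."""
    name = model.rsplit("/", 1)[-1].lower()
    return name.startswith("kimi-")


def fixed_temperature_for_model(model: str) -> Any:
    """Return the sentinel for kimi models, None when there is no fixed contract."""
    return OMIT_TEMPERATURE if is_kimi_model(model) else None


def build_call_kwargs(model: str, temperature: Optional[float] = None) -> Dict[str, Any]:
    """Build API kwargs, leaving temperature out when the sentinel applies."""
    kwargs: Dict[str, Any] = {"model": model}
    fixed = fixed_temperature_for_model(model)
    if fixed is OMIT_TEMPERATURE:
        return kwargs  # the gateway picks the temperature server-side
    if temperature is not None:
        kwargs["temperature"] = temperature
    return kwargs
```

A dedicated sentinel object is used rather than None because None already means "no fixed temperature for this model", and the two cases need different handling at each call site.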

…abled (NousResearch#13162) When createForumTopic fails with 'not a forum' in a private chat, the error now tells the user exactly what to do: enable Topics in the DM chat settings from the Telegram app. Also adds a Prerequisites callout to the docs explaining this client-side requirement before the config section.

…tree isolation Adds a _resolve_path() helper that reads TERMINAL_CWD and uses it as the base for relative path resolution. Applied to _check_sensitive_path, read_file_tool, _update_read_timestamp, and _check_file_staleness. Absolute paths and non-worktree sessions (no TERMINAL_CWD) are unaffected — falls back to os.getcwd(). Fixes NousResearch#12689.
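A minimal sketch of that resolution rule, assuming the helper takes and returns a path and does nothing else (the real _resolve_path may also normalise or expand the path):

```python
import os
from pathlib import Path


def resolve_path(path: str) -> Path:
    """Resolve relative paths against TERMINAL_CWD, else the process cwd."""
    p = Path(path)
    if p.is_absolute():
        return p  # absolute paths are unaffected
    base = os.environ.get("TERMINAL_CWD") or os.getcwd()
    return Path(base) / p
```

With TERMINAL_CWD set to a worktree directory, a relative path such as "src/app.py" resolves inside that worktree instead of the directory the agent process happened to start in.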

…search#13169)

# Conflicts:
#	tests/agent/test_subagent_progress.py
#	tools/delegate_tool.py

github-actions · 2026-04-20T20:36:20Z

🚨 CRITICAL Supply Chain Risk Detected

This PR contains a pattern that has been used in real supply chain attacks. A maintainer must review the flagged code carefully before merging.

🚨 CRITICAL: Install-hook file added or modified

These files can execute code during package installation or interpreter startup.

Files:

hermes_cli/setup.py

Scanner only fires on high-signal indicators: .pth files, base64+exec/eval combos, subprocess with encoded commands, or install-hook files. Low-signal warnings were removed intentionally — if you're seeing this comment, the finding is worth inspecting.

Three independent reviews surfaced a handful of real bugs. Fixing all of them here:

* **SIGTERM orphans hook subprocesses (codex #1).** The CLI only installed a SIGINT handler; SIGTERM (from ``kill``, ``timeout``, systemd stop, CI harnesses) skips atexit entirely and leaves every in-flight hook subprocess running as an orphan owned by init. Adds ``_async_pool_sigterm_handler`` which terminates tracked subprocess groups inline, then routes to ``sys.exit(128 + SIGTERM)``. Inline termination is required because ``ThreadPoolExecutor`` uses non-daemon threads: Python waits for every worker to return before running atexit, and workers block inside ``proc.communicate(timeout=spec.timeout)`` until the subprocess dies. Renamed ``_maybe_install_sigint_handler`` → ``_maybe_install_signal_handlers`` (with back-compat alias). Verified: ``kill -TERM`` on a hermes CLI running a 4 s ``sleep`` hook now exits in ~0.7 s with no orphan, was 4 s + orphan.
* **Subprocess groups for reliable termination.** Hooks are now spawned with ``start_new_session=True`` so the subprocess is its own PGID leader. Shutdown / SIGINT / SIGTERM paths call ``os.killpg`` on the group instead of ``proc.terminate()``; without this, a bash script's orphaned ``sleep`` child kept the parent stdout FD open and blocked ``proc.communicate`` for the full sleep duration. ``_terminate_group`` / ``_kill_group`` helpers fall back to plain ``terminate`` / ``kill`` on edge cases where ``getpgid`` fails (already-exited proc, non-POSIX).
* **``hermes hooks test --no-wait`` blocks for full hook runtime (codex #2).** The flag advertised fire-and-forget but the CLI's ``ThreadPoolExecutor`` atexit ``pool.shutdown(wait=True)`` joined the worker anyway, which in turn waited for the subprocess. ``_cmd_test`` now polls briefly for ``_live_procs`` to fill (so the subprocess definitely spawned), then calls ``os._exit(0)``, skipping atexit entirely. The subprocess keeps running under init because of ``start_new_session=True``. Verified: CLI exit dropped from 2.3 s to 76 ms for a 2-second hook, and the hook still writes its audit log 3 s later after the CLI is gone.
* **Stale ``_child_role_for_batch`` test (claude #1 / hermes #2).** The test from commit 76d3ffd4 asserted the *old* helper field name; no code path sets it post-refactor (455c136f), so the test passed trivially without verifying anything. Fixed to assert ``_child_role`` (the real field) is stripped, and added an explanatory message so a future failure is easier to diagnose. Module-header docstring updated too.
* **``submit()`` RuntimeError branch: stale-semaphore parity fix (claude #3).** Same pattern I already fixed in ``_on_async_future_done``, missed here: a concurrent ``_reset_async_pool`` between ``acquire`` and ``release`` would cause ``_async_sem_get()`` to lazy-create a fresh sem and over-release on it. Snapshot ``_async_sem_inst`` + swallow ``ValueError`` like the symmetric path.
* **Shutdown race: proc registered after the snapshot (claude #4 / hermes #1).** Worker that got between ``subprocess.Popen()`` and ``_register_live_proc(proc)`` would miss the shutdown-sweep snapshot and block for the full ``spec.timeout``. After registering, the worker now checks ``_async_shutting_down`` and self-terminates its subprocess group.
* **WARN log noise on SIGTERM'd children (claude #5).** Shutdown-induced exits (rc = -15 / -9) no longer spam a per-proc ``WARNING``: demoted to ``DEBUG`` when ``_async_shutting_down`` is set. Both the atexit path and the signal handlers now set the flag before terminating, so a Ctrl-C or a ``kill -TERM`` with 10 running hooks emits zero warn lines instead of 10.

Still outstanding (documented trade-offs, not fixed here):

* Gateway shutdown blocks the event loop for up to ``grace_seconds`` (claude #2). Acknowledged as a follow-up candidate via ``loop.run_in_executor``.
* ``_maybe_install_signal_handlers`` is still leading-underscore (claude #6). Cosmetic; kept consistent with the rest of the module's private-by-convention API.

All 101 hook tests still pass.

pefontana added 17 commits April 14, 2026 15:39

refactor(delegate): extract _clamp_concurrency, fix stale comment

51453fa

Extract duplicated cap-check logic from _get_max_concurrent_children into _clamp_concurrency helper. Fix stale "default 3" comment in the schema. Simplify test_acp_args_forwarded assertion.
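As a rough illustration of what a clamp helper like this might look like, here is a sketch under stated assumptions. Only the default of 5, the hard cap of 8, and the warning on over-cap values come from this PR; the function signature, the constant names, the fallback for unparsable values, and the floor of 1 are hypothetical.

```python
import logging

logger = logging.getLogger("delegate_sketch")

DEFAULT_MAX_CONCURRENT_CHILDREN = 5  # new default described in this PR (was 3)
MAX_CONCURRENT_CHILDREN_CAP = 8      # absolute cap described in this PR


def clamp_concurrency(value,
                      default=DEFAULT_MAX_CONCURRENT_CHILDREN,
                      cap=MAX_CONCURRENT_CHILDREN_CAP):
    """Return a usable concurrency limit for a raw config value."""
    try:
        requested = int(value)
    except (TypeError, ValueError):
        # Assumption: a missing or unparsable value falls back to the default.
        return default
    if requested < 1:
        # Assumption: treat zero or negative values as a floor of 1.
        return 1
    if requested > cap:
        logger.warning(
            "max_concurrent_children=%d exceeds cap %d; clamping", requested, cap
        )
        return cap
    return requested
```

For example, ``clamp_concurrency(12)`` returns 8 and logs a warning, while ``clamp_concurrency(None)`` returns the default of 5.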

docs(delegate): fix remaining stale "3 concurrent" references

220cb71

Grep found 4 more places still saying "up to 3" — schema description, tips.py, delegation-patterns.md, overview.md. Updated to match new default of 5 (max 8).

Merge branch 'NousResearch:main' into delegate-dispatch-cleanup

61dbd60

pefontana added 2 commits April 15, 2026 17:48

pefontana changed the title ~~Orchestrator role~~ Orchestrator Subagents Apr 16, 2026

pefontana changed the title ~~Orchestrator Subagents~~ feat: Orchestrator Subagents Apr 16, 2026

teknium1 added 2 commits April 20, 2026 11:49

kshitijk4poor and others added 12 commits April 20, 2026 11:53

chore: add MassiveMassimo to AUTHOR_MAP

f01e651

feat: add kimi-k2.6 to kimi-coding, kimi-coding-cn, and moonshot providers (NousResearch#13152)

6d58ec7

Add kimi-k2.6 as the top model in kimi-coding, kimi-coding-cn, and moonshot static provider lists (models.py, setup.py, main.py). kimi-k2.5 retained alongside it.

fix(tools): reap orphaned cloud browser daemons with hermes session prefix

89070b8

test: add _resolve_path tests + AUTHOR_MAP entry for aniruddhaadak80

5a2118a

feat: add moonshotai/Kimi-K2.6 to HuggingFace provider models (NousResearch#13169)

cc1afef

Merge remote-tracking branch 'origin/main' into orchestrator-role

0c23e11

# Conflicts:
#	tests/agent/test_subagent_progress.py
#	tools/delegate_tool.py

feat: Orchestrator Subagents #3

pefontana wants to merge 35 commits into main from orchestrator-role

6 participants

pefontana commented Apr 15, 2026 • edited

Summary

Delegation plumbing cleanup

Remove dead default_toolsets config

Orchestrator role

Follow-ups from review

Type of Change

How to Test

Backward Compatibility

Checklist

github-actions Bot commented Apr 15, 2026

⚠️ Supply Chain Risk Detected

⚠️ WARNING: Outbound network calls (POST/PUT)

github-actions Bot commented Apr 15, 2026

⚠️ Supply Chain Risk Detected

⚠️ WARNING: Outbound network calls (POST/PUT)

github-actions Bot commented Apr 15, 2026

⚠️ Supply Chain Risk Detected

⚠️ WARNING: Outbound network calls (POST/PUT)

github-actions Bot commented Apr 16, 2026

⚠️ Supply Chain Risk Detected

⚠️ WARNING: Outbound network calls (POST/PUT)

github-actions Bot commented Apr 20, 2026

🚨 CRITICAL Supply Chain Risk Detected

🚨 CRITICAL: Install-hook file added or modified

pefontana commented Apr 15, 2026 • edited

Remove dead `default_toolsets` config