Shell Hooks by pefontana · Pull Request #4 · pefontana/hermes-agent

pefontana · 2026-04-17T20:48:01Z

Shell Hooks let you intercept agent lifecycle events with a shell script written in whatever language you want — bash, python, go, anything that reads stdin and writes stdout. Drop a script in ~/.hermes/agent-hooks/, list it in cli-config.yaml, and it fires on every matching event with a JSON payload on stdin.

Three things you can do now that were awkward or impossible before:

Block dangerous tool calls from outside Python. Return {"decision": "block", "reason": "..."} from a pre_tool_call hook and the agent refuses with your message. Matcher-regex-scoped, so you can gate just terminal or web_search without touching anything else.
Inject dynamic context into every LLM turn. A pre_llm_call hook returning {"context": "..."} prepends its output to the next user message. Great for current time, git status, pwd, MOTD, or anything else you want the model to know without typing it.
Observe delegation. New subagent_stop event fires once per delegate_task child with parent_session_id, child_role, child_summary, child_status, duration_ms. Log, audit, forward to a dashboard — without patching Hermes.

Example — block `rm -rf` in bash

# ~/.hermes/agent-hooks/block-rm-rf.sh
#!/usr/bin/env bash
payload=$(cat)
if echo "$payload" | grep -q "rm -rf"; then
  printf '{"decision": "block", "reason": "rm -rf is blocked by policy"}\n'
else
  printf '{}\n'
fi

# ~/.hermes/config.yaml
hooks:
  pre_tool_call:
    - matcher: "terminal"
      command: "~/.hermes/agent-hooks/block-rm-rf.sh"
      timeout: 5
  pre_llm_call:
    - command: "python3 ~/.hermes/agent-hooks/inject-context.py"
  subagent_stop:
    - command: "~/.hermes/agent-hooks/log-orchestration.sh"

$ hermes --accept-hooks chat -q "clean the repo with rm -rf"
Agent refused: rm -rf is blocked by policy

Supported events

pre_tool_call (can veto), post_tool_call, pre_llm_call (can inject context), post_llm_call, pre_api_request, post_api_request, on_session_start, on_session_end, on_session_finalize, on_session_reset, subagent_stop, transform_terminal_output, transform_tool_result — same set as Python plugin hooks, registered through the same dispatcher.

Security model

First-use consent prompt on the TTY; decisions persist to ~/.hermes/shell-hooks-allowlist.json at mode 0o600. Non-TTY runs (gateway, cron, CI) need one of three explicit opt-ins:

--accept-hooks on the CLI (hermes --accept-hooks gateway run, hermes gateway run --accept-hooks, hermes --accept-hooks cron tick, ...).
HERMES_ACCEPT_HOOKS=1 environment variable.
hooks_auto_accept: true in cli-config.yaml.

Subprocess execution is shell=False + shlex.split — no shell-injection vector.
Allowlist writes are atomic (tempfile.mkstemp + os.replace) and cross-process-safe (fcntl.flock on a sibling .lock file).
Timeouts default to 60s, capped at 300s. Subprocess failures (missing binary, timeout, bad JSON) log a warning but never crash the agent loop.
Script mtime is captured at approval time; hermes hooks doctor flags drift so you can spot post-approval edits.
Registration is gated to subcommands that actually run the agent (chat, gateway run, mcp serve, acp, cron run, cron tick, rl). Management commands like hermes gateway status, cron list, or mcp add never trigger plugin discovery or consent prompts.

`hermes hooks` CLI

Command	Purpose
`hermes hooks list`	Configured hooks with matcher, timeout, and consent status
`hermes hooks test <event> [--for-tool X] [--payload-file F]`	Fire every matching hook against a synthetic payload and print the parsed response
`hermes hooks doctor`	For each configured hook: check exec bit, allowlist, mtime drift, JSON output validity, execution time
`hermes hooks revoke <command>`	Drop every allowlist entry matching `<command>`

Wire protocol (for hook authors)

stdin (JSON, same top-level shape for every event):

{
  "hook_event_name": "pre_tool_call",
  "tool_name":       "terminal",
  "tool_input":      {"command": "rm -rf /"},
  "session_id":      "sess_abc123",
  "cwd":             "/home/user/project",
  "extra":           {...}
}

stdout (JSON, optional — anything else is ignored):

Block pre_tool_call: {"decision": "block", "reason": "..."} (Claude-Code style) or {"action": "block", "message": "..."} (Hermes canonical). Both shapes accepted and normalised.
Inject context on pre_llm_call: {"context": "..."}.
Silent no-op: empty.

Full spec + worked examples in website/docs/user-guide/features/hooks.md.

Interpreter-prefixed commands

python3 ~/hook.py, bash ~/hook.sh, and /usr/bin/env python3 ~/hook.py are all first-class — the bridge resolves the actual script path, so hermes hooks doctor reports the right executability and mtime drift instead of checking python3 on $PATH.

Type of Change

✨ New feature (non-breaking change that adds functionality)
📝 Documentation update
✅ Tests

How to Test

pytest tests/agent/test_shell_hooks.py tests/agent/test_shell_hooks_consent.py tests/agent/test_subagent_stop_hook.py tests/hermes_cli/test_hooks_cli.py -v — 80+ passing.

Drop the block-rm-rf.sh + config snippet above into ~/.hermes/, then:

$ hermes --accept-hooks hooks doctor
All shell hooks look healthy.
$ hermes --accept-hooks chat -q "clean temp with rm -rf /tmp/x"

Agent should surface "rm -rf is blocked by policy" and refuse.

hermes hooks list / hermes hooks revoke ~/.hermes/agent-hooks/block-rm-rf.sh for the round-trip.

Backward Compatibility

Nothing registers unless the user adds a hooks: block to cli-config.yaml — existing installs are unchanged.
Existing Python plugin hooks keep working; shell hooks are additive callbacks on the same invoke_hook() dispatcher, with Python plugins registered first so their block decisions win ties.
subagent_stop is a net-new entry in VALID_HOOKS; nothing else changed.

Checklist

Tests added (80+ passing)
Docs updated (website/docs/user-guide/features/hooks.md)
cli-config.yaml.example updated with a commented hooks: example
Conventional Commits
No unrelated changes (all commits are shell-hook scoped)
Tested on macOS 15 (Darwin 25.4.0)

New event fired once per child after delegate_task completes. This commit reserves the event name in the dispatcher vocabulary; the actual firing site in delegate_tool lands separately.

After every child agent completes, invoke_hook("subagent_stop", ...) fires once with parent_session_id, child_role, child_summary, child_status, and duration_ms. Firing happens on the parent thread after as_completed() drains the futures, not on the worker threads that ran the children. Hook callbacks never see concurrent invocation. child_role reads child._delegate_role via a defensive getattr so this works without depending on the orchestrator-role work. The value is stashed on each entry dict before child.close() runs in the finally block, then stripped again before delegate_task serialises its result to JSON.

Shell-script hooks declared in the hooks: block of cli-config.yaml. Each entry becomes a plugin-hook callback via agent.shell_hooks.register_from_config(), so every existing invoke_hook() site dispatches to the configured script with no additional wiring on the caller side. Wire protocol: JSON payload on stdin, optional JSON response on stdout. For pre_tool_call the Claude-Code-style {"decision": "block", "reason": str} payload is translated to the canonical Hermes {"action": "block", "message": str} shape so that get_pre_tool_call_block_message() surfaces the block. This is the one correctness invariant the test suite gates on. First-use consent prompts on a TTY and persists approvals to ~/.hermes/shell-hooks-allowlist.json with the script mtime captured at approval time (used later by `hermes hooks doctor` for drift detection). Non-TTY runs need --accept-hooks, HERMES_ACCEPT_HOOKS=1, or hooks_auto_accept: true. Subprocess execution uses shlex.split + shell=False, default 60s timeout, capped at 300s. Registration is idempotent per (event, command) pair so CLI and gateway startup can both call register_from_config() in the same process.

hermes_cli/main.py calls register_from_config() at startup, gated to commands that actually run the agent (chat, gateway, acp, mcp, cron, rl). Introspection-only commands skip registration so `hermes hooks list` never triggers a consent prompt for a hook the user is still inspecting. gateway/run.py calls register_from_config() right before self.hooks.discover_and_load() at the existing startup point. Adds --accept-hooks on the top-level parser and the chat subparser, following the --worktree pattern. Adds the hermes hooks subcommand group: - list configured hooks with matcher, timeout, consent - test <event> fire matching hooks against a synthetic payload - revoke <cmd> drop allowlist entries - doctor check exec bit, allowlist, mtime drift, JSON output Also adds a commented hooks: example block to cli-config.yaml.example.

New Shell Hooks section covers the concept, a three-way comparison table (shell vs plugin vs gateway), the JSON wire protocol with the Claude-Code-to-canonical block translation, four worked examples (auto-format, block-rm-rf, inject-cwd-context, log-orchestration), the consent model, and the hermes hooks CLI. New subagent_stop entry in the plugin hooks reference with the callback signature, payload fields, and a logging example. Also reconciles the two places that claimed pre_tool_call return values are always ignored — get_pre_tool_call_block_message actually acts on {"action": "block", "message": str}. Intro paragraph, quick-reference table, and pre_tool_call detail section updated.

github-actions · 2026-04-17T20:48:13Z

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: Dependency manifest files modified

Changes to dependency files can introduce new packages or change version pins. Verify all dependency changes are intentional and from trusted sources.

Files:

pyproject.toml

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

Unify the subprocess invocation between the live callback path and the `hermes hooks test` diagnostic helper. Both now go through a single _spawn() function that returns a consistent result dict (returncode, stdout, stderr, timed_out, elapsed_seconds, error), cutting the two copies of timeout / FileNotFoundError / PermissionError handling. Drop _safe_extras and _safe_cwd. The outer json.dumps(default=str) call already renders every unserialisable value as its str() form, so the per-value try/except dance in _safe_extras was pure overhead. Replace the ad-hoc _nearest_event/_similarity pair with difflib.get_close_matches — same behaviour in one stdlib call instead of 40 lines. Trim _record_approval: reuse script_mtime_iso() for the mtime lookup and extract _utc_now_iso() so both the approved_at and mtime fields share a single formatter. Net: shell_hooks.py drops from 807 to 718 lines with the same test surface green (72 tests).

github-actions · 2026-04-17T20:57:25Z

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: Dependency manifest files modified

Changes to dependency files can introduce new packages or change version pins. Verify all dependency changes are intentional and from trusted sources.

Files:

pyproject.toml

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

…tomic allowlist Five issues surfaced by reviewers of PR #4 (codex, claude): Shell-hook precedence in the gateway startup path was reversed. CLI called discover_plugins() before register_from_config(); gateway relied on the discover_plugins() side-effect inside model_tools.py, but gateway lazy-imports run_agent per-request so that side-effect never fires at startup. Add an explicit discover_plugins() call before register_from_config() in gateway/run.py. Shell-hook dedupe was keyed on (event, command) alone, so reusing the same script under one event with two matchers silently dropped the second entry. Include matcher in the identity triple. Regression test for the same-command-different-matcher case. `hermes hooks test` and `hermes hooks doctor` built stdin directly from _DEFAULT_PAYLOADS, bypassing _serialize_payload. The synthetic shape was missing cwd and session_id top-level keys, and buried parent_session_id under extra instead of lifting it to session_id like production does. A script validated via `hermes hooks test` could silently break in production. Rewrite _DEFAULT_PAYLOADS as kwargs dicts and route run_once through _serialize_payload so the two paths produce identical JSON. Regression test asserts the top-level key set. discover_plugins() at CLI startup was running for every subcommand, not just agent-running ones. Pre-PR behaviour was lazy discovery via model_tools import — commands like hermes --help / doctor / tools / hooks list never paid the cost and never triggered plugin register() side effects. Move the discover_plugins() call inside the _AGENT_COMMANDS gate so that stays true. Allowlist writes were a naïve write_text, risking truncated JSON if two CLI processes race or one dies mid-write. Use tmp + os.replace. Harden tests/agent/test_subagent_stop_hook.py to run without openai installed. delegate_task calls _build_child_agent which imports run_agent → openai even when _run_single_child is patched. Add an autouse fixture that replaces _build_child_agent with a MagicMock factory.

github-actions · 2026-04-17T22:23:36Z

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: Dependency manifest files modified

Changes to dependency files can introduce new packages or change version pins. Verify all dependency changes are intentional and from trusted sources.

Files:

pyproject.toml

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

The _AGENT_COMMANDS gate included "cron" for every cron subcommand, so `hermes cron list`, `cron status`, `cron create`, and the rest triggered discover_plugins() + register_from_config() even though none of them run the agent loop. Users hit the first-use consent prompt on a hook they were still inspecting, or paid plugin-discovery cost on a read-only CRUD command. Narrow the cron carve-out to the two subcommands that actually run jobs: `cron run` and `cron tick`. Every other cron subcommand falls through the gate and stays plugin-free, matching pre-PR behaviour for introspection commands.

save_allowlist() wrote to a fixed `shell-hooks-allowlist.json.tmp` path and _record_approval() did read-modify-write without any lock, so two concurrent CLI / gateway processes calling the consent path raced two ways: os.replace() hit ENOENT when one writer's temp file got moved out from under another, and the read-modify-write clobbered entries because whichever writer finished last overwrote the other's snapshot. A 50-way stress run kept only 2/50 approvals before this change. save_allowlist() now uses tempfile.mkstemp() for a per-process unique temp file inside ~/.hermes and cleans it up on failure, so concurrent writers never collide on the staging path. _record_approval() and revoke() now run their read-modify-write inside _locked_update_approvals(), which holds an exclusive fcntl.flock on ~/.hermes/shell-hooks-allowlist.json.lock for the duration of the update. Same 50-way stress run now preserves all 50 approvals. The non-POSIX fallback uses a dedicated _allowlist_write_lock rather than _registered_lock, because register_from_config() already holds _registered_lock when it triggers _record_approval() via the consent prompt path — reusing it would self-deadlock on threading.Lock (non-reentrant). Regression test confirms: fails with a 3 s timeout against the shared-lock variant, passes against the dedicated lock. Three new tests under TestAllowlistConcurrency: - test_parallel_record_approval_does_not_lose_entries — 32 threads race through _record_approval behind a Barrier, asserts all 32 entries land. - test_save_allowlist_uses_unique_tmp_paths — spies on tempfile.mkstemp and asserts successive save_allowlist() calls pick distinct temp paths. - test_non_posix_fallback_does_not_self_deadlock — monkeypatches fcntl to None, runs _record_approval while holding _registered_lock in a daemon thread, bounds the wait via threading.Event.wait(timeout=3). Side effect: tempfile.mkstemp defaults to 0o600, so the allowlist file now lands with user-only permissions instead of whatever the default umask would have given it.

`hermes hooks doctor` ran the synthetic-payload smoke test against every configured hook regardless of allowlist status. That directly undermined the workflow documented in hooks.md — "re-run `hermes hooks doctor` after you pull a shared config to spot newly-added hooks *before they register*" — because doctor itself executed every listed script before reporting the "✗ not allowlisted" status. _doctor_one now skips check #4 (produces valid JSON on a synthetic payload) for any entry without an allowlist record, and prints an explicit "ℹ skipped JSON smoke test — not allowlisted yet" note instead. Allowlisted entries still run the smoke test unchanged, so the drift-detection workflow continues to work. New test_unallowlisted_script_is_not_executed uses a sentinel-file pattern: the hook script would touch a file on disk if spawned, and the test asserts the file does not exist after running doctor.

`hermes hooks test pre_api_request` fed scripts a stdin carrying only {"session_id": "test-session"}, while the real firing site at run_agent.py passes task_id, platform, model, provider, base_url, api_mode, api_call_count, message_count, tool_count, approx_input_tokens, request_char_count, max_tokens — and post_api_request additionally passes api_duration, finish_reason, response_model, usage, assistant_content_chars, and assistant_tool_call_count. A script validated under `hooks test` could then break in production on a key that `hooks test` never surfaced. Replace the one-key synthetic dict for each of pre_api_request and post_api_request in _DEFAULT_PAYLOADS with realistic kwargs mirroring the invoke_hook() call sites. Uses claude-sonnet-4-6 for the model field to match a typical Hermes provider rather than the previous gpt-4 placeholder.

github-actions · 2026-04-20T16:36:18Z

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: Dependency manifest files modified

Changes to dependency files can introduce new packages or change version pins. Verify all dependency changes are intentional and from trusted sources.

Files:

pyproject.toml

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

…accept-hooks Two issues around the CLI entrypoint gating: `hermes gateway status`, `gateway setup`, `gateway install`, and friends all triggered discover_plugins() + register_from_config() at CLI startup — even though none of them run the agent in-process. Same story for `hermes mcp list`, `mcp add`, `mcp test`, etc. The `_AGENT_COMMANDS` gate was too coarse: it listed the whole `gateway` and `mcp` command groups regardless of subcommand. Management commands would hit a first-use consent prompt for a hook the user was still inspecting, or pay plugin-discovery cost with no agent loop to run. Narrow the gate the same way we did for `cron`: keep the root `gateway` / `mcp` strings out of `_AGENT_COMMANDS`, and fall through to dedicated `_GATEWAY_AGENT_SUBCOMMANDS = {"run"}` and `_MCP_AGENT_SUBCOMMANDS = {"serve"}` sets. Only the subcommands that actually invoke the agent loop in-process now register hooks. Separately, `--accept-hooks` was only wired on the top-level parser and the `chat` subparser, so `hermes gateway run --accept-hooks` and `hermes gateway --accept-hooks run` both errored out with `unrecognized arguments`. Only `hermes --accept-hooks gateway run` worked. That blocked the documented headless opt-in path for gateway/cron/mcp/acp launches. Add `--accept-hooks` (with `default=argparse.SUPPRESS` so subparser parsing doesn't clobber a truthy top-level value) to every agent subparser: `gateway`, `gateway run`, `cron`, `cron run`, `cron tick`, `mcp`, `mcp serve`, `acp`. The flag now works at every position along the command path. Parametrised smoke test in `test_argparse_flag_propagation.py` invokes each variant via `python -m hermes_cli.main` and asserts exit 0 with no `unrecognized arguments` in stderr. Also drop the redundant `accept_hooks=bool(_cfg.get("hooks_auto_accept"))` pass-through in `gateway/run.py`. `register_from_config` already resolves the effective accept flag via `_resolve_effective_accept` from the same cfg plus `HERMES_ACCEPT_HOOKS`, so re-reading `hooks_auto_accept` at the call site was pure duplication. Pass `accept_hooks=False` and let the internal resolver do the single check.

A hook command like ``python3 /path/hook.py`` or ``bash hook.sh`` is valid: ``_spawn`` runs it via ``shlex.split`` + ``subprocess.run(..., shell=False)`` and it executes fine. But the inspection helpers ``script_is_executable`` and ``script_mtime_iso`` ran ``_command_script_path`` which returned only the first shlex token, so they ended up checking ``python3`` or ``bash`` against the on-disk path rather than the script the user actually cares about. Consequences visible to the user: - ``hermes hooks doctor`` reported ``✗ script missing or not executable`` for every interpreter-prefixed hook, even when the script ran successfully at runtime. - The "modified since approval" drift check tracked the mtime of the interpreter (which basically never changes) instead of the script, so real script edits after approval went undetected. Rewrite ``_command_script_path`` with three-step resolution: prefer tokens ending in a known script extension (``.sh``/``.py``/``.rb``/...), fall back to the first token containing a path separator or leading ``~``, fall back to the first token for bare names. That handles the common cases — plain script paths, ``INTERP SCRIPT`` invocations, ``/usr/bin/env INTERP SCRIPT`` shebangs, and interpreter flags like ``python3 -u /path/hook.py`` — while preserving the existing behaviour for bare executables like ``/bin/echo hi``. Parametric regression test in ``TestAllowlistConcurrency`` covers 15 commands spanning every resolution branch.

``register_from_config`` acquired ``_registered_lock`` per spec and then, inside that critical section, called ``_prompt_and_record`` which issues a blocking ``input()`` on the TTY. Any other thread that reached ``register_from_config`` parked on the lock until the user answered the prompt — even if it was trying to register a completely different spec that wouldn't have prompted at all. Single-threaded startup callers (CLI ``main()`` and ``GatewayRunner.start()``) make this benign today, but the shape of the code defeats the very reason the lock is there. Flagged in three consecutive review rounds. Split the critical section: the lock now wraps only the idempotence check (+ allowlist read) at the top of the loop and the mutation (appending the callback, updating ``_registered``) at the bottom. The TTY prompt runs outside the lock. A defensive re-check on the idempotence key inside the second lock section keeps double registration from landing if a future caller parallelises this path — two threads that both clear the prompt for the same spec would otherwise each append a duplicate callback. Also tightens the "not allowlisted" warning message: the previous ``Run hermes hooks list`` recommendation doesn't actually approve anything; the three real opt-in channels are ``--accept-hooks``, ``HERMES_ACCEPT_HOOKS=1``, and ``hooks_auto_accept: true``.

Two user-facing diagnostic gaps: ``matcher:`` is only consulted for ``pre_tool_call`` / ``post_tool_call`` — the runtime callback ignores it for every other event. Misconfigured hooks like pre_llm_call: - matcher: "terminal" command: ~/hook.sh used to parse silently and then fire on every LLM call regardless of the matcher string, which is the opposite of what the user wanted. ``hermes hooks list`` even printed the non-functional matcher back as if it were active. ``_parse_single_entry`` now warns and drops the matcher at parse time so the rendered spec matches runtime behaviour. When ``save_allowlist`` fails (disk full, read-only ``~/.hermes``, permission denied) the caller already logged a generic warning, but the message didn't tell the user what that meant for the next run — the approval lands in the in-memory ``_registered`` set and the hook registers for the current process, yet the next startup re-prompts because the approval never hit disk. Without context the symptom looks like "hermes keeps asking me about the same hook". Expand the warning to include the allowlist path, the errno, and the re-prompt caveat explicitly. Tests: - ``test_non_tool_event_matcher_warns_and_drops`` — configures a matcher on ``pre_llm_call`` and asserts the spec's matcher ends up ``None`` and a warning containing ``pre_llm_call`` and "only honored for pre_tool_call" is emitted. - ``test_save_allowlist_failure_logs_actionable_warning`` — monkeypatches ``tempfile.mkstemp`` to raise ``OSError(28)``, confirms the warning names the file path, the errno message, and the re-prompt behaviour.

github-actions · 2026-04-20T18:01:05Z

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: Dependency manifest files modified

Changes to dependency files can introduce new packages or change version pins. Verify all dependency changes are intentional and from trusted sources.

Files:

pyproject.toml

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

No behaviour change, 143/143 tests still green. Follow-up cleanup on the last four commits (a5ea68f, 262f547, 7412d4e, f54a348) that leaned too heavily on repetition and prose. - Factor the eight duplicated ``--accept-hooks`` argparse blocks in ``hermes_cli/main.py`` into a single ``_add_accept_hooks_flag()`` helper. Drops ~45 lines across gateway/cron/mcp/acp subparsers. - Collapse the three ``_*_AGENT_SUBCOMMANDS`` sets + the four-branch ``or``-chain that consumed them into a single ``_AGENT_SUBCOMMANDS`` dict keyed by command name with ``(attr, allowed_set)`` values. The gate is now a two-line lookup. - Trim the oversized docstrings on ``_command_script_path`` and ``save_allowlist`` back to the core invariant — the behaviour is already covered by tests. - Reduce the refactor-note block inside ``register_from_config`` to three lines; the structure (check under lock, prompt outside, re-check on mutation) is visible from the code. - Shrink the two new diagnostic tests (non-tool matcher warning, save-failure warning) from long-form docstring + explicit helpers to tight inline assertions. Net: −132 lines (213 removed, 81 added) across three files.

github-actions · 2026-04-20T18:09:01Z

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: Dependency manifest files modified

Changes to dependency files can introduce new packages or change version pins. Verify all dependency changes are intentional and from trusted sources.

Files:

pyproject.toml

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

Found during end-to-end testing of the feature. After the earlier commit taught ``_command_script_path`` to return the actual script for interpreter-prefixed commands (``python3 /path/hook.py``, ``/usr/bin/env bash hook.sh``), ``script_is_executable`` kept requiring ``os.X_OK`` on that path. But when the command is ``python3 hook.py`` there is no need for the script to be executable — the interpreter opens and reads it. ``hermes hooks doctor`` reported ``✗ script missing or not executable`` for hooks that ran fine at runtime, which is user-hostile noise. Narrow the check: if the resolved script path equals ``argv[0]`` the invocation is bare (``/path/hook.sh``) and X_OK is required; otherwise the script is an argument to an interpreter and R_OK is enough. Mirrors what ``subprocess.run`` would actually do. New ``test_script_is_executable_handles_interpreter_prefix`` covers both paths and the mode change.

github-actions · 2026-04-20T18:14:43Z

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: Dependency manifest files modified

Changes to dependency files can introduce new packages or change version pins. Verify all dependency changes are intentional and from trusted sources.

Files:

pyproject.toml

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

# Conflicts: # hermes_cli/main.py # website/docs/user-guide/features/hooks.md

Three independent reviews surfaced a handful of real bugs. Fixing all of them here: * **SIGTERM orphans hook subprocesses (codex #1).** The CLI only installed a SIGINT handler — SIGTERM (from ``kill``, ``timeout``, systemd stop, CI harnesses) skips atexit entirely and leaves every in-flight hook subprocess running as an orphan owned by init. Adds ``_async_pool_sigterm_handler`` which terminates tracked subprocess groups inline, then routes to ``sys.exit(128 + SIGTERM)``. Inline termination is required because ``ThreadPoolExecutor`` uses non-daemon threads: Python waits for every worker to return before running atexit, and workers block inside ``proc.communicate(timeout=spec.timeout)`` until the subprocess dies. Renamed ``_maybe_install_sigint_handler`` → ``_maybe_install_signal_handlers`` (with back-compat alias). Verified: ``kill -TERM`` on a hermes CLI running a 4 s ``sleep`` hook now exits in ~0.7 s with no orphan, was 4 s + orphan. * **Subprocess groups for reliable termination.** Hooks are now spawned with ``start_new_session=True`` so the subprocess is its own PGID leader. Shutdown / SIGINT / SIGTERM paths call ``os.killpg`` on the group instead of ``proc.terminate()`` — without this, a bash script's orphaned ``sleep`` child kept the parent stdout FD open and blocked ``proc.communicate`` for the full sleep duration. ``_terminate_group`` / ``_kill_group`` helpers fall back to plain ``terminate`` / ``kill`` on edge cases where ``getpgid`` fails (already-exited proc, non-POSIX). * **``hermes hooks test --no-wait`` blocks for full hook runtime (codex #2).** The flag advertised fire-and-forget but the CLI's ``ThreadPoolExecutor`` atexit ``pool.shutdown(wait=True)`` joined the worker anyway, which in turn waited for the subprocess. ``_cmd_test`` now polls briefly for ``_live_procs`` to fill (so the subprocess definitely spawned), then ``os._exit(0)`` — skipping atexit entirely. The subprocess keeps running under init because of ``start_new_session=True``. Verified: CLI exit dropped from 2.3 s to 76 ms for a 2-second hook, and the hook still writes its audit log 3 s later after the CLI is gone. * **Stale ``_child_role_for_batch`` test (claude #1 / hermes #2).** The test from commit 76d3ffd4 asserted the *old* helper field name — no code path sets it post-refactor (455c136f), so the test passed trivially without verifying anything. Fixed to assert ``_child_role`` (the real field) is stripped, and added an explanatory message so a future failure is easier to diagnose. Module-header docstring updated too. * **``submit()`` RuntimeError branch: stale-semaphore parity fix (claude #3).** Same pattern I already fixed in ``_on_async_future_done``, missed here: a concurrent ``_reset_async_pool`` between ``acquire`` and ``release`` would cause ``_async_sem_get()`` to lazy-create a fresh sem and over- release on it. Snapshot ``_async_sem_inst`` + swallow ``ValueError`` like the symmetric path. * **Shutdown race: proc registered after the snapshot (claude #4 / hermes #1).** Worker that got between ``subprocess.Popen()`` and ``_register_live_proc(proc)`` would miss the shutdown-sweep snapshot and block for the full ``spec.timeout``. After registering, the worker now checks ``_async_shutting_down`` and self-terminates its subprocess group. * **WARN log noise on SIGTERM'd children (claude #5).** Shutdown-induced exits (rc = -15 / -9) no longer spam a per-proc ``WARNING`` — demoted to ``DEBUG`` when ``_async_shutting_down`` is set. Both the atexit path and the signal handlers now set the flag before terminating, so a Ctrl-C or a ``kill -TERM`` with 10 running hooks emits zero warn lines instead of 10. Still outstanding (documented trade-offs, not fixed here): * Gateway shutdown blocks the event loop for up to ``grace_seconds`` (claude #2). Acknowledged as a follow-up candidate via ``loop.run_in_executor``. * ``_maybe_install_signal_handlers`` is still leading-underscore (claude NousResearch#6). Cosmetic; kept consistent with the rest of the module's private-by-convention API. All 101 hook tests still pass.

pefontana added 5 commits April 17, 2026 17:44

Add subagent_stop plugin hook event

d715bab

New event fired once per child after delegate_task completes. This commit reserves the event name in the dispatcher vocabulary; the actual firing site in delegate_tool lands separately.

pefontana added 4 commits April 20, 2026 13:33

pefontana added 4 commits April 20, 2026 14:56

Merge remote-tracking branch 'origin/main' into shell-hooks

cb7ecd8

# Conflicts: # hermes_cli/main.py # website/docs/user-guide/features/hooks.md

pefontana changed the title ~~Shell hooks~~ Shell Hooks Apr 20, 2026

pefontana mentioned this pull request Apr 22, 2026

Async hooks #5

Draft

3 tasks

pefontana mentioned this pull request Apr 24, 2026

Async hooks NousResearch/hermes-agent#15312

Open

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Shell Hooks#4

Shell Hooks#4
pefontana wants to merge 18 commits into
mainfrom
shell-hooks

pefontana commented Apr 17, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Apr 17, 2026

Uh oh!

github-actions Bot commented Apr 17, 2026

Uh oh!

github-actions Bot commented Apr 17, 2026

Uh oh!

github-actions Bot commented Apr 20, 2026

Uh oh!

github-actions Bot commented Apr 20, 2026

Uh oh!

github-actions Bot commented Apr 20, 2026

Uh oh!

github-actions Bot commented Apr 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

pefontana commented Apr 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Example — block rm -rf in bash

Supported events

Security model

hermes hooks CLI

Wire protocol (for hook authors)

Interpreter-prefixed commands

Type of Change

How to Test

Backward Compatibility

Checklist

Uh oh!

github-actions Bot commented Apr 17, 2026

⚠️ Supply Chain Risk Detected

⚠️ WARNING: Dependency manifest files modified

Uh oh!

github-actions Bot commented Apr 17, 2026

⚠️ Supply Chain Risk Detected

⚠️ WARNING: Dependency manifest files modified

Uh oh!

github-actions Bot commented Apr 17, 2026

⚠️ Supply Chain Risk Detected

⚠️ WARNING: Dependency manifest files modified

Uh oh!

github-actions Bot commented Apr 20, 2026

⚠️ Supply Chain Risk Detected

⚠️ WARNING: Dependency manifest files modified

Uh oh!

github-actions Bot commented Apr 20, 2026

⚠️ Supply Chain Risk Detected

⚠️ WARNING: Dependency manifest files modified

Uh oh!

github-actions Bot commented Apr 20, 2026

⚠️ Supply Chain Risk Detected

⚠️ WARNING: Dependency manifest files modified

Uh oh!

github-actions Bot commented Apr 20, 2026

⚠️ Supply Chain Risk Detected

⚠️ WARNING: Dependency manifest files modified

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

pefontana commented Apr 17, 2026 •

edited

Loading

Example — block `rm -rf` in bash

`hermes hooks` CLI