Version observed: v0.13.0 (release tag 2026-05-07 per RELEASE_v0.13.0.md)
Discovered during: M2 §5.6 audit-schema enforcement work in a downstream Hermes-based orchestrator project (private dev-notes; happy to share specific excerpts).
Diagnostic plugin (minimal reproducer): see §3 below.
Summary
Two distinct symptoms with similar shape:
- pre_tool_call shell hooks do not fire from the worker's tool-dispatch path when the worker is spawned by the kanban dispatcher OR invoked manually as hermes --accept-hooks -p <profile> --skills kanban-worker chat -q "<prompt>". Registration succeeds, but invoke_hook("pre_tool_call", ...) is never called for the worker's kanban_show, write_file, terminal, kanban_complete, etc.
- on_session_finalize shell hooks beyond the first one in config also do not fire for clean kanban-worker sessions. A second on_session_finalize hook (registered alongside the existing safety-net hook) does not fire on a worker that calls kanban_complete cleanly and exits 0. The same hook fires perfectly when invoked directly as a subprocess.
Both failures look the same from above: registration succeeds (hermes hooks list shows ✓ allowed; agent.log confirms INFO agent.shell_hooks: shell hook registered: ...), the hook script is correct and direct-invokable, but the worker's session lifecycle never triggers the relevant invoke_hook call.
Why this matters for downstream projects
The Hermes Orchestrator project's M2 milestone needed structural enforcement of kanban_complete(metadata=...) schema (PAF §4.2 / kickoff §4.2). Two natural Hermes-canonical implementations failed for the reasons above:
- pre_tool_call hook scoped to matcher=kanban_complete: would reject invalid metadata BEFORE the task transitions to done. Cannot be wired in worker context.
- on_session_finalize hook: would validate runs[-1].metadata AFTER the worker exits but while the substrate can still annotate / spawn follow-up tasks. Cannot be wired either.
The project's workaround is a cron-driven scanner that runs audit_validate_recent.py every 5 min. This loses structural strength (the task is done before invalid metadata is detected) and adds operational complexity (cron registration, retry hygiene). With either hook mechanism working in worker context, the workaround would be unnecessary.
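For context, the check that either hook was meant to enforce is small. The sketch below is a minimal illustration, not the project's actual validator: the required keys (summary, artifacts) are hypothetical stand-ins for the real schema, and check_tool_call is an illustrative helper name. It assumes the hook receives tool_name and args, as in the in-process test in Repro 1.

```python
# Hypothetical schema: the real project schema is not shown in this report.
REQUIRED_KEYS = {"summary": str, "artifacts": list}


def metadata_problems(metadata):
    """Return a list of schema violations; an empty list means valid."""
    if not isinstance(metadata, dict):
        return ["metadata must be a mapping"]
    problems = []
    for key, expected_type in REQUIRED_KEYS.items():
        if key not in metadata:
            problems.append(f"missing key: {key}")
        elif not isinstance(metadata[key], expected_type):
            problems.append(f"{key} must be {expected_type.__name__}")
    return problems


def check_tool_call(tool_name, args):
    """Check only kanban_complete calls; every other tool passes untouched."""
    if tool_name != "kanban_complete":
        return []
    return metadata_problems(args.get("metadata"))
```

A pre_tool_call hook would reject the call when the returned list is non-empty; an on_session_finalize hook would run metadata_problems against runs[-1].metadata after the fact.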
Repro 1 — pre_tool_call does not fire in worker
Setup
- Add a Python plugin to ~/.hermes/plugins/m2_5_6_diagnostic/ with:
plugin.yaml:
name: m2_5_6_diagnostic
version: 0.0.1
description: "pre_tool_call dispatch diagnostic"
author: "anyone"
hooks:
- pre_tool_call
__init__.py:
import datetime, os, pathlib

TRACE = pathlib.Path.home() / ".hermes" / "m2_5_6_plugin_trace.log"

def _trace(event, **kw):
    # Append one tab-separated line per event; tracing must never break the host.
    try:
        with TRACE.open("a", encoding="utf-8") as f:
            f.write(f"{datetime.datetime.utcnow().isoformat()}\tpid={os.getpid()}\tevent={event}\t{kw}\n")
    except Exception:
        pass

def _pre_tool_call(**kwargs):
    _trace("pre_tool_call", **kwargs)
    return None  # observer-only

def register(ctx):
    # Bracket the registration so a partial register() is visible in the log.
    _trace("register_called")
    ctx.register_hook("pre_tool_call", _pre_tool_call)
    _trace("register_done")
- Enable the plugin: hermes plugins enable m2_5_6_diagnostic.
- Verify in-process firing works:
hermes-agent/venv/bin/python -c "
from hermes_cli.plugins import discover_plugins, get_plugin_manager
discover_plugins(force=True)
get_plugin_manager().invoke_hook('pre_tool_call', tool_name='test', args={}, task_id='t', session_id='s', tool_call_id='c')
"
cat ~/.hermes/m2_5_6_plugin_trace.log
# Expect: 3 trace lines (register_called, register_done, pre_tool_call)
- Submit a fresh kanban task that exercises a tool call:
rm -f ~/.hermes/m2_5_6_plugin_trace.log
hermes kanban create "diagnostic probe $(date +%s)" \
--assignee claude-code-worker \
--idempotency-key "diag-$(date +%s%N)" \
--body "Call kanban_show, write a trivial file, then call kanban_complete. Any tool call is fine."
sleep 90
cat ~/.hermes/m2_5_6_plugin_trace.log
Expected
The trace log should contain at least one pre_tool_call line per tool the worker invoked (kanban_show, possibly write_file, kanban_complete).
Observed
The trace log is never created, even though the worker's session log shows multiple tool calls completing. The plugin's register(ctx) is called (verified by the trace lines) only in the in-process python -c test from step 3, not in the worker subprocess.
Manual worker invocation produces the same result:
HERMES_KANBAN_TASK=t_existing hermes --accept-hooks -p claude-code-worker --skills kanban-worker chat -q "say hi" 2>&1
# This prompt triggers 0 tool calls, but even with a prompt that does invoke tools, no trace lines appear.
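To tell "register() never ran in this process" apart from "registered, but dispatch never reached the hook", the trace log can be grouped by pid. A minimal sketch, assuming the tab-separated line format written by _trace in the plugin above; summarize_trace and classify are illustrative helper names, not part of Hermes.

```python
from collections import defaultdict


def summarize_trace(text):
    """Map each pid (as a string) to the ordered list of events it logged."""
    events_by_pid = defaultdict(list)
    for line in text.splitlines():
        # Format: <timestamp>\tpid=<pid>\tevent=<event>\t<kwargs repr>
        fields = line.split("\t", 3)
        if len(fields) < 3:
            continue  # skip blank or truncated lines
        pid = fields[1].removeprefix("pid=")
        event = fields[2].removeprefix("event=")
        events_by_pid[pid].append(event)
    return dict(events_by_pid)


def classify(events_by_pid, pid):
    """Describe how far the plugin got in the process with the given pid."""
    events = events_by_pid.get(pid, [])
    if "register_called" not in events:
        return "register() never ran in this process"
    if "register_done" not in events:
        return "register() started but did not finish"
    if "pre_tool_call" not in events:
        return "registered, but dispatch never reached the hook"
    return "hook fired"
```

In Repro 1 the log is absent entirely, which corresponds to the first outcome for the worker's pid.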
Suspected location
run_agent.py:_invoke_tool at line 10452 explicitly calls get_pre_tool_call_block_message from hermes_cli.plugins before tool execution. Either:
- This code path is not reached by the worker subprocess (the worker uses a different dispatch path); OR
- The worker subprocess's plugin manager has no registered callbacks for pre_tool_call because register_from_config was bypassed or silently failed during worker startup.
main.py:11800-11808 calls register_from_config(load_config(), accept_hooks=_accept_hooks) for args.command in _AGENT_COMMANDS = {None, "chat", "acp", "rl"} — chat -q should match. The wrapping try/except logs failures at DEBUG only, so silent failures are plausible.
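The last point is easy to reproduce in isolation. The sketch below is a generic illustration of the pattern, not the Hermes code: register_hooks_quietly, failing_register, and the logger names are made up for the example. It shows that a failure logged at DEBUG leaves no record when the logger runs at INFO.

```python
import logging


def register_hooks_quietly(register, logger):
    """Mimic a startup wrapper that swallows registration errors at DEBUG."""
    try:
        register()
        return True
    except Exception as exc:
        logger.debug("hook registration failed: %s", exc)
        return False


class ListHandler(logging.Handler):
    """Collect formatted messages in memory so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def failing_register():
    # Hypothetical failure; any exception behaves the same way.
    raise RuntimeError("allowlist unreadable")


def run_at(level):
    """Run the failing registration with the logger set to the given level."""
    logger = logging.getLogger(f"demo.shell_hooks.{level}")
    logger.setLevel(level)
    logger.propagate = False
    handler = ListHandler()
    logger.addHandler(handler)
    ok = register_hooks_quietly(failing_register, logger)
    return ok, handler.messages
```

Calling run_at(logging.INFO) reports a failed registration with no log output at all, while run_at(logging.DEBUG) captures the error message. If the worker's log level can be lowered to DEBUG, a swallowed registration error of this kind would presumably become visible.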
Repro 2 — on_session_finalize does not fire for clean kanban-worker sessions
Setup
- Add a second on_session_finalize shell hook to ~/.hermes/config.yaml alongside the existing one (the project has an M1-deployed safety-net hook):
hooks:
  on_session_finalize:
    - command: /home/<user>/projects/<project>/.hermes/hooks/safety_net.py
      timeout: 30
    - command: /home/<user>/projects/<project>/.hermes/hooks/diagnostic.py
      timeout: 30
diagnostic.py is a bare-script hook with #!/usr/bin/env python3 shebang that writes one line to a trace log on every invocation:
#!/usr/bin/env python3
import datetime, os, sys, pathlib
TRACE = pathlib.Path.home() / ".hermes" / "on_session_finalize_trace.log"
with TRACE.open("a", encoding="utf-8") as f:
    f.write(f"{datetime.datetime.utcnow().isoformat()}\tpid={os.getpid()}\tkanban_task={os.environ.get('HERMES_KANBAN_TASK','<none>')}\n")
sys.stdout.write("{}\n")  # required by hooks contract
chmod +x the file. Allowlist its (event, command) pair in ~/.hermes/shell-hooks-allowlist.json.
- Restart the gateway (hermes gateway stop && nohup env ORCH_M2_GATES=1 hermes gateway run > log 2>&1 &).
- Confirm both hooks are registered:
tail -10 ~/.hermes/logs/agent.log | grep "shell hook registered"
# Expect: 2 entries for on_session_finalize (safety_net, diagnostic)
- Verify direct invocation works:
rm -f ~/.hermes/on_session_finalize_trace.log
echo '{}' | HERMES_KANBAN_TASK=t_test /path/to/diagnostic.py
cat ~/.hermes/on_session_finalize_trace.log
# Expect: 1 trace line
- Submit a kanban task that the worker will complete cleanly:
rm -f ~/.hermes/on_session_finalize_trace.log
hermes kanban create "diag $(date +%s)" \
--assignee claude-code-worker \
--idempotency-key "diag-$(date +%s%N)" \
--body "Trivial work. Call kanban_complete when done."
sleep 90
cat ~/.hermes/on_session_finalize_trace.log
Expected
The trace log should contain at least one entry (the worker's session ending fires on_session_finalize, invoking all registered callbacks including the diagnostic).
Observed
The trace log is not created. Direct invocation works perfectly. Both hooks are registered (confirmed via agent.log AND hermes hooks list). The worker completes the task cleanly and the session ends — but invoke_hook("on_session_finalize", ...) is not called from this code path.
The existing safety-net hook (1st in config) does fire for protocol-violation sessions (M1 evidence: two prior tasks where workers crashed with HTTP 400; specific task IDs available on request). We have no direct evidence of whether it fires for clean sessions: it silently exits when the task is already done, so the lack of side effects neither proves nor disproves firing.
Suspected location
cli.py:728 calls invoke_hook("on_session_finalize", session_id=..., platform="cli") at "actual session boundary" per the surrounding comment. gateway/run.py:8099 has a similar call for gateway-tracked sessions. For a worker invoked as hermes ... chat -q "...", the cli.py call site should fire. Either:
- The chat -q (quiet) code path bypasses the cli.py:728 invocation entirely; OR
- It reaches the call but invoke_hook returns immediately because the worker's plugin manager has no callbacks registered for the event (despite agent.log confirming gateway-side registration).
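The second hypothesis rests on hook registries being per-process state. A toy model, under the assumption that callbacks live in an in-memory table inside each process (ToyHookRegistry is invented for this illustration and does not mirror the Hermes plugin manager):

```python
class ToyHookRegistry:
    """In-memory callback table; each process would hold its own instance."""

    def __init__(self):
        self._callbacks = {}

    def register_hook(self, event, callback):
        self._callbacks.setdefault(event, []).append(callback)

    def invoke_hook(self, event, **kwargs):
        # With no callbacks for the event this returns [] without any error,
        # which looks identical to "the hook was never invoked".
        return [cb(**kwargs) for cb in self._callbacks.get(event, [])]


# Stand-in for the gateway process, where registration is logged.
gateway = ToyHookRegistry()
gateway.register_hook("on_session_finalize", lambda **kw: kw["session_id"])

# Stand-in for a freshly spawned worker that never ran its own registration.
worker = ToyHookRegistry()

gateway_result = gateway.invoke_hook("on_session_finalize", session_id="s1")
worker_result = worker.invoke_hook("on_session_finalize", session_id="s1")
```

Under this model, a "shell hook registered" line written by the gateway says nothing about the worker's table, so the agent.log evidence would be consistent with either hypothesis.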
Common pattern
Both repros share this property: the hook is correctly registered (per hermes hooks list AND per agent.log shell hook registered INFO line) and works perfectly when invoked directly as a subprocess. But the worker's session lifecycle never triggers invoke_hook(...) for the registered hook.
The §5.6 work spent two hours on the original investigation plus several more hours on the Path A/B confirmations, without isolating which code path the worker takes to bypass hook dispatch. A maintainer with knowledge of the worker spawn lifecycle could likely resolve this in minutes.
Workaround (for downstream visibility)
The Hermes Orchestrator project closed M2 §5.6 by pivoting to a cron-driven scanner that bypasses the hook system entirely:
*/5 * * * * /path/to/audit_validate_recent.py --once --apply
The scanner queries hermes kanban list --json --status done, filters by completed_at window, validates runs[-1].metadata against the project schema, and applies remediation (kanban_comment + AUDIT_REVIEW follow-up task) for invalid completions. It works reliably, at the cost of up to 5 min of added detection latency compared with on_session_finalize.
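The filter-and-validate step can be sketched as follows. This is an illustration under assumptions, not the contents of audit_validate_recent.py: the id field, ISO-8601 completed_at strings with an explicit UTC offset, and the is_valid callback are all hypothetical; only completed_at and runs[-1].metadata come from the description above.

```python
from datetime import datetime, timedelta, timezone


def recent_invalid(tasks, now, window, is_valid):
    """Return ids of tasks completed within the window whose last run fails is_valid."""
    flagged = []
    for task in tasks:
        completed_at = datetime.fromisoformat(task["completed_at"])
        if not (now - window <= completed_at <= now):
            continue  # outside the scan window
        runs = task.get("runs") or []
        # Only the final run's metadata is checked, i.e. runs[-1].metadata.
        metadata = runs[-1].get("metadata") if runs else None
        if not is_valid(metadata):
            flagged.append(task["id"])
    return flagged


# Example input shaped like the assumed JSON output of the kanban list command.
now = datetime(2026, 5, 10, 12, 0, tzinfo=timezone.utc)
tasks = [
    {"id": "t1", "completed_at": "2026-05-10T11:58:00+00:00",
     "runs": [{"metadata": {"summary": "ok"}}]},
    {"id": "t2", "completed_at": "2026-05-10T11:57:00+00:00",
     "runs": [{"metadata": None}]},
    {"id": "t3", "completed_at": "2026-05-10T11:00:00+00:00", "runs": []},
]
flagged = recent_invalid(tasks, now, timedelta(minutes=5), lambda m: isinstance(m, dict))
```

Here t1 passes, t2 is flagged, and t3 is skipped because it falls outside the 5-minute window. By the time any task is flagged it has already transitioned to done, which is the structural weakness noted earlier.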
Asks
- Could a maintainer confirm whether register_from_config actually runs (and completes) in kanban-spawned worker subprocesses? If it does, the issue is in invoke_hook dispatch; if it doesn't, the issue is registration.
- If the worker subprocess uses a distinct dispatch path that intentionally skips plugin hooks (for performance, isolation, etc.), please document this in agent/shell_hooks.py docstring + the user-guide hooks page. Downstream consumers will hit the same dead-end without that guidance.
- The --accept-hooks / hooks_auto_accept / HERMES_ACCEPT_HOOKS chain is well-documented for first-use TTY consent, but its interaction with kanban-spawned non-TTY workers is empirically uneven. Worth a §3 example in the docs.
Happy to provide additional artifacts (worker session logs, agent.log snapshots, full plugin source). Two days of investigation across two separate paths (Path A on pre_tool_call, Path B on on_session_finalize) are documented in the downstream project's private dev-notes, and specific excerpts are available on request.
The reproducer plugin is still in place on the affected install.