feat: activate plugin lifecycle hooks (pre/post_llm_call, session start/end) by teknium1 · Pull Request #3542 · NousResearch/hermes-agent

teknium1 · 2026-03-28T15:27:33Z

Summary

Salvaged from PR #2823 by @nicoloboschi.

Activates the four lifecycle hooks that were defined in the plugin system but never invoked: on_session_start, pre_llm_call, post_llm_call, on_session_end.

This enables external plugins (e.g. memory systems like Hindsight) to integrate as pip-installable plugins that hook every conversation turn, without requiring core changes.

Hook semantics

Hook	When	Can return context?
`on_session_start`	New session created (first turn)	No
`pre_llm_call`	Once per turn, before LLM loop	Yes — `{"context": "..."}` injected into ephemeral system prompt
`post_llm_call`	Once per turn, after LLM loop	No
`on_session_end`	End of every `run_conversation()` call	No

Changes from original PR

Cherry-picked both contributor commits cleanly onto current main
conversation_history passed as a shallow copy (list(messages)) to prevent plugins from mutating the live conversation
Added model and platform kwargs to on_session_end for consistency with all other hooks
Updated features/plugins.md to remove *(planned)* markers now that all hooks are active
Contributor attribution preserved via --author

Files changed

hermes_cli/plugins.py — invoke_hook() now returns List[Any] of non-None results
run_agent.py — invoke all four hooks at appropriate lifecycle points
tests/test_plugins.py — added tests for return value collection
website/docs/guides/build-a-hermes-plugin.md — updated hook reference table
website/docs/user-guide/features/plugins.md — removed (planned) markers

Test plan

All 19 plugin tests pass (including 2 new ones)
Full test suite: 4707 passed, 165 skipped, 0 failed
Live PTY test with real plugin exercising all 4 hooks — all fired correctly with correct kwargs
Verified tool-calling turns: pre_llm_call fires once before loop, post_llm_call once after
Prompt caching safe: plugin context goes into ephemeral system (rebuilt per API call), not cached system prompt

…rt/end) The plugin system defined six lifecycle hooks but only pre_tool_call and post_tool_call were invoked. This activates the remaining four so that external plugins (e.g. memory systems) can hook into the conversation loop without touching core code. Hook semantics: - on_session_start: fires once when a new session is created - pre_llm_call: fires once per turn before the tool-calling loop; plugins can return {"context": "..."} to inject into the ephemeral system prompt (not cached, not persisted) - post_llm_call: fires once per turn after the loop completes, with user_message and assistant_response for sync/storage - on_session_end: fires at the end of every run_conversation call invoke_hook() now returns a list of non-None callback return values, enabling pre_llm_call context injection while remaining backward compatible (existing hooks that return None are unaffected). Salvaged from PR #2823.

… pages Fixes found by auditing docs against recent PRs/commits: Critical (misleading): - hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks that are now active (#3542). Add correct callback signatures. - security.md: Update tirith verdict behavior — block verdicts now go through approval flow instead of hard-blocking (#3428). Add pkill/killall self-termination guard and gateway-run backgrounding patterns (#3593). New feature docs: - configuration.md: Add tool_use_enforcement section with value table (auto/true/false/list) from #3551/#3528. - configuration.md: Expand auxiliary config with per-task timeouts (compression 120s, web_extract 30s, approval 30s) from #3597. - api-server.md: Add /v1/health alias, Security Headers section, CORS details (Max-Age, SSE headers, Idempotency-Key) from #3572/#3573/#3576/#3580/#3530. Stale/incomplete: - configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (#3484). - environment-variables.md: Specify actual DashScope default URL. - cli-commands.md: Add alibaba to --provider list. - fallback-providers.md: Add Alibaba/DashScope to provider table. - email.md: Document noreply/automated sender filtering (#3606). - toolsets-reference.md: Add 4 missing platform toolsets — matrix, mattermost, dingtalk, api-server (#3583). - skills.md: List default GitHub taps including garrytan/gstack (#3605).

… pages (#3618) Fixes found by auditing docs against recent PRs/commits: Critical (misleading): - hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks that are now active (#3542). Add correct callback signatures. - security.md: Update tirith verdict behavior — block verdicts now go through approval flow instead of hard-blocking (#3428). Add pkill/killall self-termination guard and gateway-run backgrounding patterns (#3593). New feature docs: - configuration.md: Add tool_use_enforcement section with value table (auto/true/false/list) from #3551/#3528. - configuration.md: Expand auxiliary config with per-task timeouts (compression 120s, web_extract 30s, approval 30s) from #3597. - api-server.md: Add /v1/health alias, Security Headers section, CORS details (Max-Age, SSE headers, Idempotency-Key) from #3572/#3573/#3576/#3580/#3530. Stale/incomplete: - configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (#3484). - environment-variables.md: Specify actual DashScope default URL. - cli-commands.md: Add alibaba to --provider list. - fallback-providers.md: Add Alibaba/DashScope to provider table. - email.md: Document noreply/automated sender filtering (#3606). - toolsets-reference.md: Add 4 missing platform toolsets — matrix, mattermost, dingtalk, api-server (#3583). - skills.md: List default GitHub taps including garrytan/gstack (#3605).

Example plugin demonstrating the lifecycle hooks activated in NousResearch#3542. Auto-manages a local llama-server (or any OpenAI-compatible server) when the active model matches a locally configured model name. Features: - pre_llm_call hook: auto-starts the correct server on first message when hermes is configured with a local model name - on_session_end hook: kills the server on exit - switch_local_llm tool: mid-session model switching — the agent swaps the server when asked ("switch to the code model") - Declarative YAML config for model definitions (GGUF paths, context sizes, KV cache quantization, sampling params) replacing shell scripts The plugin is self-contained in docs/llm-switch-plugin-example/ with a README, example config, and full implementation. Users copy it to ~/.hermes/plugins/llm-switch/ to install. Complements NousResearch#3360 and NousResearch#3548 which restore /model as a slash command — once merged, /model custom:write would trigger the pre_llm_call hook to auto-start the right server seamlessly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…governance) Port of local governance features to upstream v0.5.0 base: - load_rules(): ~/.hermes/rules/*.md injection into system prompt - load_samples(): ~/.hermes/samples/*.md behavioral examples - load_working_state(): ~/.hermes/working_state.md cross-session context - atexit snapshot of working_state.md to checkpoints/ - Credential redaction in gateway and cron delivery Dropped: lifecycle hook wiring (superseded by upstream NousResearch#3542)

…rt/end) (NousResearch#3542) The plugin system defined six lifecycle hooks but only pre_tool_call and post_tool_call were invoked. This activates the remaining four so that external plugins (e.g. memory systems) can hook into the conversation loop without touching core code. Hook semantics: - on_session_start: fires once when a new session is created - pre_llm_call: fires once per turn before the tool-calling loop; plugins can return {"context": "..."} to inject into the ephemeral system prompt (not cached, not persisted) - post_llm_call: fires once per turn after the loop completes, with user_message and assistant_response for sync/storage - on_session_end: fires at the end of every run_conversation call invoke_hook() now returns a list of non-None callback return values, enabling pre_llm_call context injection while remaining backward compatible (existing hooks that return None are unaffected). Salvaged from PR NousResearch#2823. Co-authored-by: Nicolò Boschi <boschi1997@gmail.com>

… pages (NousResearch#3618) Fixes found by auditing docs against recent PRs/commits: Critical (misleading): - hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks that are now active (NousResearch#3542). Add correct callback signatures. - security.md: Update tirith verdict behavior — block verdicts now go through approval flow instead of hard-blocking (NousResearch#3428). Add pkill/killall self-termination guard and gateway-run backgrounding patterns (NousResearch#3593). New feature docs: - configuration.md: Add tool_use_enforcement section with value table (auto/true/false/list) from NousResearch#3551/NousResearch#3528. - configuration.md: Expand auxiliary config with per-task timeouts (compression 120s, web_extract 30s, approval 30s) from NousResearch#3597. - api-server.md: Add /v1/health alias, Security Headers section, CORS details (Max-Age, SSE headers, Idempotency-Key) from NousResearch#3572/NousResearch#3573/NousResearch#3576/NousResearch#3580/NousResearch#3530. Stale/incomplete: - configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (NousResearch#3484). - environment-variables.md: Specify actual DashScope default URL. - cli-commands.md: Add alibaba to --provider list. - fallback-providers.md: Add Alibaba/DashScope to provider table. - email.md: Document noreply/automated sender filtering (NousResearch#3606). - toolsets-reference.md: Add 4 missing platform toolsets — matrix, mattermost, dingtalk, api-server (NousResearch#3583). - skills.md: List default GitHub taps including garrytan/gstack (NousResearch#3605).