fix: inject plugin context after cache markers to preserve Anthropic … by OutThisLife · Pull Request #5138 · NousResearch/hermes-agent

OutThisLife · 2026-04-04T23:06:51Z

What does this PR do?

Plugin context from pre_llm_call hooks was being appended into the system prompt before cache markers were applied. If a plugin returned anything that varied per turn, the system prompt hash changed every turn and Anthropic couldn't reuse the cached prefix. The code comment said "ephemeral, not cached" but the implementation did cache it.

Now when prompt caching is on, plugin context is appended as a plain text block after cache markers are placed, so it sits outside the cached prefix hash.

Related Issue

Fixes # (no existing issue — found during prompt caching audit)

Type of Change

🐛 Bug fix (non-breaking change that fixes an issue)

Changes Made

run_agent.py: gate plugin context injection on not self._use_prompt_caching; when caching is on, append it as an unmarked block after apply_anthropic_cache_control runs
tests/test_run_agent.py: add test_plugin_context_is_uncached_system_suffix_when_prompt_caching_enabled — asserts the plugin block has no cache_control while the system prompt block does

How to Test

pytest tests/test_run_agent.py tests/agent/test_prompt_caching.py tests/test_anthropic_adapter.py -k cache -q — 21 pass
pytest tests/test_run_agent.py -k plugin_context_is_uncached -q — 1 pass
Live: enable a plugin returning turn-varying pre_llm_call context, run multi-turn Claude session, check cache_read_input_tokens stays non-zero

Checklist

Code

I've read the Contributing Guide
My commit messages follow Conventional Commits
I searched for existing PRs to make sure this isn't a duplicate
My PR contains only changes related to this fix
I've run pytest tests/ -q and all tests pass
I've added tests for my changes
I've tested on my platform: WSL2 Ubuntu

Documentation & Housekeeping

N/A — no config, schema, or architecture changes

Screenshots / Logs

…prompt cache prefix stability

…t cache Plugin context from pre_llm_call hooks was injected into the system prompt, breaking the prompt cache prefix every turn when content changed (typical for memory plugins). Now all plugin context goes into the current turn's user message — the system prompt stays identical across turns, preserving cached tokens. The system prompt is reserved for Hermes internals. Plugins contribute context alongside the user's input. Also adds comprehensive documentation for all 6 plugin hooks: pre_tool_call, post_tool_call, pre_llm_call, post_llm_call, on_session_start, on_session_end — each with full callback signatures, parameter tables, firing conditions, and examples. Supersedes #5138 which identified the same cache-busting bug and proposed an uncached system suffix approach. This fix goes further by removing system prompt injection entirely. Co-identified-by: OutThisLife (PR #5138)

…t cache (#5146) Plugin context from pre_llm_call hooks was injected into the system prompt, breaking the prompt cache prefix every turn when content changed (typical for memory plugins). Now all plugin context goes into the current turn's user message — the system prompt stays identical across turns, preserving cached tokens. The system prompt is reserved for Hermes internals. Plugins contribute context alongside the user's input. Also adds comprehensive documentation for all 6 plugin hooks: pre_tool_call, post_tool_call, pre_llm_call, post_llm_call, on_session_start, on_session_end — each with full callback signatures, parameter tables, firing conditions, and examples. Supersedes #5138 which identified the same cache-busting bug and proposed an uncached system suffix approach. This fix goes further by removing system prompt injection entirely. Co-identified-by: OutThisLife (PR #5138)

teknium1 · 2026-04-04T23:55:54Z

Merged via PR #5146 which fixes the same cache-busting bug you identified here — plugin context from pre_llm_call hooks was breaking the prompt cache prefix every turn.

Your PR correctly diagnosed the problem and proposed injecting plugin context as an uncached system suffix block after cache markers. #5146 goes further by removing system prompt injection entirely — all plugin context now goes into the user message, so the system prompt never changes. Simpler contract, zero cache interference.

Thanks for catching this, Brooklyn — your PR validated the fix direction. Credit noted in the commit message.

…t cache (NousResearch#5146) Plugin context from pre_llm_call hooks was injected into the system prompt, breaking the prompt cache prefix every turn when content changed (typical for memory plugins). Now all plugin context goes into the current turn's user message — the system prompt stays identical across turns, preserving cached tokens. The system prompt is reserved for Hermes internals. Plugins contribute context alongside the user's input. Also adds comprehensive documentation for all 6 plugin hooks: pre_tool_call, post_tool_call, pre_llm_call, post_llm_call, on_session_start, on_session_end — each with full callback signatures, parameter tables, firing conditions, and examples. Supersedes NousResearch#5138 which identified the same cache-busting bug and proposed an uncached system suffix approach. This fix goes further by removing system prompt injection entirely. Co-identified-by: OutThisLife (PR NousResearch#5138)

…t cache Plugin context from pre_llm_call hooks was injected into the system prompt, breaking the prompt cache prefix every turn when content changed (typical for memory plugins). Now all plugin context goes into the current turn's user message — the system prompt stays identical across turns, preserving cached tokens. The system prompt is reserved for Hermes internals. Plugins contribute context alongside the user's input. Also adds comprehensive documentation for all 6 plugin hooks: pre_tool_call, post_tool_call, pre_llm_call, post_llm_call, on_session_start, on_session_end — each with full callback signatures, parameter tables, firing conditions, and examples. Supersedes NousResearch#5138 which identified the same cache-busting bug and proposed an uncached system suffix approach. This fix goes further by removing system prompt injection entirely. Co-identified-by: OutThisLife (PR NousResearch#5138)

…t cache (NousResearch#5146) Plugin context from pre_llm_call hooks was injected into the system prompt, breaking the prompt cache prefix every turn when content changed (typical for memory plugins). Now all plugin context goes into the current turn's user message — the system prompt stays identical across turns, preserving cached tokens. The system prompt is reserved for Hermes internals. Plugins contribute context alongside the user's input. Also adds comprehensive documentation for all 6 plugin hooks: pre_tool_call, post_tool_call, pre_llm_call, post_llm_call, on_session_start, on_session_end — each with full callback signatures, parameter tables, firing conditions, and examples. Supersedes NousResearch#5138 which identified the same cache-busting bug and proposed an uncached system suffix approach. This fix goes further by removing system prompt injection entirely. Co-identified-by: OutThisLife (PR NousResearch#5138)

fix: inject plugin context after cache markers to preserve Anthropic …

cc66b66

…prompt cache prefix stability

teknium1 mentioned this pull request Apr 4, 2026

fix: move pre_llm_call plugin context to user message, preserve prompt cache #5146

Merged

teknium1 closed this Apr 4, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: inject plugin context after cache markers to preserve Anthropic …#5138

fix: inject plugin context after cache markers to preserve Anthropic …#5138
OutThisLife wants to merge 1 commit into
mainfrom
feat/fix-plugin-cache-prefix

OutThisLife commented Apr 4, 2026

Uh oh!

teknium1 commented Apr 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

OutThisLife commented Apr 4, 2026

What does this PR do?

Related Issue

Type of Change

Changes Made

How to Test

Checklist

Code

Documentation & Housekeeping

Screenshots / Logs

Uh oh!

teknium1 commented Apr 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants