[Bug]: Active memory injection breaks prompt cache hit rate (99.9% → 22%) #91223
Open
Labels
P2: Normal backlog priority with limited blast radius.
clawsweeper:needs-live-repro: ClawSweeper needs live local, crabbox, or manual validation to confirm this issue.
clawsweeper:needs-maintainer-review: ClawSweeper marked this issue as needing maintainer review before automation.
clawsweeper:needs-product-decision: ClawSweeper marked this issue as needing a product or behavior decision.
clawsweeper:no-new-fix-pr: ClawSweeper does not recommend queueing a new automated fix PR for this issue.
impact:other: This issue has meaningful maintainer-visible impact outside the owned taxonomy.
issue-rating: 🐚 platinum hermit: Good issue quality with a plausible reproduction path needing some confirmation.
Phenomenon
Enabling the `active-memory` plugin causes prompt cache hit rate to collapse
from ~99.9% (clean baseline) to ~22% in production dashboard observations.
Reproduction across two Anthropic-compatible providers (independent test runs
by two coding agents) shows:
Root cause
The `active-memory` plugin injects recalled memories via the
`before_prompt_build` hook, returning content through
`hookResult.prependContext`. In the prompt-preparation layer, this context is
concatenated into the user message (not the system prompt).
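
To make the injection path concrete, here is a minimal sketch of how such a concatenation could look. Only the `prependContext` field and the hook name come from this issue; the `HookResult` shape, the `buildUserMessage` helper, and the separator are hypothetical and are not the project's actual code.

```typescript
// Hypothetical shape: only the `prependContext` field is named in the issue.
interface HookResult {
  prependContext?: string;
}

// Assumed behavior: recalled context is glued in front of the user's text,
// so it lands in the user message rather than in the system prompt.
function buildUserMessage(userPrompt: string, hookResult: HookResult): string {
  const prefix = hookResult.prependContext?.trim();
  return prefix ? `${prefix}\n\n${userPrompt}` : userPrompt;
}

// Two turns with different recall results yield different user-message text.
const firstTurn = buildUserMessage("What editor do I use?", {
  prependContext: "Recalled: user prefers Neovim.",
});
const secondTurn = buildUserMessage("Where do I live?", {
  prependContext: "Recalled: user lives in Lisbon.",
});
console.log(firstTurn !== secondTurn); // true
```

Under this assumption, every turn with a different recall result changes the start of the user-message block, not only its tail.
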
For every eligible conversational reply, the plugin spawns a blocking memory
sub-agent that runs `memory_search` to recall facts. The recall query is derived
from the current user message, so different user messages → different recalled
fact lists → character-level changes in `prependContext`.

Anthropic protocol `cache_control: {type: "ephemeral"}` markers are intended to
isolate the stable system-prompt block from the variable user-message block.
However, the observed behavior across two Anthropic-compatible providers is that
the `cache_control` boundary does not work as expected: any character-level change
in the user-message block causes the entire prompt to miss cache (0% hit rate),
instead of just the variable part.
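
For reference, the intended boundary can be sketched as a request with the marker on the system block, plus one way to measure the hit rate from response usage. The field names (`system`, `cache_control`, `input_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens`) follow the public Anthropic Messages API. The `buildRequest` and `cacheHitRate` helpers, the placeholder model ID, and the hit-rate formula are assumptions: this issue does not state how the dashboard computes its percentage, and a compatible provider may report usage differently.

```typescript
interface TextBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

interface Usage {
  input_tokens: number;                // uncached input tokens
  cache_creation_input_tokens: number; // tokens written to the cache
  cache_read_input_tokens: number;     // tokens served from the cache
}

// The marker sits on the stable system block; the variable user message
// (including any prepended recall text) comes after it.
function buildRequest(systemPrompt: string, userMessage: string) {
  const system: TextBlock[] = [
    { type: "text", text: systemPrompt, cache_control: { type: "ephemeral" } },
  ];
  return {
    model: "<model-id>", // placeholder, not a real model name
    max_tokens: 1024,
    system,
    messages: [{ role: "user" as const, content: userMessage }],
  };
}

// Assumed metric: share of all input tokens that were read from the cache.
function cacheHitRate(usages: Usage[]): number {
  let read = 0;
  let total = 0;
  for (const u of usages) {
    read += u.cache_read_input_tokens;
    total +=
      u.input_tokens + u.cache_creation_input_tokens + u.cache_read_input_tokens;
  }
  return total === 0 ? 0 : read / total;
}
```

Logging `cacheHitRate` per turn with `active-memory` on and off would show whether the system block is still being read from cache when only the user message changes.
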
Reproduction
`active-memory` with `config.agents: ["main"]`
conversational turns)
Code-level trace
prependContext injection site
In the prompt preparation layer, `hookResult.prependContext` is concatenated
directly before the user prompt:
active-memory hook returns prependContext
cache_control placement
memory_search tool registration
Suggested fixes
Document the limitation: At minimum, document that enabling `active-memory`
effectively disables prompt cache for any user message that includes variable
recalled content. This is silent and undocumented.
Stabilize prependContext output: Have the active-memory plugin produce
deterministic output for a given session (e.g. fixed ordering, fixed truncation,
hash-based fingerprint in place of timestamps) so character-level changes don't
propagate to the user-message block.
Skip active-memory for cache-critical prompts: Allow callers to mark
"cache-critical" prefixes that bypass active-memory injection.
Tighten cache_control boundary: Make the Anthropic cache boundary respect
the prependContext vs system-prompt split so that stable system-prompt content
is cached independently of variable user-message content.
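
As an illustration of the "Stabilize prependContext output" option, the sketch below normalizes a list of recalled facts into a deterministic string. The `stablePrependContext` helper, its parameters, and the bullet format are hypothetical and are not part of the plugin. It only removes noise from ordering, whitespace, and duplicates; if the set of recalled facts itself changes between turns, the output still changes.

```typescript
// Hypothetical helper: same set of facts in, same string out,
// regardless of recall order, duplicates, or stray whitespace.
function stablePrependContext(
  facts: string[],
  maxFacts = 5,
  maxChars = 200,
): string {
  const normalized = facts
    .map((f) => f.trim().replace(/\s+/g, " ")) // collapse whitespace
    .filter((f) => f.length > 0)               // drop empty entries
    .map((f) => f.slice(0, maxChars));         // fixed per-fact truncation
  // Deduplicate, then sort so the order does not depend on recall order.
  const unique = Array.from(new Set(normalized)).sort();
  return unique
    .slice(0, maxFacts)
    .map((f) => `- ${f}`)
    .join("\n");
}

console.log(stablePrependContext(["b  fact", "a fact", "a fact "]));
```

Sorting before truncating keeps the result deterministic for a given set of facts, at the cost of dropping facts by sort order rather than by relevance; a real implementation would need to weigh that trade-off.
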
Workaround
Disable `active-memory` (set `enabled: false`) and use `memory_search` explicitly
when needed. Cache hit rate returns to ~99.9%.
Impact
Any user with `active-memory` enabled and an Anthropic-compatible provider that
supports prompt caching will see cache hit rate collapse. This is silent (no
warning, no log entry) and undocumented. Common configuration, common failure mode.