Anthropic prompt cache misses on most turns in auto-reply path (high cacheWrite, zero cacheRead) #18963

@mooritzvc

Note from the human: I investigated this in depth using Codex, iterating over all possible configs to confirm that this is not an isolated issue caused by my own settings changes.

Summary

Anthropic prompt caching appears to miss on most turns in auto-reply flows.

Observed pattern across many adjacent turns:

  • cacheWrite very high (~137k)
  • cacheRead = 0
  • only a small number of fresh (uncached) input tokens

As a result, nearly the whole prompt is written to the cache again on each turn, which is expensive.
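The pattern above can be checked mechanically when comparing runs. The following is a minimal sketch, not OpenClaw code: `cacheRead` and `cacheWrite` are the counters named in this report, while the `TurnUsage` shape, the `input` field, the helper names, and the sample numbers are assumptions made up for illustration.

```typescript
// Hypothetical per-turn usage record; only cacheRead/cacheWrite are names from this report.
interface TurnUsage {
  input: number; // fresh (uncached) input tokens
  cacheRead: number; // tokens served from the prompt cache
  cacheWrite: number; // tokens written to the prompt cache
}

// Fraction of all prompt tokens that were served from cache on this turn.
function cacheHitRatio(u: TurnUsage): number {
  const total = u.input + u.cacheRead + u.cacheWrite;
  return total === 0 ? 0 : u.cacheRead / total;
}

// Flags the reported pattern: nothing read from cache, but a large write.
// The minWrite threshold is an arbitrary illustrative default.
function isFullRecache(u: TurnUsage, minWrite = 1000): boolean {
  return u.cacheRead === 0 && u.cacheWrite >= minWrite;
}

// Illustrative numbers only, shaped like the pattern described above.
const turns: TurnUsage[] = [
  { input: 12, cacheRead: 0, cacheWrite: 137000 },
  { input: 15, cacheRead: 0, cacheWrite: 137200 },
  { input: 9, cacheRead: 137200, cacheWrite: 300 },
];

// Wrap in an arrow function so filter's index argument is not passed as minWrite.
const flagged = turns.filter((t) => isFullRecache(t));
console.log(`${flagged.length} of ${turns.length} turns re-cached the full prompt`);
```

A per-turn hit ratio near zero on adjacent turns within the TTL is the signal this issue is about.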

Environment

  • OpenClaw: 2026.2.16 (db3480f)
  • Provider: direct Anthropic (not OpenRouter)
  • Model: anthropic/claude-sonnet-4-5-20250929
  • Install: local CLI install
  • Channel observed: Telegram auto-reply (code path appears shared across other channel adapters too)

Relevant config

  • active Anthropic model has params.cacheRetention: "long"
  • agents.defaults.contextPruning.mode: "cache-ttl"
  • agents.defaults.contextPruning.ttl: "55m"
  • agents.defaults.contextPruning.minPrunableToolChars: 50000
  • agents.defaults.contextPruning.softTrim.maxChars: 4000
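For readers who prefer the nested form, the dotted `contextPruning` paths above expand to the object below. This is only a sketch: the constant name is hypothetical, the actual config file format is not shown in this report, and `params.cacheRetention` is left out because its parent path under the model entry is not given here.

```typescript
// Nested view of the dotted keys listed above; the values are copied from this report.
const contextPruningConfig = {
  agents: {
    defaults: {
      contextPruning: {
        mode: "cache-ttl",
        ttl: "55m",
        minPrunableToolChars: 50000,
        softTrim: {
          maxChars: 4000,
        },
      },
    },
  },
};

console.log(JSON.stringify(contextPruningConfig, null, 2));
```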

Expected

Within TTL, adjacent turns should show meaningful cacheRead for unchanged prompt-prefix segments.

Actual

Most turns show near-full cacheWrite and cacheRead=0.

Additional runtime signal

There are occasional cache-hit continuations immediately after tool/result boundaries, but normal subsequent user turns return to cacheRead=0 + large cacheWrite. That suggests caching is not globally disabled, but prefix reuse is unstable across regular turns.

Suspected cause

Auto-reply builds extra system prompt content from inbound metadata:

  • src/auto-reply/reply/get-reply-run.ts
  • src/auto-reply/reply/inbound-meta.ts

The inbound metadata includes volatile per-message fields:

  • message_id
  • reply_to_id
  • history_count

If these fields are rendered into the cached prefix, the prefix changes on every inbound turn, which defeats cache reuse.
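To illustrate the suspicion, here is a small sketch of why placement matters for a prefix-based cache. It is a simplified character-level model, not how Anthropic matches cache blocks and not the actual output of `inbound-meta.ts`: the `InboundMeta` shape, `renderMeta`, both builder functions, and the sample values are hypothetical. Only the three field names come from this report.

```typescript
// Hypothetical shape holding the volatile per-message fields named above.
interface InboundMeta {
  message_id: string;
  reply_to_id: string | null;
  history_count: number;
}

// Hypothetical renderer; the real inbound metadata block may look different.
function renderMeta(meta: InboundMeta): string {
  return `[inbound] ${JSON.stringify(meta)}`;
}

// Layout A: volatile metadata is rendered ahead of the stable instructions.
function buildMetaFirst(stable: string, meta: InboundMeta): string {
  return `${renderMeta(meta)}\n${stable}`;
}

// Layout B: stable instructions come first, volatile metadata is the tail.
function buildMetaLast(stable: string, meta: InboundMeta): string {
  return `${stable}\n${renderMeta(meta)}`;
}

// Number of leading characters two prompts share.
function commonPrefixLength(a: string, b: string): number {
  const limit = Math.min(a.length, b.length);
  let i = 0;
  while (i < limit && a[i] === b[i]) i++;
  return i;
}

// Stand-in for a large, unchanged system prompt.
const stable = "You are a helpful assistant. ".repeat(100);
const turn1: InboundMeta = { message_id: "1001", reply_to_id: null, history_count: 4 };
const turn2: InboundMeta = { message_id: "1002", reply_to_id: "1001", history_count: 5 };

const sharedMetaFirst = commonPrefixLength(
  buildMetaFirst(stable, turn1),
  buildMetaFirst(stable, turn2),
);
const sharedMetaLast = commonPrefixLength(
  buildMetaLast(stable, turn1),
  buildMetaLast(stable, turn2),
);

console.log({ sharedMetaFirst, sharedMetaLast, stableLength: stable.length });
```

In this toy model, layout A shares only a few leading characters between adjacent turns, while layout B shares the whole stable block. Whether the real prompt resembles layout A is exactly what this issue asks maintainers to confirm.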

Request

Please confirm whether inbound trusted metadata intended for routing/reactions should remain in the cached system-prompt prefix, or whether it should be segmented or moved so that prompt caching stays stable across normal adjacent turns.

Happy to test candidate fixes and report before/after cacheRead/cacheWrite.
