
[Bug]: Prompt cache constantly invalidated — cacheWrite dominates over cacheRead, causing 10x cost increase #19989

@qonnectv

Description


Summary

Since upgrading to 2026.2.17, prompt caching appears broken. The system prompt (~20k tokens) is re-written to cache on nearly every API call instead of being read from cache. This causes costs of $0.35-0.50 per call instead of the expected $0.02-0.04.

Evidence
Same setup, same config, same workspace files:

Before update (2026.2.15, Feb 15-17):

  • Baseline: ~20k tokens
  • Cost per call: $0.01-0.08
  • cacheRead dominates after the first call

After update (2026.2.17, Feb 18):

  • Baseline: ~29k tokens (+50%)
  • Cost per call: $0.04-0.51
  • cacheWrite dominates; the cache is constantly invalidated

Raw data from the current session (most recent calls):
in=87,447 out=0 cost=$0.5116 ← $0.51 for ZERO output
in=88,418 out=0 cost=$0.0607
in=89,061 out=0 cost=$0.5007 ← $0.50 for ZERO output
in=89,355 out=0 cost=$0.5095 ← $0.50 for ZERO output

Calls with out=0 that still cost about $0.50 suggest the full context is being sent and written to the cache without producing any output.
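
To make the two cost regimes in the raw data easier to compare, each call's cost can be normalized by its input size. The sketch below only does arithmetic on the four lines quoted above; the regex and the function name `effective_rates` are illustrative and not part of OpenClaw.

```python
import re

# The four usage lines quoted in the report (arrow annotations dropped).
RAW = """\
in=87,447 out=0 cost=$0.5116
in=88,418 out=0 cost=$0.0607
in=89,061 out=0 cost=$0.5007
in=89,355 out=0 cost=$0.5095
"""

# Matches "in=<tokens> out=<tokens> cost=$<usd>" with optional thousands separators.
LINE_RE = re.compile(r"in=([\d,]+)\s+out=([\d,]+)\s+cost=\$([\d.]+)")


def effective_rates(raw: str) -> list:
    """Return the cost in USD per million input tokens for each usage line."""
    rates = []
    for match in LINE_RE.finditer(raw):
        tokens_in = int(match.group(1).replace(",", ""))
        cost = float(match.group(3))
        rates.append(cost / tokens_in * 1_000_000)
    return rates


rates = effective_rates(RAW)
for rate in rates:
    print(f"${rate:.2f} per million input tokens")
```

Three of the calls come out between roughly $5.6 and $5.9 per million input tokens, and one comes out just under $0.70. That gap would be consistent with the first group being billed mostly as cache writes and the second mostly as cache reads, although the per-call cacheRead/cacheWrite breakdown is needed to confirm it.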

Environment

  • OpenClaw: 2026.2.17 (upgraded from 2026.2.15 on the same day)
  • OS: Ubuntu 24.04 LTS (AWS EC2 t3.medium)
  • Model: anthropic/claude-opus-4-6 via OAuth (Claude MAX)
  • contextTokens: 400,000
  • contextPruning: cache-ttl, 1h
  • Workspace files: ~42KB injected as system prompt

Possibly related

  • 2026.2.17 changelog: "Skills: compact skill file paths in system prompt". Could this change the prompt structure enough to invalidate caches?
  • #17589: Gateway restart retry bug (amplified cost)
  • #14543: Compaction zero fault tolerance
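
One way to test the prompt-structure hypothesis: as far as I understand, Anthropic's prompt cache matches on an exact prefix, so any change early in the system prompt forces everything after it to be re-written. The sketch below fingerprints the system prompt for two consecutive calls and reports where they first diverge. The helper names and the sample prompts (including the made-up `Now:` line) are hypothetical; how to capture the actual prompt OpenClaw sends is not covered here.

```python
import hashlib
from typing import Optional


def prompt_fingerprint(system_prompt: str) -> str:
    """Short SHA-256 fingerprint of the exact text sent as the system prompt."""
    return hashlib.sha256(system_prompt.encode("utf-8")).hexdigest()[:16]


def first_difference(previous: str, current: str) -> Optional[int]:
    """Index of the first differing character, or None if the prompts are identical."""
    for index, (a, b) in enumerate(zip(previous, current)):
        if a != b:
            return index
    if len(previous) != len(current):
        # One prompt is a strict prefix of the other.
        return min(len(previous), len(current))
    return None


# Hypothetical prompts from two consecutive calls (illustrative content only).
call_1 = "You are an assistant.\nSkills: ~/skills/a.md\nNow: 10:00:01\n"
call_2 = "You are an assistant.\nSkills: ~/skills/a.md\nNow: 10:00:07\n"

same = prompt_fingerprint(call_1) == prompt_fingerprint(call_2)
offset = first_difference(call_1, call_2)
print("identical:", same)
if offset is not None:
    print("first difference at character", offset)
    print("context:", repr(call_2[max(0, offset - 20):offset + 20]))
```

If the fingerprint is stable across calls within one session, the system prompt itself is probably not what invalidates the cache, and the cause would have to be elsewhere in the request.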

Steps to reproduce

  1. Upgrade from 2026.2.15 to 2026.2.17
  2. Use anthropic/claude-opus-4-6 with OAuth (Claude MAX)
  3. Chat normally in a Discord channel session
  4. Observe cacheWrite >> cacheRead in the session JSONL usage entries
  5. Compare cost per call with pre-upgrade sessions
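
The cacheWrite versus cacheRead comparison can be tallied with a few lines of Python. This is a minimal sketch that assumes each JSONL line is an object with a `usage` object holding integer `cacheRead` and `cacheWrite` fields; the exact schema and the session file location are assumptions and may differ in your install.

```python
import json
from typing import Iterable


def cache_totals(lines: Iterable[str]) -> dict:
    """Sum cacheRead and cacheWrite over JSONL lines that carry a usage object."""
    totals = {"cacheRead": 0, "cacheWrite": 0}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        usage = entry.get("usage") if isinstance(entry, dict) else None
        if not isinstance(usage, dict):
            continue  # entry has no usage data
        for key in totals:
            value = usage.get(key, 0)
            if isinstance(value, int):
                totals[key] += value
    return totals


# Made-up sample entries in the assumed shape, not real session data.
sample = [
    '{"usage": {"cacheRead": 0, "cacheWrite": 55000}}',
    '{"usage": {"cacheRead": 54000, "cacheWrite": 1200}}',
    '{"type": "message"}',
]
totals = cache_totals(sample)
print(totals)
```

To run it against a real session, pass an open file instead of the sample list, for example `with open(path) as f: print(cache_totals(f))`. Running it on one pre-upgrade and one post-upgrade session gives a direct comparison.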

Expected behavior

After the first call establishes the cache, subsequent calls should mostly use cacheRead for the system prompt (~20k tokens), with only incremental cacheWrite for new messages.

Actual behavior

Cache appears to be invalidated on most calls. cacheWrite of 50-60k tokens happens repeatedly, even when the system prompt hasn't changed. Cost per call is 10x higher than on 2026.2.15.
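
The roughly 10x figure is about what the pricing structure predicts. As I understand Anthropic's published prompt-caching pricing, a cache write costs 1.25x the base input price (2x for the 1-hour TTL) and a cache read costs 0.1x. The sketch below applies those multipliers to a 55k-token prefix, the midpoint of the 50-60k range above. The multipliers and the base price of $5 per million tokens are assumptions to verify against the current price list; the write-to-read ratio does not depend on the base price.

```python
# Multipliers relative to the base input price (assumptions; verify before relying on them).
CACHE_WRITE_5M = 1.25  # cache write, 5-minute TTL
CACHE_WRITE_1H = 2.0   # cache write, 1-hour TTL
CACHE_READ = 0.1       # cache read


def cost_usd(tokens: int, base_per_mtok: float, multiplier: float) -> float:
    """Cost in USD for `tokens` billed at base price times a cache multiplier."""
    return tokens / 1_000_000 * base_per_mtok * multiplier


BASE_PER_MTOK = 5.0  # assumed base input price in USD per million tokens
PREFIX_TOKENS = 55_000  # midpoint of the 50-60k cacheWrite range in the report

write_cost = cost_usd(PREFIX_TOKENS, BASE_PER_MTOK, CACHE_WRITE_5M)
read_cost = cost_usd(PREFIX_TOKENS, BASE_PER_MTOK, CACHE_READ)

print(f"re-written every call: ${write_cost:.3f}")
print(f"read from cache:       ${read_cost:.3f}")
print(f"ratio:                 {write_cost / read_cost:.1f}x")
```

Under these assumptions a re-written 55k prefix costs about $0.34 per call against about $0.03 when read from cache, a 12.5x difference, which lines up with the $0.35-0.50 versus $0.02-0.04 figures in the summary.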

OpenClaw version

2026.2.17

Operating system

Ubuntu 24.04 LTS

Install method

No response

Logs, screenshots, and evidence

Impact and severity

No response

Additional information

No response
