Description
Summary
Since upgrading to 2026.2.17, prompt caching appears to be broken. The system prompt (~20k tokens) is rewritten to the cache on nearly every API call instead of being read from it. As a result, calls cost $0.35-0.50 each instead of the expected $0.02-0.04.
Evidence
Same setup, same config, same workspace files:
Before update (2026.2.15, Feb 15-17):
- Baseline: ~20k tokens
- Cost per call: $0.01-0.08
- cacheRead dominates after first call
After update (2026.2.17, Feb 18):
- Baseline: ~29k tokens (+50%)
- Cost per call: $0.04-0.51
- cacheWrite dominates: cache is constantly invalidated
Raw data from current session (last 5 calls):
in=87,447 out=0 cost=$0.5116 ← $0.51 for ZERO output
in=88,418 out=0 cost=$0.0607
in=89,061 out=0 cost=$0.5007 ← $0.50 for ZERO output
in=89,355 out=0 cost=$0.5095 ← $0.50 for ZERO output
Calls with out=0 and a cost of about $0.50 suggest that the full context is being sent and cache-written without producing any output.
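To sanity-check why the same ~88k-token context can cost either ~$0.05 or ~$0.50, here is a minimal sketch of the per-call arithmetic. The `Rates` values are placeholders I picked for illustration (not official pricing), and the token splits are hypothetical, not taken from my logs; substitute the provider's published rates and the real usage breakdown.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    """USD per million tokens. Placeholder values, NOT official pricing."""
    uncached_input: float
    cache_write: float
    cache_read: float
    output: float


def estimate_cost(uncached_in: int, cache_write: int, cache_read: int,
                  out: int, rates: Rates) -> float:
    """Estimate the cost of one call from its token breakdown."""
    per_million = 1_000_000
    total = (uncached_in * rates.uncached_input
             + cache_write * rates.cache_write
             + cache_read * rates.cache_read
             + out * rates.output)
    return total / per_million


# Hypothetical rates: cache writes priced far above cache reads.
rates = Rates(uncached_input=5.0, cache_write=10.0, cache_read=0.5, output=25.0)

# Two hypothetical splits of the same 88k-token context, zero output tokens.
mostly_read = estimate_cost(0, 1_000, 87_000, 0, rates)
mostly_written = estimate_cost(0, 50_000, 38_000, 0, rates)

print(f"mostly read:    ${mostly_read:.4f}")
print(f"mostly written: ${mostly_written:.4f}")
```

Under these assumed rates, the split between cache reads and cache writes alone is enough to move a zero-output call by roughly an order of magnitude, which matches the pattern in the raw data above.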
Environment
- OpenClaw: 2026.2.17 (upgraded from 2026.2.15 same day)
- OS: Ubuntu 24.04 LTS (AWS EC2 t3.medium)
- Model: anthropic/claude-opus-4-6 via OAuth (Claude MAX)
- contextTokens: 400,000
- contextPruning: cache-ttl, 1h
- Workspace files: ~42KB injected as system prompt
Possibly related
- 2026.2.17 changelog: "Skills: compact skill file paths in system prompt". Could this change the prompt structure enough to invalidate caches?
- #17589: Gateway restart retry bug (amplified cost)
- #14543: Compaction zero fault tolerance
Steps to reproduce
1. Upgrade from 2026.2.15 to 2026.2.17
2. Use anthropic/claude-opus-4-6 with OAuth (Claude MAX)
3. Chat normally in a Discord channel session
4. Observe cacheWrite >> cacheRead in session JSONL usage entries
5. Compare cost per call with pre-upgrade sessions
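For anyone trying to reproduce this, a rough sketch for tallying cache usage from a session JSONL file follows. It assumes each line is a JSON object with a `usage` object holding numeric `cacheRead` and `cacheWrite` fields; that layout is my guess at the schema, so adjust the keys to whatever your session files actually contain. The sample lines are made up.

```python
import json


def tally_cache_usage(lines):
    """Sum cacheRead/cacheWrite across JSONL lines (assumed field names)."""
    totals = {"calls": 0, "cacheRead": 0, "cacheWrite": 0, "writeHeavy": 0}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        usage = entry.get("usage") if isinstance(entry, dict) else None
        if not isinstance(usage, dict):
            continue  # skip entries without usage data
        read = int(usage.get("cacheRead") or 0)
        write = int(usage.get("cacheWrite") or 0)
        totals["calls"] += 1
        totals["cacheRead"] += read
        totals["cacheWrite"] += write
        if write > read:
            totals["writeHeavy"] += 1  # call where writes exceed reads
    return totals


# Made-up sample entries in the assumed shape.
sample = [
    '{"usage": {"cacheRead": 0, "cacheWrite": 52000}}',
    '{"usage": {"cacheRead": 51000, "cacheWrite": 900}}',
    '{"usage": {"cacheRead": 0, "cacheWrite": 55000}}',
    'not json',
    '{"type": "message"}',
]
totals = tally_cache_usage(sample)
print(totals)
```

The function accepts any iterable of lines, so an open file handle for a session file can be passed directly. A healthy session should show `writeHeavy` close to 1 (only the first call); a value near `calls` points to repeated invalidation.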
Expected behavior
After the first call establishes the cache, subsequent calls should mostly use cacheRead for the system prompt (~20k tokens), with only incremental cacheWrite for new messages.
Actual behavior
The cache appears to be invalidated on most calls. A cacheWrite of 50-60k tokens happens repeatedly, even when the system prompt hasn't changed. Cost per call is roughly 10x higher than on 2026.2.15 ($0.35-0.50 versus $0.02-0.04).
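One way to test the "system prompt hasn't changed" assumption is to compare the prompt text sent on two consecutive calls. Since prompt caching matches on the prefix, even a small early difference would force everything after it to be rewritten. The sketch below assumes the two prompts have already been captured as strings (how to capture them from OpenClaw is not covered here), and the example prompts are invented.

```python
import hashlib


def common_prefix_len(a: str, b: str) -> int:
    """Return the number of leading characters shared by a and b."""
    limit = min(len(a), len(b))
    for i in range(limit):
        if a[i] != b[i]:
            return i
    return limit


def prompt_fingerprint(prompt: str) -> str:
    """Short SHA-256 fingerprint for quick equality checks in logs."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:12]


# Invented prompts: identical except for how one skill path is written.
previous = "You are an agent.\nSkills: /home/user/skills/a.md\nRules: be brief."
current = "You are an agent.\nSkills: ~/skills/a.md\nRules: be brief."

shared = common_prefix_len(previous, current)
if prompt_fingerprint(previous) == prompt_fingerprint(current):
    print("prompts are identical")
else:
    print(f"prompts diverge at character {shared}: {current[shared:shared + 20]!r}")
```

If the fingerprints differ between consecutive calls, the divergence offset shows which part of the prompt is unstable. If they match, the invalidation is more likely coming from somewhere other than the system prompt text.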
OpenClaw version
2026.2.17
Operating system
Ubuntu 24.04 LTS
Install method
No response
Logs, screenshots, and evidence
Impact and severity
No response
Additional information
No response