How effective is Claude Code's prompt caching? Shows cache hit ratio and illustrative API cost savings across your session history.
cc-cache — Claude Code cache efficiency
Sessions: 755
Total tokens: 23.2B
Input (fresh): 8.7M tokens (0.0%)
Cache write: 783.7M tokens (3.4%)
Cache read: 22.4B tokens (96.5%)
Output: 20.1M tokens (0.1%)
Cache hit ratio: 96.6%
███████████████████████░
At API rates (illustrative):
Without cache: $293.1K
With cache: $41.6K
Saved: $251.5K (85.8% saved)
ⓘ Claude Max plan is flat-rate. Costs above are illustrative API equivalents.
npx cc-cache # Cache efficiency and illustrative savings
npx cc-cache --all # Include subagent session files
npx cc-cache --json # JSON output

- Cache hit ratio — percentage of input tokens served from cache vs. freshly processed
- Token breakdown — input / cache write / cache read / output split
- Illustrative savings — what prompt caching would save at Claude API rates
- Monthly efficiency — month-by-month cache performance
- By project — which projects benefit most from caching
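
The cache hit ratio can be sketched in a few lines. This is a minimal illustration, not cc-cache's actual code: it assumes the ratio is cache-read tokens divided by all input-side tokens (fresh input + cache write + cache read), which is the reading that reproduces the 96.6% in the sample output from its rounded figures. The function name `cache_hit_ratio` is hypothetical.

```python
def cache_hit_ratio(fresh_input: int, cache_write: int, cache_read: int) -> float:
    """Share of input-side tokens served from cache, as a fraction in [0, 1].

    Assumption: the denominator is fresh input + cache write + cache read.
    """
    total_input = fresh_input + cache_write + cache_read
    if total_input == 0:
        return 0.0  # no input tokens recorded, so avoid dividing by zero
    return cache_read / total_input


# Rounded token counts taken from the sample output above.
ratio = cache_hit_ratio(
    fresh_input=8_700_000,
    cache_write=783_700_000,
    cache_read=22_400_000_000,
)
print(f"Cache hit ratio: {ratio:.1%}")
```

Output tokens are left out of the ratio here because the definition above only talks about input tokens.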
Claude Code on the Max plan is flat-rate, so you don't pay per token. The cost figures use Claude API Opus pricing as an illustrative reference to show the value of prompt caching.
The methodology:
- Without cache: all input tokens billed at full input rate
- With cache: cache_read tokens at 10% of input rate, cache_write at 125% of input rate
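
The two cases above can be written out as a small calculation. This is a sketch under stated assumptions, not the tool's implementation: the `Usage` class and function names are hypothetical, the input rate is passed in as a parameter rather than hard-coded, fresh input is assumed to be billed at the full rate in both cases, and output tokens are ignored because the methodology above only describes input-side billing.

```python
from dataclasses import dataclass

CACHE_READ_MULTIPLIER = 0.10   # cache_read billed at 10% of the input rate
CACHE_WRITE_MULTIPLIER = 1.25  # cache_write billed at 125% of the input rate


@dataclass
class Usage:
    """Input-side token counts (hypothetical container for this sketch)."""
    fresh_input: int
    cache_write: int
    cache_read: int


def cost_without_cache(usage: Usage, input_rate_per_mtok: float) -> float:
    """Every input-side token billed at the full input rate."""
    total = usage.fresh_input + usage.cache_write + usage.cache_read
    return total / 1_000_000 * input_rate_per_mtok


def cost_with_cache(usage: Usage, input_rate_per_mtok: float) -> float:
    """Cache reads and writes billed at their multipliers.

    Assumption: fresh input stays at the full input rate.
    """
    weighted = (
        usage.fresh_input
        + CACHE_WRITE_MULTIPLIER * usage.cache_write
        + CACHE_READ_MULTIPLIER * usage.cache_read
    )
    return weighted / 1_000_000 * input_rate_per_mtok


# Made-up token counts and a placeholder rate (dollars per million tokens);
# substitute the current published input rate for a real estimate.
example = Usage(fresh_input=1_000_000, cache_write=2_000_000, cache_read=10_000_000)
rate = 10.0
without = cost_without_cache(example, rate)
with_cache = cost_with_cache(example, rate)
saved = without - with_cache
print(f"Without cache: ${without:,.2f}")
print(f"With cache:    ${with_cache:,.2f}")
print(f"Saved:         ${saved:,.2f} ({saved / without:.1%} saved)")
```

Because reads cost a tenth of the input rate while writes cost only a quarter more, the savings grow quickly once the same cached prefix is read more than a couple of times.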
cc-cache reads your session files only to extract token usage metadata. No content is transmitted, and everything runs locally.
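
To make the local-only extraction concrete, here is a rough sketch of pulling token usage out of session records. It rests on assumptions that are not confirmed by this README: that session files are JSON Lines, and that assistant records carry a `message.usage` object with the field names used by the Claude API (`input_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens`, `output_tokens`). The helper names `sum_usage` and `scan_sessions` are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Iterable

# Assumed usage field names (taken from the Claude API usage object).
USAGE_FIELDS = (
    "input_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
    "output_tokens",
)


def sum_usage(lines: Iterable[str]) -> Counter:
    """Add up token counts from JSON Lines records; ignore everything else."""
    totals: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed lines
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        usage = message.get("usage") if isinstance(message, dict) else None
        if not isinstance(usage, dict):
            continue  # records without usage metadata are skipped
        for field in USAGE_FIELDS:
            value = usage.get(field)
            if isinstance(value, int):
                totals[field] += value
    return totals


def scan_sessions(root: Path) -> Counter:
    """Sum usage across every .jsonl file under root (assumed layout)."""
    totals: Counter = Counter()
    for path in root.rglob("*.jsonl"):
        with path.open(encoding="utf-8", errors="replace") as handle:
            totals.update(sum_usage(handle))
    return totals


# In-memory demo with synthetic records; only the usage numbers are read.
sample = [
    '{"type": "user", "message": {"content": "hello"}}',
    '{"type": "assistant", "message": {"usage": {"input_tokens": 12, '
    '"cache_creation_input_tokens": 300, "cache_read_input_tokens": 4000, '
    '"output_tokens": 50}}}',
    "not json at all",
]
print(dict(sum_usage(sample)))
```

Pointing `scan_sessions` at a directory such as `Path.home() / ".claude"` would total the same fields across all session files found there; message text is parsed but never stored or sent anywhere.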
Drop your ~/.claude folder into cc-cache on the web — no install required. Note: large histories (5+ GB) may take 1–2 minutes to process.
Part of cc-toolkit — 104 free tools for Claude Code