Description
Summary
OpenClaw currently does not persist or expose Anthropic's prompt cache metrics (cache_read_input_tokens, cache_creation_input_tokens) anywhere accessible to users or operators. These metrics are returned by the API on every call but are silently discarded.
Motivation
With the introduction of v2026.2.17 and larger system prompts (skills, workspace injection), prompt cache efficiency has become a meaningful cost and performance factor. Issue #19989 demonstrates a real-world case where cache invalidation caused a 10x cost increase that went undetected for multiple calls — because there was no way to observe cache behavior.
Currently:
- `openclaw sessions list --json` exposes `inputTokens`, `outputTokens`, `totalTokens` — but no cache breakdown
- Gateway logs contain no cache metrics
- `openclaw memory status` and `openclaw models status` have no cache information
- There is no way to determine whether the system prompt is being cache-hit or re-written
Proposed Changes
1. Session Store — add cache fields
```json
{
  "key": "agent:main:main",
  "inputTokens": 8,
  "outputTokens": 1795,
  "totalTokens": 149291,
  "cacheReadTokens": 120000,
  "cacheWriteTokens": 5000,
  "cacheHitRate": 0.96
}
```

2. Gateway logs — include cache stats per request
```
[agent/main] turn complete input=8k output=1.8k cacheRead=120k cacheWrite=5k hitRate=96%
```
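The derivation from Anthropic's usage payload is small. A sketch in TypeScript — the `Usage` field names follow Anthropic's Messages API; `summarizeCacheUsage`, `kfmt`, and `formatTurnLog` are illustrative names, not existing OpenClaw functions, and the exact log format is a suggestion:

```typescript
// Usage fields as returned in Anthropic Messages API responses.
interface Usage {
  input_tokens: number;
  output_tokens: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
}

interface CacheStats {
  cacheReadTokens: number;
  cacheWriteTokens: number;
  // cacheRead / (cacheRead + cacheWrite); 0 when there is no cache traffic.
  cacheHitRate: number;
}

function summarizeCacheUsage(u: Usage): CacheStats {
  const read = u.cache_read_input_tokens ?? 0;
  const write = u.cache_creation_input_tokens ?? 0;
  const denom = read + write;
  return {
    cacheReadTokens: read,
    cacheWriteTokens: write,
    cacheHitRate: denom === 0 ? 0 : read / denom,
  };
}

// Compact "1.8k"/"120k" rendering for log lines and CLI output.
function kfmt(n: number): string {
  if (n < 1000) return String(n);
  return `${(n / 1000).toFixed(n < 10000 ? 1 : 0).replace(/\.0$/, "")}k`;
}

function formatTurnLog(agent: string, u: Usage): string {
  const s = summarizeCacheUsage(u);
  return `[${agent}] turn complete input=${kfmt(u.input_tokens)} ` +
    `output=${kfmt(u.output_tokens)} cacheRead=${kfmt(s.cacheReadTokens)} ` +
    `cacheWrite=${kfmt(s.cacheWriteTokens)} ` +
    `hitRate=${Math.round(s.cacheHitRate * 100)}%`;
}
```

With the example numbers above (120k read, 5k write), this yields a hit rate of 120000 / 125000 = 0.96, matching the `cacheHitRate` shown in the session store sketch.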
3. openclaw sessions list output
Show cache efficiency alongside token counts:
```
agent:main:main   claude-sonnet-4-6   149k tokens   cache: 96% hit
```
4. Optional: warn on low cache hit rate
If `cacheRead / (cacheRead + cacheWrite)` stays below 0.5 for more than N consecutive calls, emit a WARN log suggesting that cache invalidation may be occurring.
Impact
- Enables operators to detect cache regressions immediately (cf. [Bug]: Prompt cache constantly invalidated — cacheWrite dominates over cacheRead, causing 10x cost increase #19989)
- Provides visibility into cost efficiency per session
- Useful for debugging system prompt changes that break caching
- Low implementation cost — data is already present in API responses, just needs to be forwarded
Providers
Initially Anthropic only (it exposes `cache_read_input_tokens` / `cache_creation_input_tokens`). Can be extended to other providers as they add cache support.
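To keep the session store schema stable as providers are added, the gateway could normalize each provider's usage payload behind one interface. A sketch of the Anthropic adapter (the interface and function names are hypothetical; only the two Anthropic field names come from their API):

```typescript
// Provider-agnostic cache metrics; each provider adapter maps its own
// usage payload into this shape, or returns null if it has no cache data.
interface NormalizedCacheUsage {
  cacheReadTokens: number;
  cacheWriteTokens: number;
}

// Anthropic adapter: field names per the Messages API usage object.
function fromAnthropicUsage(
  usage: Record<string, unknown>,
): NormalizedCacheUsage | null {
  const read = usage["cache_read_input_tokens"];
  const write = usage["cache_creation_input_tokens"];
  // Older responses may omit both fields entirely; report "no cache data"
  // rather than a fabricated 0% hit rate.
  if (typeof read !== "number" && typeof write !== "number") return null;
  return {
    cacheReadTokens: typeof read === "number" ? read : 0,
    cacheWriteTokens: typeof write === "number" ? write : 0,
  };
}
```

A future provider would add its own `fromXUsage` adapter; everything downstream (session store, logs, CLI) consumes only `NormalizedCacheUsage`.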