fix: session_status context tracking undercount for cached providers #22387
Closed
1ucian wants to merge 1 commit into openclaw:main from
Conversation
Two changes that fix inaccurate context reporting in the session_status tool:

1. Enable transcript-based usage lookup for the session_status tool (session-status-tool.ts: includeTranscriptUsage false → true).

The session entry's totalTokens is only updated at the END of each turn. When session_status is called mid-turn (which is always, since it's a tool), it reads a stale value from the previous turn. With includeTranscriptUsage enabled, it falls back to the session transcript, which has the actual API response usage including cache_read tokens. For Anthropic models with prompt caching, the true context size is input + cacheRead + cacheWrite (via derivePromptTokens). Without transcript lookup, session_status shows the previous turn's snapshot, which can be dramatically smaller (e.g. 28k vs 185k actual).

2. Optimize readUsageFromSessionLog to read only the tail of the log (status.ts: read the last 8KB instead of the entire file).

Since we only need the last usage entry, reading the full session log (which can grow to many MB) is wasteful. The tail-read approach reads the final 8KB and skips the first, potentially partial, line.

Fixes context tracking undercount for Anthropic/Claude models where prompt caching causes usage.input to be very small (often 1-10 tokens) while cache_read_input_tokens holds the actual context size.

Related: openclaw#17799, openclaw#16079
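The tail-read approach in change 2 can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the actual status.ts code: the function name, the 8KB constant, and the JSONL log layout are illustrative.

```typescript
// Hypothetical sketch of a tail read: fetch only the final 8KB of the
// session log and return the last complete line, which holds the newest
// usage entry. Names and log format are assumptions, not the real code.
import { openSync, fstatSync, readSync, closeSync } from "node:fs";

const TAIL_BYTES = 8 * 1024;

function readLastUsageLine(path: string): string | undefined {
  const fd = openSync(path, "r");
  try {
    const size = fstatSync(fd).size;
    const start = Math.max(0, size - TAIL_BYTES);
    const buf = Buffer.alloc(size - start);
    readSync(fd, buf, 0, buf.length, start);
    let lines = buf.toString("utf8").split("\n");
    // If we started mid-file, the first line is likely partial; skip it.
    if (start > 0) lines = lines.slice(1);
    // Walk backwards past any trailing blank line to the newest entry.
    for (let i = lines.length - 1; i >= 0; i--) {
      if (lines[i].trim().length > 0) return lines[i];
    }
    return undefined;
  } finally {
    closeSync(fd);
  }
}
```

For a multi-megabyte log this reads a constant 8KB regardless of file size, which is why the whole-file read no longer dominates for long sessions.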
Contributor
Land note: this is now covered on main; implemented via the commit noted in the closing comment.

Behavior/result: status usage is accurate for cache-heavy providers during active turns and avoids expensive whole-file reads. Changelog credit is included ("Thanks @1ucian").
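The mid-turn accuracy described here follows from preferring the transcript's fresh usage over the end-of-turn session snapshot. A minimal sketch of that fallback order, with every name illustrative rather than taken from the codebase:

```typescript
// Hypothetical sketch: the session entry's totalTokens is only written at
// the END of a turn, while the transcript carries the latest API response
// usage. With the flag on, the transcript wins mid-turn.
interface SessionEntry { totalTokens: number }      // end-of-turn snapshot
interface TranscriptUsage { input: number; cacheRead: number; cacheWrite: number }

function contextTokens(
  entry: SessionEntry,
  transcript: TranscriptUsage | undefined,
  includeTranscriptUsage: boolean,
): number {
  if (includeTranscriptUsage && transcript) {
    // Mid-turn: sum all prompt components, including cache tokens.
    return transcript.input + transcript.cacheRead + transcript.cacheWrite;
  }
  // Otherwise: fall back to the (possibly stale) previous-turn value.
  return entry.totalTokens;
}
```

With the flag off, the function can only ever report the previous turn's number, which is the undercount this PR fixes.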
Contributor
Closing as covered by cf38339 on main. |
Problem
session_status reports dramatically undercounted context usage. For example, it shows 28k/1.0m (3%) when the actual context is 185k/1.0m (19%). This affects all providers with prompt caching (Anthropic Claude especially), where usage.input is very small (1-10 tokens for cache hits) while cache_read_input_tokens holds the actual context size.

Root Cause

Two issues compound:

1. session-status-tool.ts sets includeTranscriptUsage: false. The session entry's totalTokens is updated at the END of each turn, but session_status runs MID-turn (it's a tool call). So it always reads the previous turn's stale snapshot. The transcript has the real usage from the last API response.
2. Performance concern: readUsageFromSessionLog reads the entire session log file. For long sessions this is wasteful, since we only need the last usage entry.

Fix
1. Enable includeTranscriptUsage: true for the session_status tool so it falls back to transcript-derived usage when the session entry is stale.
2. Optimize readUsageFromSessionLog to read only the last 8KB of the log file (tail read) instead of the entire file, mitigating the performance concern.

Verification
Before fix:

Context: 28k/1.0m (3%)

Session log shows: cacheRead: 184186, input: 1, cacheWrite: 707 → actual context ~185k

Session store shows: totalTokens: 165862 (from previous turn)

After fix:

derivePromptTokens correctly sums input + cacheRead + cacheWrite = ~185k

Related issues: #17799, #16079
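The after-fix arithmetic can be checked with a tiny sketch. The signature here is hypothetical; the real derivePromptTokens in the codebase may differ.

```typescript
// Hypothetical sketch of the summation the fix relies on. On a cache hit,
// input alone can be 1-10 tokens even though the model saw ~185k tokens,
// so all three components must be summed.
interface Usage {
  input: number;      // non-cached input tokens (tiny on cache hits)
  cacheRead: number;  // tokens read from the prompt cache
  cacheWrite: number; // tokens written to the prompt cache
}

function derivePromptTokensSketch(u: Usage): number {
  return u.input + u.cacheRead + u.cacheWrite;
}

// Numbers from the verification section above:
console.log(derivePromptTokensSketch({ input: 1, cacheRead: 184186, cacheWrite: 707 }));
// 184894 (~185k)
```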
Greptile Summary
Fixed session_status severely undercounting context usage for providers with prompt caching (particularly Anthropic Claude). The issue had two parts:

1. Stale snapshot problem: session_status runs mid-turn as a tool call, so it reads the session entry before it is updated with the current turn's usage. The fix enables includeTranscriptUsage: true in session-status-tool.ts:384 to fall back to transcript-derived usage, which includes cache tokens.
2. Performance optimization: changed readUsageFromSessionLog to read only the last 8KB of the log file (tail read) instead of the entire file, preventing performance degradation on long sessions.

The fix correctly addresses the root cause by ensuring derivePromptTokens sums input + cacheRead + cacheWrite tokens rather than just using the stale input count. The tail-read optimization uses proper offset calculation and handles edge cases (small files, partial lines) correctly.

Confidence Score: 4/5
Last reviewed commit: e1343e7