
fix: session_status context tracking undercount for cached providers #22387

Closed

1ucian wants to merge 1 commit into openclaw:main from 1ucian:fix/session-status-context-tracking

Conversation

@1ucian (Contributor) commented on Feb 21, 2026

Problem

session_status reports dramatically undercounted context usage. For example, showing 28k/1.0m (3%) when actual context is 185k/1.0m (19%).

This affects all providers with prompt caching (Anthropic Claude especially), where usage.input is very small (1-10 tokens for cache hits) while cache_read_input_tokens holds the actual context size.
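As a hedged sketch of the undercount: the field names below mirror the examples quoted elsewhere in this PR (`input`, `cacheRead`, `cacheWrite`), and the summing behavior is the one the PR attributes to `derivePromptTokens`; the real types in the codebase will differ.

```typescript
// Assumed usage shape; real provider types will differ.
interface Usage {
  input: number;      // non-cached input tokens (tiny on a cache hit)
  cacheRead: number;  // tokens served from the prompt cache
  cacheWrite: number; // tokens written to the cache this turn
}

// Sketch of the summing behavior the PR attributes to derivePromptTokens:
// the true context size is the sum of all three buckets, not input alone.
function derivePromptTokens(usage: Usage): number {
  return usage.input + usage.cacheRead + usage.cacheWrite;
}

// Cache-hit example from the PR's session log.
const hit: Usage = { input: 1, cacheRead: 184186, cacheWrite: 707 };
console.log(derivePromptTokens(hit)); // 184894, i.e. ~185k, not 1
```

Reading `usage.input` alone on a cache hit yields the 1-10 token figure described above, which is why the undercount is so dramatic for cache-heavy providers.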

Root Cause

Two issues compound:

  1. session-status-tool.ts sets includeTranscriptUsage: false — The session entry's totalTokens is updated at the END of each turn, but session_status runs MID-turn (it's a tool call). So it always reads the previous turn's stale snapshot. The transcript has the real usage from the last API response.

  2. Performance concern: readUsageFromSessionLog reads the entire session log file. For long sessions this is wasteful, since we only need the last usage entry.

Fix

  1. Enable includeTranscriptUsage: true for the session_status tool so it falls back to transcript-derived usage when the session entry is stale.

  2. Optimize readUsageFromSessionLog to read only the last 8KB of the log file (tail read) instead of the entire file, mitigating the performance concern.

Verification

Before fix: Context: 28k/1.0m (3%)
Session log shows: cacheRead: 184186, input: 1, cacheWrite: 707 → actual context ~185k
Session store shows: totalTokens: 165862 (from previous turn)

After fix: derivePromptTokens correctly sums input + cacheRead + cacheWrite (1 + 184186 + 707 = 184894 ≈ 185k)
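The fallback enabled in fix (1) can be sketched as below. The names `resolveContextTokens` and `UsageSnapshot` are hypothetical illustration only; `includeTranscriptUsage` and the stale/fresh numbers come from this PR.

```typescript
// Hypothetical snapshot shape for illustration.
interface UsageSnapshot {
  totalTokens: number;
}

// Sketch of the selection logic: prefer transcript-derived usage (which
// reflects the last API response, even mid-turn) when the flag is set,
// otherwise use the session entry, which is only updated at end of turn.
function resolveContextTokens(
  entry: UsageSnapshot,
  readTranscriptUsage: () => UsageSnapshot | undefined,
  includeTranscriptUsage: boolean,
): number {
  if (includeTranscriptUsage) {
    const transcript = readTranscriptUsage();
    if (transcript) return transcript.totalTokens;
  }
  return entry.totalTokens;
}

// Numbers from the verification above: stale entry vs fresh transcript.
const stale = { totalTokens: 165862 };
const fresh = { totalTokens: 184894 };
console.log(resolveContextTokens(stale, () => fresh, true));  // 184894
console.log(resolveContextTokens(stale, () => fresh, false)); // 165862
```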

Related issues: #17799, #16079

Greptile Summary

Fixed session_status severely undercounting context usage for providers with prompt caching (particularly Anthropic Claude). The issue had two parts:

  1. Stale snapshot problem: session_status runs mid-turn as a tool call, so it reads the session entry before it is updated with the current turn's usage. The fix enables includeTranscriptUsage: true in session-status-tool.ts:384 to fall back to transcript-derived usage, which includes cache tokens.

  2. Performance optimization: Changed readUsageFromSessionLog to read only the last 8KB of the log file (tail read) instead of the entire file, preventing performance degradation on long sessions.

The fix correctly addresses the root cause by ensuring derivePromptTokens sums input + cacheRead + cacheWrite tokens rather than just using the stale input count. The tail read optimization uses proper offset calculation and handles edge cases (small files, partial lines) correctly.

Confidence Score: 4/5

  • This PR is safe to merge with low risk - it fixes a critical bug in usage reporting with minimal code changes and good test coverage
  • The changes are well-targeted and solve a real bug. The tail read optimization is sound with proper edge case handling (offset calculation, partial line skipping, buffer size handling). Existing tests verify the fix works correctly. Minor deduction because the tail read introduces a theoretical edge case if log files have extremely long lines (>8KB), though this is very unlikely in practice given the JSONL format.
  • No files require special attention

Last reviewed commit: e1343e7


Two changes that fix inaccurate context reporting in the session_status tool:

1. Enable transcript-based usage lookup for session_status tool
   (session-status-tool.ts: includeTranscriptUsage false → true)

   The session entry's totalTokens is only updated at the END of each turn.
   When session_status is called mid-turn (which is always — it's a tool),
   it reads a stale value from the previous turn. With includeTranscriptUsage
   enabled, it falls back to the session transcript which has the actual
   API response usage including cache_read tokens.

   For Anthropic models with prompt caching, the true context size is
   input + cacheRead + cacheWrite (via derivePromptTokens). Without
   transcript lookup, session_status shows the previous turn's snapshot
   which can be dramatically smaller (e.g. 28k vs 185k actual).

2. Optimize readUsageFromSessionLog to read only the tail of the log
   (status.ts: read last 8KB instead of entire file)

   Since we only need the last usage entry, reading the full session log
   (which can grow to many MB) is wasteful. The tail-read approach reads
   the final 8KB and skips the first potentially partial line.

Fixes context tracking undercount for Anthropic/Claude models where
prompt caching causes usage.input to be very small (often 1-10 tokens)
while cache_read_input_tokens holds the actual context size.

Related: openclaw#17799, openclaw#16079
@openclaw-barnacle bot added labels agents (Agent runtime and tooling) and size: XS on Feb 21, 2026
@steipete (Contributor) commented

Land note: this is now covered on main.

Implemented via cf38339f2:

  • session_status now includes transcript-derived usage mid-turn (includeTranscriptUsage: true).
  • Session usage file read path now tail-reads only the end of the file (8KB) instead of full-file reads.
  • Tail parser handles partial first line safely.

Behavior/result: status usage is accurate for cache-heavy providers during active turns and avoids expensive whole-file reads.

Changelog credit is included ("Thanks @1ucian").

@steipete (Contributor) commented

Closing as covered by cf38339 on main.

