fix: extract OpenAI/Copilot cached_tokens from prompt_tokens_details#1603
The token tracker only extracted Anthropic-style cache fields (`cache_read_input_tokens`, `cache_creation_input_tokens`) but missed the OpenAI/Copilot format, where cache info is nested under `usage.prompt_tokens_details.cached_tokens`. This caused `token-usage.jsonl` to report `cache_read_tokens: 0` for all Copilot API requests, even when the API was returning significant cache hits (e.g., 43,894 of 43,977 prompt tokens cached). Fix both `extractUsageFromJson()` and `extractUsageFromSseLine()` to read `prompt_tokens_details.cached_tokens` and map it to the normalized `cache_read_input_tokens` field.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
✅ Coverage Check Passed
Overall Coverage
📁 Per-file Coverage Changes (1 file)
Pull request overview
Fixes api-proxy token tracking for OpenAI/Copilot prompt caching by extracting usage.prompt_tokens_details.cached_tokens and surfacing it through the existing normalized cache fields so token-usage.jsonl reflects cache hits correctly.
Changes:
- Extract `usage.prompt_tokens_details.cached_tokens` in both non-streaming JSON and streaming SSE usage parsing, mapping it to `cache_read_input_tokens`.
- Update usage-format JSDoc examples to include the nested OpenAI/Copilot cache field.
- Add tests covering JSON extraction, SSE final-chunk extraction, and normalization behavior for OpenAI/Copilot cached tokens.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| containers/api-proxy/token-tracker.js | Adds extraction of OpenAI/Copilot nested cached-token usage fields for both JSON and SSE parsing paths. |
| containers/api-proxy/token-tracker.test.js | Adds targeted regression tests ensuring cached tokens are extracted and normalized as expected. |
```javascript
// OpenAI/Copilot nested cache fields (prompt_tokens_details.cached_tokens)
if (json.usage.prompt_tokens_details && typeof json.usage.prompt_tokens_details.cached_tokens === 'number') {
  usage.cache_read_input_tokens = json.usage.prompt_tokens_details.cached_tokens;
  hasField = true;
}
```
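For the streaming path, the same extraction can be sketched as a small standalone helper. This is a sketch only: the function name and shape here are illustrative, not the actual `extractUsageFromSseLine()` implementation, which handles more fields and edge cases.

```javascript
// Sketch of the SSE-path fix: parse a "data: {...}" line and pull the
// OpenAI/Copilot nested cached-token count into the normalized field
// name that normalizeUsage() expects.
function extractCachedTokensFromSseLine(line) {
  if (!line.startsWith('data: ')) return null;
  const payload = line.slice('data: '.length).trim();
  if (payload === '[DONE]') return null; // stream terminator, no usage

  let json;
  try {
    json = JSON.parse(payload);
  } catch {
    return null; // not a JSON chunk
  }

  const details = json.usage && json.usage.prompt_tokens_details;
  if (details && typeof details.cached_tokens === 'number') {
    // Map to the Anthropic-style name used downstream.
    return { cache_read_input_tokens: details.cached_tokens };
  }
  return null;
}
```

In OpenAI-style streams the usage block typically arrives only on the final chunk, so a tracker can call a helper like this on every `data:` line and keep the last non-null result.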
normalizeUsage()’s JSDoc (later in this file) still states cache_read_tokens/cache_write_tokens are “Anthropic only, 0 for others”, but this change now populates cache_read_input_tokens from OpenAI/Copilot prompt_tokens_details.cached_tokens, which will result in non-zero cache_read_tokens for OpenAI/Copilot too. Please update the normalization doc comment to reflect the new supported source(s) so future readers don’t assume cache fields are Anthropic-only.
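The behavior at issue can be sketched with a minimal fallback helper. This is a hypothetical simplification; the real `normalizeUsage()` in token-tracker.js handles more fields.

```javascript
// Minimal sketch of the cache-read fallback described above. Before the
// fix, OpenAI/Copilot responses never set cache_read_input_tokens, so
// this fallback always produced 0; after the fix, the nested
// prompt_tokens_details.cached_tokens value is mapped onto it first.
function cacheReadTokens(usage) {
  return usage.cache_read_input_tokens ?? 0;
}

// Anthropic-style usage: the field is present directly.
const anthropicUsage = { input_tokens: 100, cache_read_input_tokens: 80 };

// OpenAI/Copilot usage after the extraction fix has mapped the nested
// cached_tokens value onto the normalized field name.
const openAiUsage = { prompt_tokens: 41344, cache_read_input_tokens: 36500 };

// An unfixed extraction result: the nested field was ignored entirely,
// so normalization reports zero cache reads.
const unfixedUsage = { prompt_tokens: 41344, completion_tokens: 256 };
```

Since the fallback now receives real values for two providers, the JSDoc claim that cache fields are "Anthropic only, 0 for others" no longer holds.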
Smoke Test Results — ✅ GitHub MCP: PR #1593 "fix: capture full session state…" / PR #1588 "fix: rename and scope token analyzer…" Overall: PASS
Smoke test results: ✅ GitHub MCP: #1593 fix: capture full session state, #1581 feat: add esbuild single-file bundle. Overall: PASS
Chroot Version Comparison Results
Result: ❌ Not all tests passed — Python and Node.js versions differ between host and chroot environments.
🏗️ Build Test Suite Results
Overall: 8/8 ecosystems passed — ✅ PASS
Smoke Test Results — Copilot Engine
Overall: PASS
Smoke test results for workflow run 23908898315
Overall status: FAIL
Smoke Test: GitHub Actions Services Connectivity ✅ All 3 connectivity checks passed.
Problem
The api-proxy token tracker only extracted Anthropic-style cache fields (`cache_read_input_tokens`, `cache_creation_input_tokens`) but missed the OpenAI/Copilot format, where cache info is nested under `usage.prompt_tokens_details.cached_tokens`.

This caused `token-usage.jsonl` to report `cache_read_tokens: 0` for all Copilot API requests, even when the upstream API was returning significant cache hits.

Evidence from smoke-copilot run 23878258933
What `token-usage.jsonl` reported (api-proxy):

What `agent.log` showed (from actual API responses):

Prompt caching was saving ~88% of input tokens, but the tracker was blind to it.
Root Cause
The OpenAI/Copilot API returns cache info in a nested format:

```json
{
  "usage": {
    "prompt_tokens": 41344,
    "completion_tokens": 256,
    "prompt_tokens_details": {
      "cached_tokens": 36500
    }
  }
}
```

Both `extractUsageFromJson()` and `extractUsageFromSseLine()` only read the flat `prompt_tokens`/`completion_tokens`/`total_tokens` fields and never looked inside `prompt_tokens_details`. `normalizeUsage()` then fell back to `cache_read_input_tokens ?? 0` (the Anthropic field name), which was never set.

Fix
- Extract `usage.prompt_tokens_details.cached_tokens` in both `extractUsageFromJson()` and `extractUsageFromSseLine()`
- Map it to `cache_read_input_tokens` so `normalizeUsage()` picks it up correctly

Testing
All 192 api-proxy tests pass (`cd containers/api-proxy && npm test`).

New tests:
- `extractUsageFromJson` → extracts `prompt_tokens_details.cached_tokens`
- `extractUsageFromJson` → handles responses without `prompt_tokens_details`
- `extractUsageFromSseLine` → extracts cache tokens from streaming final chunk
- `normalizeUsage` → normalizes OpenAI cache tokens via the `cache_read_input_tokens` mapping