📊 Copilot Token Usage Report 2026-04-02 #1604

@github-actions

Overview

Period: 2026-04-02T15:43Z to 2026-04-02T16:21Z (push-triggered runs on fix/openai-cache-token-tracking)
Runs analyzed: 4 Copilot-engine runs (3 had token data)
Total tokens: 786,861 across all instrumented workflows (449,616 billable)
Estimated total cost: $1.48

Major improvement vs. previous report: Cache hit rates jumped from 0% → 38–45% across all instrumented workflows. The fix/openai-cache-token-tracking branch appears to have fixed cache token reporting — caching is now correctly tracked and billing reflects actual cache discounts. Per-request billable tokens dropped ~1–2% compared to the previous report's first runs (where no cache was active yet).

Workflow Summary

| Workflow | Run | Requests | Total Tokens | Billable Tokens | Est. Cost | Cache Rate | I/O Ratio | Top Model |
|---|---|---|---|---|---|---|---|---|
| Smoke Copilot | §23908898227 | 5 | 374K | 211K | $0.70 | 43.7% | 253:1 | sonnet-4.6 |
| Build Test Suite | §23908898271 | 4 | 256K | 142K | $0.47 | 44.8% | 233:1 | sonnet-4.6 |
| Smoke Chroot | §23908898275 | 3 | 157K | 97K | $0.31 | 38.5% | 305:1 | sonnet-4.6 |
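The summary columns can be reproduced from the per-request figures below. The formulas here are inferred from the numbers, not documented by the analyzer: billable appears to be input + output (cache reads excluded from the cost estimate), cache rate is cache_read / (input + cache_read), and the I/O ratio is (input + cache_read) : output. A minimal sketch, checked against the Smoke Copilot row:

```python
def summarize(input_toks: int, output_toks: int, cache_read: int) -> dict:
    """Reproduce the summary-table metrics (formulas inferred from the report)."""
    prompt_side = input_toks + cache_read          # everything sent to the model
    return {
        "total": input_toks + output_toks + cache_read,
        "billable": input_toks + output_toks,      # cache reads excluded (inferred)
        "cache_rate": cache_read / prompt_side,    # share of prompt served from cache
        "io_ratio": prompt_side / output_toks,     # input-to-output ratio
    }

# Smoke Copilot aggregates, summed from its per-request table
m = summarize(input_toks=209_805, output_toks=1_473, cache_read=162_799)
print(m)  # total=374,077 (~374K), billable=211,278 (~211K), cache_rate≈43.7%, io≈253:1
```

With these formulas, the three workflow rows sum exactly to the 786,861 total and 449,616 billable tokens reported in the Overview.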

🔍 Optimization Opportunities

  1. Smoke workflows — first request in each run has 0 cache reads

    • Request 1 in both Smoke workflows shows 0 cache_read_tokens — the cache is cold on each fresh run (Build Test Suite's first request already reads 13,574 cached tokens)
    • Subsequent requests within the same run benefit from the warm cache (38–45% hit rate)
    • Recommendation: If runs share a consistent system prompt, consider pre-warming the cache or using persistent cache keys across runs. Enabling cache_write_tokens tracking at the Copilot endpoint would also improve cost attribution.
  2. No cache_write_tokens reported in any workflow

    • All workflows show cache_write_tokens = 0, meaning we can only see cache reads but not writes
    • The Copilot provider may not return write counts separately from input tokens
    • Recommendation: Investigate whether the api-proxy can track cache writes via response headers or usage metadata for the Copilot inference endpoint. This would make cost attribution more accurate.
  3. High I/O ratios persist (233–305:1)

    • All workflows show very high input-to-output ratios, driven by large system prompts / tool schemas (31–43K input tokens/request vs 29–494 output tokens)
    • This is partially mitigated by caching, but the underlying context size is large
    • Recommendation: Review MCP tool surface in Smoke workflows — reducing tool schemas would lower input tokens for all requests including the cold-cache first request
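To put a rough number on the cold-cache recommendation above: if request 1 of the Smoke Copilot run could read from a persistent cache at the level request 2 achieves (36,518 cached tokens), that many tokens would shift from billable input to cache reads. A hypothetical back-of-the-envelope estimate using this run's own figures:

```python
# Smoke Copilot figures from its per-request table
run_billable_input = 209_805  # sum of the Input column across the 5 requests
cold_start_miss = 36_518      # cache read achieved by request 2; a hypothetical
                              # target for a pre-warmed request 1

savings_pct = cold_start_miss / run_billable_input * 100
print(f"~{savings_pct:.0f}% of billable input could shift to cache reads")  # ~17%
```

This is an upper-bound sketch: actual savings depend on whether the Copilot endpoint honors cache keys across runs at all, which the report does not establish.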
Per-Workflow Details

Smoke Copilot

  • Run: §23908898227 (push to fix/openai-cache-token-tracking)
  • Requests: 5 (all claude-sonnet-4.6 via copilot provider)
  • Tokens: 374K total — 210K input, 1.5K output, 163K cache_read, 0 cache_write
  • Cache hit rate: 43.7%
  • Avg latency: 5,564ms/request
  • Estimated cost: $0.70
| # | Timestamp | Input | Output | Cache Read | Duration |
|---|---|---|---|---|---|
| 1 | 2026-04-02 15:46:49 | 40,291 | 353 | 0 | 7,066ms |
| 2 | 2026-04-02 15:46:57 | 41,351 | 331 | 36,518 | 5,297ms |
| 3 | 2026-04-02 15:47:05 | 42,180 | 494 | 41,287 | 6,928ms |
| 4 | 2026-04-02 15:47:11 | 42,816 | 266 | 42,179 | 5,723ms |
| 5 | 2026-04-02 15:47:14 | 43,167 | 29 | 42,815 | 2,806ms |
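The run-level aggregates follow directly from the table rows. A quick cross-check (with the cache hit rate computed as cache_read over cache_read + input, a formula inferred from the report's numbers):

```python
# Per-request rows from the Smoke Copilot table: (input, output, cache_read, duration_ms)
rows = [
    (40_291, 353, 0,      7_066),
    (41_351, 331, 36_518, 5_297),
    (42_180, 494, 41_287, 6_928),
    (42_816, 266, 42_179, 5_723),
    (43_167, 29,  42_815, 2_806),
]
inp = sum(r[0] for r in rows)      # 209,805 (~210K input)
out = sum(r[1] for r in rows)      # 1,473 (~1.5K output)
cache = sum(r[2] for r in rows)    # 162,799 (~163K cache_read)
total = inp + out + cache          # 374,077 (~374K total)
avg_latency = sum(r[3] for r in rows) / len(rows)  # 5,564.0 ms
hit_rate = cache / (inp + cache)   # 0.4369 → the reported 43.7%
```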

Build Test Suite

  • Run: §23908898271 (push to fix/openai-cache-token-tracking)
  • Requests: 4 (all claude-sonnet-4.6 via copilot provider)
  • Tokens: 256K total — 141K input, 1.1K output, 114K cache_read, 0 cache_write
  • Cache hit rate: 44.8%
  • Avg latency: 5,350ms/request
  • Estimated cost: $0.47
| # | Timestamp | Input | Output | Cache Read | Duration |
|---|---|---|---|---|---|
| 1 | 2026-04-02 15:46:24 | 34,039 | 258 | 13,574 | 5,309ms |
| 2 | 2026-04-02 15:46:39 | 34,559 | 520 | 30,138 | 6,758ms |
| 3 | 2026-04-02 15:46:46 | 35,857 | 224 | 34,558 | 5,784ms |
| 4 | 2026-04-02 15:46:49 | 36,098 | 89 | 35,856 | 3,548ms |
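The same cross-check for this run (note that, unlike the Smoke runs, request 1 here already reads 13,574 cached tokens):

```python
# Per-request rows from the Build Test Suite table: (input, output, cache_read, duration_ms)
rows = [
    (34_039, 258, 13_574, 5_309),
    (34_559, 520, 30_138, 6_758),
    (35_857, 224, 34_558, 5_784),
    (36_098, 89,  35_856, 3_548),
]
inp = sum(r[0] for r in rows)      # 140,553 (~141K input)
out = sum(r[1] for r in rows)      # 1,091 (~1.1K output)
cache = sum(r[2] for r in rows)    # 114,126 (~114K cache_read)
total = inp + out + cache          # 255,770 (~256K total)
avg_latency = sum(r[3] for r in rows) / len(rows)  # 5,349.75 ms, reported as 5,350
hit_rate = cache / (inp + cache)   # 0.4481 → the reported 44.8%
```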

Smoke Chroot

  • Run: §23908898275 (push to fix/openai-cache-token-tracking)
  • Requests: 3 (all claude-sonnet-4.6 via copilot provider)
  • Tokens: 157K total — 96K input, 513 output, 60K cache_read, 0 cache_write
  • Cache hit rate: 38.5%
  • Avg latency: 5,724ms/request
  • Estimated cost: $0.31
| # | Timestamp | Input | Output | Cache Read | Duration |
|---|---|---|---|---|---|
| 1 | 2026-04-02 15:48:12 | 31,636 | 221 | 0 | 5,187ms |
| 2 | 2026-04-02 15:48:17 | 32,151 | 226 | 28,170 | 4,716ms |
| 3 | 2026-04-02 15:48:25 | 32,394 | 66 | 32,150 | 7,270ms |
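This run has the highest I/O ratio of the three. Treating the ratio as (input + cache_read) : output, as the summary-table numbers imply, the 305:1 figure checks out:

```python
# Per-request rows from the Smoke Chroot table: (input, output, cache_read)
rows = [
    (31_636, 221, 0),
    (32_151, 226, 28_170),
    (32_394, 66,  32_150),
]
inp = sum(r[0] for r in rows)        # 96,181 (~96K input)
out = sum(r[1] for r in rows)        # 513 output
cache = sum(r[2] for r in rows)      # 60,320 (~60K cache_read)
total = inp + out + cache            # 157,014 (~157K total)
io_ratio = (inp + cache) / out       # 305.07 → the reported 305:1
hit_rate = cache / (inp + cache)     # 0.3854 → the reported 38.5%
```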
Workflows Without Token Data

The following Copilot-engine workflows ran in the past 24 hours but had no agent-artifacts with token data:

| Workflow | Run ID | Reason |
|---|---|---|
| Smoke Services | §23908898245 | Has an agent artifact but no token data; likely doesn't use --enable-api-proxy |
| Daily Copilot Token Usage Analyzer | §23908784491 | Previous analyzer run (this workflow itself) |

Historical Trend

Comparison to previous report #1591 (2026-04-02, 01:06 UTC window):

| Workflow | Prev Cache Rate | Curr Cache Rate | Prev Billable/Req | Curr Billable/Req | Change |
|---|---|---|---|---|---|
| Smoke Copilot | 0% | 43.7% | 42,800 | 42,256 | -1% |
| Build Test Suite | 0% | 44.8% | 36,250 | 35,411 | -2% |
| Secret Digger (Copilot) | 0% | (no run this period) | n/a | n/a | n/a |

The cache hit rate improvement from 0% to ~40–45% is the direct result of the fix/openai-cache-token-tracking fix correctly parsing prompt_tokens_details.cached_tokens from OpenAI/Copilot API responses. Per-request billable tokens are essentially unchanged (within 1–2%), confirming the cache was not newly enabled but was already active and now being correctly measured.
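The field in question is part of the OpenAI chat-completions usage schema (usage.prompt_tokens_details.cached_tokens). A minimal sketch of the extraction, under the assumption that input tokens are reported net of the cached portion; how the actual api-proxy splits these counts is not shown in this report:

```python
def extract_cache_usage(response: dict) -> dict:
    """Pull token counts from an OpenAI-style chat completion response body.

    Assumes input tokens are reported net of cached tokens; the real
    api-proxy's accounting may differ.
    """
    usage = response.get("usage") or {}
    details = usage.get("prompt_tokens_details") or {}
    cached = details.get("cached_tokens", 0)
    return {
        "input_tokens": usage.get("prompt_tokens", 0) - cached,
        "cache_read_tokens": cached,
        "output_tokens": usage.get("completion_tokens", 0),
    }

# Hypothetical payload shaped like an OpenAI/Copilot response
sample = {"usage": {"prompt_tokens": 77_869, "completion_tokens": 331,
                    "prompt_tokens_details": {"cached_tokens": 36_518}}}
print(extract_cache_usage(sample))
# {'input_tokens': 41351, 'cache_read_tokens': 36518, 'output_tokens': 331}
```

Before the fix, a parser that read only usage.prompt_tokens would see every request as fully uncached, which matches the 0% cache rates in the previous report.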

Previous Report

#1591 — 📊 Copilot Token Usage Report 2026-04-02

Generated by Daily Copilot Token Usage Analyzer
