
[WIP] Create daily token usage analysis workflow #1553

Closed

Copilot wants to merge 1 commit into `main` from `copilot/feat-daily-token-usage-analysis`

Conversation

Contributor

Copilot AI commented Apr 1, 2026

Thanks for asking me to work on this. I will get started on it and keep this PR's description up to date as I form a plan and make progress.


This section details the original issue you should resolve

<issue_title>feat: daily token usage analysis workflow</issue_title>
<issue_description>## Summary

Create a daily agentic workflow that mines token usage data from recent workflow runs, identifies trends and inefficiencies, and creates a summary issue with findings and optimization recommendations.

## Background

PR #1539 introduced token usage tracking in the api-proxy sidecar. Each workflow run with --enable-api-proxy now produces a token-usage.jsonl artifact containing per-request records with:

```json
{"timestamp":"...","request_id":"...","provider":"anthropic","model":"claude-sonnet-4-6","path":"/v1/messages?beta=true","status":200,"streaming":true,"input_tokens":3,"output_tokens":418,"cache_read_tokens":14044,"cache_write_tokens":26042,"duration_ms":5858,"response_bytes":2800}
```

This data is captured in the agent-artifacts upload (under sandbox/firewall/logs/api-proxy-logs/token-usage.jsonl) but is not yet analyzed systematically.
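Each line of the artifact is a standalone JSON object, so a record can be parsed with the standard library alone. A minimal sketch (the timestamp and request ID here are hypothetical placeholders, since the sample record above elides them):

```python
import json

# One line from token-usage.jsonl, mirroring the sample record above.
line = ('{"timestamp":"2026-03-31T08:00:00Z","request_id":"req-1",'
        '"provider":"anthropic","model":"claude-sonnet-4-6",'
        '"path":"/v1/messages?beta=true","status":200,"streaming":true,'
        '"input_tokens":3,"output_tokens":418,'
        '"cache_read_tokens":14044,"cache_write_tokens":26042,'
        '"duration_ms":5858,"response_bytes":2800}')

record = json.loads(line)

# "Total tokens" in the sense used below: all four token counters summed.
total = (record["input_tokens"] + record["output_tokens"]
         + record["cache_read_tokens"] + record["cache_write_tokens"])
print(total)  # 3 + 418 + 14044 + 26042 = 40507
```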

## Proposed Workflow

### Trigger

  • Daily schedule (e.g., 08:00 UTC)
  • workflow_dispatch for manual runs

### Behavior

  1. Discover runs since last analysis

    • Use gh run list to find completed agentic workflow runs since the last analysis issue
    • Cover all smoke tests, secret-digger, and other workflows that use --enable-api-proxy
  2. Download and aggregate token-usage.jsonl

    • For each run, download the agent-artifacts artifact
    • Extract token-usage.jsonl from the artifact
    • Gracefully handle runs where token logs are not available (this is a new feature — older runs and runs without --enable-api-proxy won't have logs)
  3. Compute per-workflow statistics

    • Total tokens (input + output + cache read + cache write)
    • Input/output ratio
    • Cache hit rate (cache_read / (cache_read + input))
    • Average request count per run
    • Average duration per request
    • Model mix (which models are used, in what proportions)
  4. Identify trends

    • Compare current period vs previous period (if historical data available)
    • Flag workflows with increasing token consumption
    • Flag workflows with low cache hit rates (opportunity for prompt caching)
    • Flag workflows with high input/output ratios (may indicate bloated prompts or tool schemas)
  5. Generate optimization recommendations

    • Workflows with zero cache hits → suggest enabling prompt caching
    • Workflows with >100:1 input/output ratio → suggest reducing system prompt or tool surface
    • Workflows with many small requests → suggest batching
    • Workflows with unusually high total tokens → flag for review
  6. Create a summary issue

    • Title: Token Usage Report: YYYY-MM-DD
    • Body includes:
      • Overview table (workflow name, total tokens, cost estimate, cache rate)
      • Per-workflow detail sections (collapsible)
      • Historical trend chart (if prior reports exist)
      • Top optimization opportunities
      • Link to previous report issue (if any)
    • Label: token-usage-report
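The per-workflow statistics in step 3 are plain arithmetic over the JSONL records. A minimal sketch, assuming the field names from the record schema below (`workflow_stats` is a hypothetical helper name, not part of the issue):

```python
import json

def workflow_stats(jsonl_text: str) -> dict:
    """Aggregate step-3 statistics over one workflow's token-usage records."""
    records = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    inp = sum(r["input_tokens"] for r in records)
    out = sum(r["output_tokens"] for r in records)
    cread = sum(r["cache_read_tokens"] for r in records)
    cwrite = sum(r["cache_write_tokens"] for r in records)
    model_mix: dict[str, int] = {}
    for r in records:
        model_mix[r["model"]] = model_mix.get(r["model"], 0) + 1
    return {
        "total_tokens": inp + out + cread + cwrite,
        # Ratio of uncached input tokens to output tokens.
        "input_output_ratio": inp / out if out else None,
        # Cache hit rate as defined in step 3: cache_read / (cache_read + input).
        "cache_hit_rate": cread / (cread + inp) if (cread + inp) else None,
        "requests": len(records),
        "avg_duration_ms": sum(r["duration_ms"] for r in records) / len(records)
                           if records else None,
        "model_mix": model_mix,
    }
```

The same per-run dictionaries can then feed the step-4 trend comparison and the step-6 overview table.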

## Graceful Degradation

Since token tracking is new:

  • Skip runs that have no agent-artifacts artifact
  • Skip runs where token-usage.jsonl is missing or empty
  • Report which workflows are not yet instrumented
  • Don't fail the workflow if no data is available — create a minimal report noting the gap

## Tools Needed

  • github MCP server — for listing runs, downloading artifacts, creating issues
  • bash — for processing JSONL files, computing aggregates
  • web-fetch — not needed (all data is in GitHub artifacts)

## Permissions

  • actions: read — to list workflow runs and download artifacts
  • issues: write — to create summary issues
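In a GitHub Actions workflow file, the two scopes above would map onto a permissions block like this (a config sketch; the surrounding workflow keys are omitted):

```yaml
permissions:
  actions: read   # list workflow runs and download artifacts
  issues: write   # create the daily summary issue
```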

## Token Usage Record Schema

| Field | Type | Description |
| --- | --- | --- |
| `timestamp` | string | ISO 8601 timestamp |
| `request_id` | string | Unique request ID |
| `provider` | string | `anthropic`, `openai`, `copilot`, or `opencode` |
| `model` | string | Model name (e.g., `claude-sonnet-4-6`) |
| `path` | string | API endpoint path |
| `status` | number | HTTP status code |
| `streaming` | boolean | Whether the response was SSE-streamed |
| `input_tokens` | number | Input/prompt tokens |
| `output_tokens` | number | Output/completion tokens |
| `cache_read_tokens` | number | Prompt cache read tokens |
| `cache_write_tokens` | number | Prompt cache write tokens |
| `duration_ms` | number | Request duration in milliseconds |
| `response_bytes` | number | Response body size in bytes |
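For reference, the schema above can be expressed as a typed structure (a sketch; the class name `TokenUsageRecord` is my own, not from the issue):

```python
from typing import TypedDict

class TokenUsageRecord(TypedDict):
    """One line of token-usage.jsonl, per the schema table above."""
    timestamp: str           # ISO 8601
    request_id: str
    provider: str            # "anthropic", "openai", "copilot", "opencode"
    model: str               # e.g. "claude-sonnet-4-6"
    path: str                # API endpoint path
    status: int              # HTTP status code
    streaming: bool          # whether the response was SSE-streamed
    input_tokens: int
    output_tokens: int
    cache_read_tokens: int
    cache_write_tokens: int
    duration_ms: int
    response_bytes: int
```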

## Example Analysis (from smoke-claude run)

From a recent smoke-claude run (6 requests):

  • Total tokens: 197,558 (775 input, 1,227 output, 152K cache read, 43K cache write)
  • Cache hit rate: 99.5%
  • Models: claude-haiku-4.5 (routing), claude-sonnet-4.6 (main agent)
  • Estimated cost: ~$0.23
  • Key finding: Anthropic prompt caching is working well (63% cost savings)
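The cost and savings figures above can be roughly reproduced from the rounded token counts, assuming illustrative Claude Sonnet rates of $3/M input, $15/M output, $0.30/M cache read, and $3.75/M cache write (these rates are an assumption for the sketch, not part of the issue; check current provider pricing):

```python
# Token counts from the smoke-claude example, rounded as in the report.
input_t, output_t = 775, 1_227
cache_read_t, cache_write_t = 152_000, 43_000

# Assumed USD rates per million tokens (illustrative only).
IN_RATE, OUT_RATE, CREAD_RATE, CWRITE_RATE = 3.00, 15.00, 0.30, 3.75

cost = (input_t * IN_RATE + output_t * OUT_RATE
        + cache_read_t * CREAD_RATE + cache_write_t * CWRITE_RATE) / 1e6

# Counterfactual without caching: every prompt token billed at the input rate.
no_cache = ((input_t + cache_read_t + cache_write_t) * IN_RATE
            + output_t * OUT_RATE) / 1e6
savings = 1 - cost / no_cache

print(f"${cost:.2f}, {savings:.0%} saved")  # roughly $0.23 and ~62% saved
```

The ~62% from the rounded counts is close to the 63% quoted in the issue; the small gap comes from rounding the cache token counts to 152K/43K.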

## Related

<agent_instructions>Create a PR that addresses this issue</agent_instructions>

Comments on the Issue (you are @copilot in this section)



Development

Successfully merging this pull request may close these issues.

feat: daily token usage analysis workflow
