---
description: Daily Claude token usage analysis across agentic workflow runs — identifies trends, inefficiencies, and optimization opportunities
on:
  schedule: daily
  workflow_dispatch:
permissions:
  contents: read
  actions: read
  issues: read
  pull-requests: read
imports:
  - shared/mcp-pagination.md
  - shared/reporting.md
network:
  allowed:
    - github
    - "*.blob.core.windows.net"
tools:
  github:
    toolsets: [default, actions]
  bash: true
safe-outputs:
  create-issue:
    title-prefix: "📊 Claude Token Usage Report"
    labels: [claude-token-usage-report]
    close-older-issues: true
timeout-minutes: 15
---
# Daily Claude Token Usage Analyzer

You are an AI agent that analyzes Claude token usage across agentic workflow runs in this repository. Your goal is to identify trends, highlight inefficiencies, and recommend optimizations to reduce AI inference costs for Claude-engine workflows.

## Background

This repository uses the **Agent Workflow Firewall (AWF)** with an api-proxy sidecar that tracks token usage for LLM API calls. Each workflow run with `--enable-api-proxy` produces a `token-usage.jsonl` file captured in the `agent-artifacts` upload artifact.

**Token usage tracking is a new feature** — many older runs won't have this data. Handle missing data gracefully.
### Token Usage Record Format

Each line in `token-usage.jsonl` is a JSON object:

```json
{
  "timestamp": "2026-04-01T17:38:12.486Z",
  "request_id": "uuid",
  "provider": "anthropic",
  "model": "claude-sonnet-4-6",
  "path": "/v1/messages?beta=true",
  "status": 200,
  "streaming": true,
  "input_tokens": 3,
  "output_tokens": 418,
  "cache_read_tokens": 14044,
  "cache_write_tokens": 26042,
  "duration_ms": 5858,
  "response_bytes": 2800
}
```
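As an illustration, records in this format can be parsed and their four token fields summed with a few lines of Python (a sketch only; the helper names are not part of the proxy's tooling, and missing fields are treated as zero):

```python
import json

def load_records(lines):
    """Parse token-usage.jsonl lines, skipping blanks."""
    return [json.loads(ln) for ln in lines if ln.strip()]

def total_tokens(rec):
    """Sum all four token fields; absent fields count as zero."""
    return (rec.get("input_tokens", 0) + rec.get("output_tokens", 0)
            + rec.get("cache_read_tokens", 0) + rec.get("cache_write_tokens", 0))

# The numbers from the example record above:
sample = ('{"provider": "anthropic", "input_tokens": 3, "output_tokens": 418, '
          '"cache_read_tokens": 14044, "cache_write_tokens": 26042}')
recs = load_records(["", sample])  # blank lines are tolerated
print(total_tokens(recs[0]))  # 3 + 418 + 14044 + 26042 = 40507
```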
## Your Mission

### Step 1: Discover Recent Workflow Runs

Use `gh run list` via bash to find completed agentic workflow runs from the past 24 hours (or since the last token usage report issue). Focus on **Claude-engine workflows** that use the api-proxy:

- `smoke-claude`
- `secret-digger-claude`
- `security-review`, `security-guard`
- Any other Claude-engine workflow with `agent-artifacts`

**Note:** Copilot-engine and Codex-engine workflows (e.g., `smoke-copilot`, `smoke-chroot`, `smoke-services`, `build-test`, `smoke-codex`, `secret-digger-copilot`) are excluded from this analysis — they are covered by the separate Copilot Token Usage Analyzer.
Use bash to run:

```bash
# Find runs from the last 24 hours
CUTOFF="$(date -u -Iseconds -d '24 hours ago')"
gh run list --repo "$GITHUB_REPOSITORY" --limit 50 \
  --created ">=$CUTOFF" \
  --json databaseId,name,status,conclusion,createdAt,workflowName \
  --jq '[.[] | select(.conclusion == "success" or .conclusion == "failure")]'
```
### Step 2: Download and Parse Token Usage Data

For each discovered run, attempt to download the `agent-artifacts` artifact and extract `token-usage.jsonl`.

**IMPORTANT:** Always use `gh run download` via bash — this is much faster than using MCP `get_job_logs`, and the network is configured to allow artifact blob storage access.

```bash
# Create temp directory
TMPDIR=$(mktemp -d)

# Try to download artifacts for a run
gh run download <RUN_ID> --repo "$GITHUB_REPOSITORY" --name agent-artifacts --dir "$TMPDIR/run-<RUN_ID>" 2>/dev/null

# Look for token-usage.jsonl (may be nested under sandbox/firewall/logs/api-proxy-logs/)
find "$TMPDIR/run-<RUN_ID>" -name "token-usage.jsonl" 2>/dev/null
```

**Filter for Claude/Anthropic requests:** When parsing `token-usage.jsonl`, only include records where `provider` is `"anthropic"`. This ensures the analysis covers Claude-specific usage even if a workflow makes calls to multiple providers.

**Graceful degradation:**

- If artifact download fails → skip the run, note it as "no artifacts"
- If `token-usage.jsonl` is missing → skip the run, note it as "no token logs"
- If the file is empty → skip the run, note it as "empty token logs"
- Track which workflows have instrumentation vs which don't
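A possible shape for the provider filter and per-line tolerance described above (the function name is illustrative; malformed lines are skipped rather than failing the whole run):

```python
import json

def anthropic_records(jsonl_text):
    """Keep only provider == "anthropic" records; skip blank or malformed lines."""
    out = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            continue  # degrade gracefully on corrupt lines
        if rec.get("provider") == "anthropic":
            out.append(rec)
    return out

sample = "\n".join([
    '{"provider": "anthropic", "output_tokens": 10}',
    '{"provider": "openai", "output_tokens": 99}',
    "not json at all",
])
print(len(anthropic_records(sample)))  # 1
```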
### Step 3: Compute Per-Workflow Statistics

For each workflow that has token data, calculate:

1. **Total tokens**: `input_tokens + output_tokens + cache_read_tokens + cache_write_tokens`
2. **Full-price tokens**: `input_tokens + output_tokens + cache_write_tokens` (excludes discounted cache reads; use **Estimated cost** below for the true billed amount)
3. **Input/output ratio**: `(input_tokens + cache_read_tokens) / output_tokens` (if `output_tokens == 0`, treat the ratio as `∞`/`N/A` and exclude that request from ratio averages to avoid division by zero)
4. **Cache hit rate**: `cache_read_tokens / (cache_read_tokens + input_tokens) * 100`
5. **Cache write rate**: `cache_write_tokens / (cache_read_tokens + input_tokens + cache_write_tokens) * 100`
6. **Request count**: Number of records in the JSONL
7. **Average latency**: Mean `duration_ms` per request
8. **Model distribution**: Count of requests per model
9. **Estimated cost** (sum all four token types at their respective rates — cache reads are discounted, not free):
   - Anthropic Sonnet: input $3/M, output $15/M, cache_read $0.30/M, cache_write $3.75/M
   - Anthropic Haiku: input $0.80/M, output $4/M, cache_read $0.08/M, cache_write $1/M
   - Anthropic Opus: input $15/M, output $75/M, cache_read $1.50/M, cache_write $18.75/M

Use bash with a Python or jq script to process the JSONL files efficiently.
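The metrics above could be aggregated roughly as follows. This is a sketch that hard-codes the Sonnet rates from the list; a real script would dispatch on each record's `model` field to pick the right row of the rate table:

```python
# Sonnet per-million-token rates from the list above.
RATES = {"input": 3.00, "output": 15.00, "cache_read": 0.30, "cache_write": 3.75}

def workflow_stats(records):
    """Aggregate the Step 3 metrics for one workflow's Anthropic records."""
    inp = sum(r.get("input_tokens", 0) for r in records)
    out = sum(r.get("output_tokens", 0) for r in records)
    cr = sum(r.get("cache_read_tokens", 0) for r in records)
    cw = sum(r.get("cache_write_tokens", 0) for r in records)
    cost = (inp * RATES["input"] + out * RATES["output"]
            + cr * RATES["cache_read"] + cw * RATES["cache_write"]) / 1_000_000
    return {
        "total_tokens": inp + out + cr + cw,
        "full_price_tokens": inp + out + cw,
        "io_ratio": (inp + cr) / out if out else None,  # N/A when no output tokens
        "cache_hit_rate": 100 * cr / (cr + inp) if cr + inp else 0.0,
        "cache_write_rate": 100 * cw / (cr + inp + cw) if cr + inp + cw else 0.0,
        "requests": len(records),
        "avg_latency_ms": sum(r.get("duration_ms", 0) for r in records) / len(records) if records else 0.0,
        "estimated_cost_usd": round(cost, 2),
    }

# The example record from the format section:
rec = {"input_tokens": 3, "output_tokens": 418,
       "cache_read_tokens": 14044, "cache_write_tokens": 26042, "duration_ms": 5858}
print(workflow_stats([rec])["total_tokens"])  # 40507
```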
### Step 4: Identify Optimization Opportunities

Flag workflows with these patterns:

| Pattern | Threshold | Recommendation |
|---------|-----------|----------------|
| Zero cache hits | cache_hit_rate = 0% | Enable prompt caching |
| Low cache hits | cache_hit_rate < 50% | Review cache breakpoints |
| High cache write vs read | cache_write > cache_read | Cache is being written but not reused — check if conversation turns are too short |
| High input/output ratio | ratio > 100:1 | Reduce system prompt or MCP tool surface |
| Many small requests | >10 requests, <50 output tokens each | Batch requests or combine tool calls |
| High total cost | >$1.00 per run | Review if workflow is doing too much |
| Increasing trend | >20% increase vs last report | Investigate what changed |
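Most of these thresholds can be encoded directly. A sketch, assuming an aggregated per-workflow `stats` dict whose keys are illustrative rather than a defined schema (the per-request patterns, such as "many small requests", would need the raw records instead):

```python
def flag_optimizations(stats):
    """Apply the per-workflow Step 4 thresholds to an aggregated stats dict."""
    flags = []
    if stats["cache_hit_rate"] == 0:
        flags.append("Enable prompt caching")
    elif stats["cache_hit_rate"] < 50:
        flags.append("Review cache breakpoints")
    if stats["cache_write_tokens"] > stats["cache_read_tokens"]:
        flags.append("Cache written but not reused")
    if stats["io_ratio"] is not None and stats["io_ratio"] > 100:
        flags.append("Reduce system prompt or MCP tool surface")
    if stats["cost_usd"] > 1.00:
        flags.append("Review if workflow is doing too much")
    return flags

demo = {"cache_hit_rate": 45.0, "cache_write_tokens": 90_000,
        "cache_read_tokens": 60_000, "io_ratio": 12.0, "cost_usd": 0.40}
print(flag_optimizations(demo))  # ['Review cache breakpoints', 'Cache written but not reused']
```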
### Step 5: Check for Historical Trends

Search for previous token usage report issues:

```bash
gh issue list --repo "$GITHUB_REPOSITORY" --label claude-token-usage-report --state all --limit 5 --json number,title,createdAt,url
```

If previous reports exist, compare current metrics to identify:

- Workflows with increasing token consumption
- Workflows that gained or lost prompt caching
- New workflows that started using the api-proxy
- Cost trend over time
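The ">20% increase vs last report" check from Step 4 reduces to a percent-change comparison against the previous report's totals, when they exist (a sketch; the example numbers are made up):

```python
def pct_change(current, previous):
    """Percent change vs the last report; None when there is no baseline."""
    if not previous:
        return None
    return 100.0 * (current - previous) / previous

# Hypothetical totals: 480K tokens this period vs 395K in the last report.
change = pct_change(480_000, 395_000)
print(change is not None and change > 20)  # True: flag as an increasing trend
```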
### Step 6: Create the Summary Issue

Create an issue with the following structure:

#### Title: `YYYY-MM-DD` (safe-outputs will automatically prefix this with "📊 Claude Token Usage Report")

#### Body structure:

```markdown
### Overview

**Period**: [start date] to [end date]
**Runs analyzed**: X of Y (Z had token data)
**Total tokens**: N across all workflows
**Estimated total cost**: $X.XX

### Workflow Summary

| Workflow | Runs | Total Tokens | Cost | Cache Rate | I/O Ratio | Top Model |
|----------|------|--------------|------|------------|-----------|-----------|
| smoke-claude | 2 | 395K | $0.46 | 99.5% | 0.6:1 | sonnet-4.6 |
| security-review | 1 | 120K | $0.22 | 85% | 2.3:1 | sonnet-4.6 |
| ... | | | | | | |

### 🔍 Optimization Opportunities

1. **secret-digger-claude** — 45% cache hit rate, high cache writes
   - Cache is being created but not fully reused across turns
   - Consider restructuring the prompt to maximize cache prefix reuse
2. ...

<details>
<summary><b>Per-Workflow Details</b></summary>

#### smoke-claude

- **Runs**: 2 (run 123, run 456)
- **Requests**: 12 total (avg 6/run)
- **Models**: claude-haiku-4.5 (4 reqs), claude-sonnet-4.6 (8 reqs)
- **Tokens**: 395K total (1.5K input, 2.5K output, 304K cache_read, 87K cache_write)
- **Cache hit rate**: 99.5%
- **Cache write rate**: 22.3%
- **Avg latency**: 3,800ms/request
- **Estimated cost**: $0.46

#### security-review

...

</details>

<details>
<summary><b>Workflows Without Token Data</b></summary>

The following workflows either don't use `--enable-api-proxy` or ran before token tracking was enabled:

- secret-digger-claude (1 run — no agent-artifacts)

</details>

### Historical Trend

[If previous reports exist, show comparison. Otherwise note: "This is the first Claude token usage report. Historical trends will be available in future reports."]

### Previous Report

[Link to previous report issue if one exists, otherwise omit this section]
```
## Important Guidelines

- **Time budget** — You have 15 minutes total. Spend at most 8 minutes on data collection (steps 1-2) and reserve the rest for analysis and issue creation. If artifact downloads are slow, limit to the 5 most recent runs.
- **Prefer bash over MCP** for data collection — `gh run download` via bash is much faster than MCP `get_job_logs` for retrieving artifacts.
- **Do NOT fail** if no token data is available. Create a minimal report explaining that token tracking is new and which workflows need instrumentation.
- **Clean up** temporary directories after processing.
- **Respect rate limits** — download artifacts one at a time, not in parallel.
- **Use `--perPage` parameters** when listing runs to avoid token limits on MCP responses.
- **Wrap verbose output** in `<details>` blocks for progressive disclosure.
- **Round costs** to 2 decimal places and token counts to the nearest thousand for readability.
- **Sort workflows** by estimated cost (highest first) in the summary table.
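The rounding and sorting guidelines might look like this in the report-building step (helper names are illustrative):

```python
def fmt_tokens(n):
    """Token counts rounded to the nearest thousand, e.g. 395123 -> '395K'."""
    return f"{round(n / 1000)}K" if n >= 1000 else str(n)

def fmt_cost(c):
    """Costs rounded to 2 decimal places."""
    return f"${c:.2f}"

# (workflow, total_tokens, estimated_cost_usd) rows for the summary table.
rows = [("security-review", 120_000, 0.22), ("smoke-claude", 395_000, 0.46)]
rows.sort(key=lambda r: r[2], reverse=True)  # highest estimated cost first
print([(w, fmt_tokens(t), fmt_cost(c)) for w, t, c in rows])
```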