Comprehensive token usage analysis and cost tracking for OpenCode AI sessions
Track and optimize your token usage across system prompts, user messages, tool outputs, and more. Get detailed breakdowns, accurate cost estimates, and visual insights for your AI development workflow.
1. Install globally:

   ```bash
   npm install -g @ramtinj95/opencode-tokenscope
   ```

2. Add to your `opencode.json` (create one in your project root, or `~/.config/opencode/opencode.json` for global config):

   ```json
   {
     "$schema": "https://opencode.ai/config.json",
     "plugin": ["@ramtinj95/opencode-tokenscope"]
   }
   ```

3. Create the `/tokenscope` command by creating `~/.config/opencode/command/tokenscope.md`:

   ```bash
   mkdir -p ~/.config/opencode/command
   cat > ~/.config/opencode/command/tokenscope.md << 'EOF'
   ---
   description: Analyze token usage across the current session with detailed breakdowns by category
   ---
   Call the tokenscope tool directly without delegating to other agents.
   Then cat the token-usage-output.txt. DON'T DO ANYTHING ELSE WITH THE OUTPUT.
   EOF
   ```

4. Restart OpenCode and run `/tokenscope`.
To always get the latest version automatically, use `@latest`:

```json
{
  "plugin": ["@ramtinj95/opencode-tokenscope@latest"]
}
```

Alternatively, use the install script:

```bash
curl -sSL https://raw.githubusercontent.com/ramtinJ95/opencode-tokenscope/main/plugin/install.sh | bash
```

Then restart OpenCode and run `/tokenscope`.
| Config in `opencode.json` | Behavior |
|---|---|
| `"@ramtinj95/opencode-tokenscope"` | Uses the version installed at install time. Never auto-updates. |
| `"@ramtinj95/opencode-tokenscope@latest"` | Fetches the latest version every time OpenCode starts. |
| `"@ramtinj95/opencode-tokenscope@1.4.0"` | Pins to exact version 1.4.0. Never updates. |
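For example, a fully pinned config (with the `$schema` line from the install example above) would look like:

```json
{
  "$schema": "https://opencode.ai/config.json",
  "plugin": ["@ramtinj95/opencode-tokenscope@1.4.0"]
}
```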
To manually update:

```bash
npm update -g @ramtinj95/opencode-tokenscope
```

Or use `@latest` in your `opencode.json` to auto-update on OpenCode restart.
Option 1: Local script (if you have the plugin installed)

```bash
bash ~/.config/opencode/plugin/install.sh --update
```

Option 2: Remote script (always works)

```bash
curl -sSL https://raw.githubusercontent.com/ramtinJ95/opencode-tokenscope/main/plugin/install.sh | bash -s -- --update
```

The `--update` flag skips dependency installation for faster updates.
Simply type in OpenCode:
/tokenscope
The plugin will:
- Analyze the current session
- Count tokens across all categories
- Analyze all subagent (Task tool) child sessions recursively
- Calculate costs based on API telemetry
- Save a detailed report to `token-usage-output.txt`
- `sessionID`: Analyze a specific session instead of the current one
- `limitMessages`: Limit entries shown per category (1-10, default: 3)
- `includeSubagents`: Include subagent child session costs (default: true)
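As a sketch of how these parameters fit together (the wrapper syntax is hypothetical and depends on how your agent invokes the tool; only the parameter names and defaults come from the list above), a tool invocation might carry arguments like:

```json
{
  "sessionID": "ses_abc123",
  "limitMessages": 5,
  "includeSubagents": false
}
```

Here `ses_abc123` is a placeholder session ID. Omitting all three falls back to the current session, 3 entries per category, and subagent analysis enabled.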
To view the report:

```bash
cat token-usage-output.txt
```

- 5-Category Breakdown: System prompts, user messages, assistant responses, tool outputs, and reasoning traces
- Visual Charts: Easy-to-read ASCII bar charts with percentages and token counts
- Smart Inference: Automatically infers system prompts from API telemetry (since they're not exposed in session messages)
- System Prompt Components: See token distribution across base prompt, tool definitions, environment context, project tree, and custom instructions
- Automatic Estimation: Estimates the breakdown from `cache_write` tokens when system prompt content isn't directly available
- Tool Count: Shows how many tools are loaded and their combined token cost
- Per-Tool Estimates: Lists all enabled tools with estimated schema token costs
- Argument Analysis: Infers argument count and complexity from actual tool calls in the session
- Complexity Detection: Distinguishes between simple arguments and complex ones (arrays/objects)
- Cache Hit Rate: Visual display of cache read vs fresh input token distribution
- Cost Savings: Calculates actual savings from prompt caching
- Effective Rate: Shows what you're actually paying per token vs standard rates
- 41+ Models Supported: Comprehensive pricing database for Claude, GPT, DeepSeek, Llama, Mistral, and more
- Cache-Aware Pricing: Properly handles cache read/write tokens with discounted rates
- Session-Wide Billing: Aggregates costs across all API calls in your session
- Child Session Analysis: Recursively analyzes all subagent sessions spawned by the Task tool
- Aggregated Totals: Shows combined tokens, costs, and API calls across main session and all subagents
- Per-Agent Breakdown: Lists each subagent with its type, token usage, cost, and API call count
- Optional Toggle: Enable/disable subagent analysis with the `includeSubagents` parameter
- Tool Usage Stats: Track which tools consume the most tokens and how many times each is called
- API Call Tracking: See total API calls for main session and subagents
- Top Contributors: Identify the biggest token consumers
- Model Normalization: Handles the `provider/model` format automatically
- Multi-Tokenizer Support: Uses official tokenizers (tiktoken for OpenAI, transformers for others)
- Configurable Sections: Enable/disable analysis features via `tokenscope-config.json`
- Available Skills: Shows all skills listed in the skill tool definition with their token cost
- Loaded Skills: Tracks skills loaded during the session with call counts
- Cumulative Token Tracking: Accurately counts token cost when skills are called multiple times
This section explains how OpenCode handles skills and why the token counting works the way it does.
Skills are on-demand instructions that agents can load via the skill tool. They have two token consumption points:
1. Available Skills List: Skill names and descriptions are embedded in the `skill` tool's description as XML. This is part of the system prompt and costs tokens on every API call.
2. Loaded Skill Content: When an agent calls `skill({ name: "my-skill" })`, the full SKILL.md content is loaded and returned as a tool result.
Important: OpenCode does not deduplicate skill content. Each time the same skill is called, the full content is added to context again as a new tool result.
This means if you call `skill({ name: "git-release" })` 3 times and it contains 500 tokens:

- Total context cost = 500 × 3 = 1,500 tokens
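Since the full content is re-sent on every call, the cumulative cost is plain multiplication. A minimal sketch (the function name is ours, and the 500-token figure is the illustrative value from above):

```python
def skill_context_cost(skill_tokens: int, call_count: int) -> int:
    """OpenCode appends the full skill content as a new tool result on
    every call, so context cost grows linearly with the call count."""
    return skill_tokens * call_count

# A 500-token skill called 3 times:
print(skill_context_cost(500, 3))  # 1500 tokens added to context
```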
This behavior is by design in OpenCode. You can verify this in the source code:
| Component | Source Link |
|---|---|
| Skill tool execution | packages/opencode/src/tool/skill.ts |
| Tool result handling | packages/opencode/src/session/message-v2.ts |
| Skill pruning protection | packages/opencode/src/session/compaction.ts |
OpenCode protects skill tool results from being pruned during context management. From the compaction.ts source:
```ts
const PRUNE_PROTECTED_TOOLS = ["skill"]
```

This means loaded skill content stays in context for the duration of the session (unless full session compaction/summarization occurs).
- Call skills sparingly: Since each call adds full content, avoid calling the same skill multiple times
- Monitor skill token usage: Use TokenScope to see which skills consume the most tokens
- Consider skill size: Large skills (1000+ tokens) can quickly inflate context when called repeatedly
```
═══════════════════════════════════════════════════════════════════════════
 Token Analysis: Session ses_50c712089ffeshuuuJPmOoXCPX
 Model: claude-opus-4-5
═══════════════════════════════════════════════════════════════════════════

TOKEN BREAKDOWN BY CATEGORY
─────────────────────────────────────────────────────────────────────────
Estimated using tokenizer analysis of message content:

Input Categories:
  SYSTEM     ██████████████░░░░░░░░░░░░░░░░  45.8% (22,367)
  USER       ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   0.8% (375)
  TOOLS      ████████████████░░░░░░░░░░░░░░  53.5% (26,146)
  Subtotal: 48,888 estimated input tokens

Output Categories:
  ASSISTANT  ██████████████████████████████ 100.0% (1,806)
  Subtotal: 1,806 estimated output tokens

Local Total: 50,694 tokens (estimated)

TOOL USAGE BREAKDOWN
─────────────────────────────────────────────────────────────────────────
  bash       ██████████░░░░░░░░░░░░░░░░░░░░  34.0% (8,886)   4x
  read       ██████████░░░░░░░░░░░░░░░░░░░░  33.1% (8,643)   3x
  task       ████████░░░░░░░░░░░░░░░░░░░░░░  27.7% (7,245)   4x
  webfetch   █░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   4.9% (1,286)   1x
  tokenscope ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   0.3% (75)      2x
  batch      ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   0.0% (11)      1x

TOP CONTRIBUTORS
─────────────────────────────────────────────────────────────────────────
  • System (inferred from API)  22,367 tokens (44.1%)
  • bash                         8,886 tokens (17.5%)
  • read                         8,643 tokens (17.0%)
  • task                         7,245 tokens (14.3%)
  • webfetch                     1,286 tokens (2.5%)

═══════════════════════════════════════════════════════════════════════════
MOST RECENT API CALL
─────────────────────────────────────────────────────────────────────────
Raw telemetry from last API response:
  Input (fresh):      2 tokens
  Cache read:    48,886 tokens
  Cache write:       54 tokens
  Output:           391 tokens
  ─────────────────────────────────
  Total:         49,333 tokens

═══════════════════════════════════════════════════════════════════════════
SESSION TOTALS (All 15 API calls)
─────────────────────────────────────────────────────────────────────────
Total tokens processed across the entire session (for cost calculation):
  Input tokens:       10  (fresh tokens across all calls)
  Cache read:    320,479  (cached tokens across all calls)
  Cache write:    51,866  (tokens written to cache)
  Output tokens:   3,331  (all model responses)
  ─────────────────────────────────
  Session Total: 375,686 tokens (for billing)

═══════════════════════════════════════════════════════════════════════════
ESTIMATED SESSION COST (API Key Pricing)
─────────────────────────────────────────────────────────────────────────
You appear to be on a subscription plan (API cost is $0).
Here's what this session would cost with direct API access:
  Input tokens:       10 × $5.00/M  = $0.0001
  Output tokens:   3,331 × $25.00/M = $0.0833
  Cache read:    320,479 × $0.50/M  = $0.1602
  Cache write:    51,866 × $6.25/M  = $0.3242
  ─────────────────────────────────────────────────────────────────────
  ESTIMATED TOTAL: $0.5677

Note: This estimate uses standard API pricing from models.json.
Actual API costs may vary based on provider and context size.

═══════════════════════════════════════════════════════════════════════════
CONTEXT BREAKDOWN (Estimated from cache_write tokens)
─────────────────────────────────────────────────────────────────────────
  Base System Prompt    █████████████████████████░░░░░  ~42,816 tokens
  Tool Definitions (14) ███░░░░░░░░░░░░░░░░░░░░░░░░░░░   ~4,900 tokens
  Environment Context   ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░     ~150 tokens
  Project Tree          ██░░░░░░░░░░░░░░░░░░░░░░░░░░░░   ~4,000 tokens
  ───────────────────────────────────────────────────────────────────────
  Total Cached Context: ~51,866 tokens

Note: Breakdown estimated from first cache_write. Actual distribution may vary.

═══════════════════════════════════════════════════════════════════════════
TOOL DEFINITION COSTS (Estimated from argument analysis)
─────────────────────────────────────────────────────────────────────────
  Tool        Est. Tokens   Args   Complexity
  ───────────────────────────────────────────────────────────────────────
  task        ~480          3      complex (arrays/objects)
  batch       ~410          1      complex (arrays/objects)
  edit        ~370          4      simple
  read        ~340          3      simple
  bash        ~340          3      simple
  ───────────────────────────────────────────────────────────────────────
  Total: ~4,520 tokens (14 enabled tools)

Note: Estimates inferred from tool call arguments in this session.
Actual schema tokens may vary +/-20%.

═══════════════════════════════════════════════════════════════════════════
CACHE EFFICIENCY
─────────────────────────────────────────────────────────────────────────
Token Distribution:
  Cache Read:  320,479 tokens ██████████████████████████░░░░  86.2%
  Fresh Input:  51,320 tokens ████░░░░░░░░░░░░░░░░░░░░░░░░░░  13.8%
  ───────────────────────────────────────────────────────────────────────
  Cache Hit Rate: 86.2%

Cost Analysis (claude-opus-4-5 @ $5.00/M input, $0.50/M cache read):
  Without caching: $1.8590 (371,799 tokens x $5.00/M)
  With caching:    $0.4169 (fresh x $5.00/M + cached x $0.50/M)
  ───────────────────────────────────────────────────────────────────────
  Cost Savings: $1.4421 (77.6% reduction)
  Effective Rate: $1.12/M tokens (vs. $5.00/M standard)

═══════════════════════════════════════════════════════════════════════════
SUBAGENT COSTS (4 child sessions, 23 API calls)
─────────────────────────────────────────────────────────────────────────
  docs     $0.3190 (194,701 tokens, 8 calls)
  general  $0.2957 (104,794 tokens, 4 calls)
  docs     $0.2736 (69,411 tokens, 4 calls)
  general  $0.5006 (197,568 tokens, 7 calls)
  ─────────────────────────────────────────────────────────────────────────
  Subagent Total: $1.3888 (566,474 tokens, 23 calls)

═══════════════════════════════════════════════════════════════════════════
SUMMARY
─────────────────────────────────────────────────────────────────────────
                    Cost      Tokens     API Calls
  Main session:   $ 0.5677    375,686       15
  Subagents:      $ 1.3888    566,474       23
  ─────────────────────────────────────────────────────────────────────────
  TOTAL:          $ 1.9565    942,160       38
═══════════════════════════════════════════════════════════════════════════
```
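The report's cost arithmetic can be reproduced from the four telemetry counters and the per-million-token rates. A sketch using the session totals and the claude-opus-4-5 rates printed in the sample report (this mirrors the arithmetic only, not the plugin's actual implementation):

```python
# $ per million tokens, as printed in the sample report
RATES = {"input": 5.00, "output": 25.00, "cache_read": 0.50, "cache_write": 6.25}

# Session totals from the SESSION TOTALS section above
usage = {"input": 10, "output": 3_331, "cache_read": 320_479, "cache_write": 51_866}

def session_cost(usage: dict, rates: dict) -> float:
    """Sum each token counter times its per-million-token rate."""
    return sum(usage[k] * rates[k] / 1_000_000 for k in usage)

total = session_cost(usage, RATES)
print(f"${total:.4f}")  # $0.5677, matching the report's ESTIMATED TOTAL

# Cache savings compare against paying the fresh-input rate for every
# prompt token (fresh + cached), as the CACHE EFFICIENCY section does.
fresh, cached = 51_320, 320_479
without_cache = (fresh + cached) * RATES["input"] / 1_000_000
with_cache = fresh * RATES["input"] / 1_000_000 + cached * RATES["cache_read"] / 1_000_000
print(f"saved ${without_cache - with_cache:.4f}")
```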
41+ models with accurate pricing:
- Claude Opus 4.5, 4.1, 4
- Claude Sonnet 4, 4-5, 3.7, 3.5, 3
- Claude Haiku 4-5, 3.5, 3
- GPT-4, GPT-4 Turbo, GPT-4o, GPT-4o Mini
- GPT-3.5 Turbo
- GPT-5 and all its variations
- DeepSeek (R1, V2, V3)
- Llama (3.1, 3.2, 3.3)
- Mistral (Large, Small)
- Qwen, Kimi, GLM, Grok
- And more...
Free/Open models are marked with zero pricing.
The plugin includes a `tokenscope-config.json` file with these defaults:

```json
{
  "enableContextBreakdown": true,
  "enableToolSchemaEstimation": true,
  "enableCacheEfficiency": true,
  "enableSubagentAnalysis": true,
  "enableSkillAnalysis": true
}
```

Set any option to false to hide that section from the output.
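For example, to hide the subagent section from the report, flip its flag to false in `tokenscope-config.json` (the other options are shown with their defaults):

```json
{
  "enableContextBreakdown": true,
  "enableToolSchemaEstimation": true,
  "enableCacheEfficiency": true,
  "enableSubagentAnalysis": false,
  "enableSkillAnalysis": true
}
```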
- Verify `tokenscope.md` exists: `ls ~/.config/opencode/command/tokenscope.md`
- If missing, create it (see Installation step 3)
- Restart OpenCode completely
The plugin uses API telemetry (ground truth). If counts seem off:
- Expected ~2K difference from the TUI: the plugin analyzes the session before its own response is added
- Model detection: Check that the model name is recognized in the output
- All processing is local: No session data sent to external services
- Open source: Audit the code yourself
Contributions welcome! Ideas for enhancement:
- Historical trend analysis
- Export to CSV/JSON/PDF
- Optimization suggestions
- Custom categorization rules
- Real-time monitoring with alerts
- Compare sessions
- Token burn rate calculation
- Issues: GitHub Issues
- Discussions: GitHub Discussions