English | 中文
A unified LLM token cost CLI for usage normalization, cost estimation, and multi-provider pricing. 💰
LLM providers report token usage differently: OpenAI says `prompt_tokens`, Anthropic says `input_tokens`, the Vercel AI SDK says `inputTokens`. Pricing is scattered across provider pages and changes frequently.
TokenCost solves this:
- Normalize any provider's token usage to a unified struct
- Resolve pricing from LiteLLM catalog, OpenRouter API, or built-in static maps
- Calculate USD cost with a single function call
```bash
cargo install tokencost
```
```bash
# Look up pricing
tokencost price gpt-4o
tokencost price qwen3-max
tokencost price kimi-k2.5

# Calculate cost
tokencost cost claude-sonnet-4-20250514 --input 10000 --output 2000

# Normalize any provider's format
tokencost normalize '{"prompt_tokens":100,"completion_tokens":50}'

# Aggregate costs from JSONL
tokencost tally calls.jsonl
```

```rust
use tokencost::{normalize_token_usage, estimate_usd_cost, pricing_from_usd_per_million};
use serde_json::json;

// Normalize any provider's token usage format
let raw = json!({"prompt_tokens": 100, "completion_tokens": 50});
let usage = normalize_token_usage(&raw).unwrap();
assert_eq!(usage.input_tokens, 100);

// Calculate cost
let pricing = pricing_from_usd_per_million(3.0, 15.0).unwrap();
let cost = estimate_usd_cost(Some(&usage), Some(&pricing)).unwrap();
println!("Total: ${:.6}", cost.total_usd);
```

| Command | Description |
|---|---|
| `tokencost normalize <json>` | Normalize token usage from any provider format |
| `tokencost price <model>` | Look up pricing for a model |
| `tokencost cost <model> -i N -o N` | Calculate cost for given token counts |
| `tokencost tally <file>` | Aggregate costs from a JSONL file |
| `tokencost catalog` | Update pricing catalog caches |
All commands support `--json` for machine-readable output.
```
# Price lookup with JSON output
$ tokencost --json price deepseek-chat
{
  "model": "deepseek-chat",
  "input_usd_per_million": 0.27,
  "output_usd_per_million": 1.10,
  "cache_read_usd_per_million": 0.07
}

# Cost calculation
$ tokencost cost gpt-4o --input 10000 --output 2000
Input:  10000 tokens x $2.50 / 1M tokens = $0.0250
Output:  2000 tokens x $10.00 / 1M tokens = $0.0200
Total:  $0.0450
```
```
# Batch aggregation
$ tokencost tally calls.jsonl
┌──────────────────────────┬───────┬────────┬────────┬───────────┐
│ Model                    │ Calls │ Input  │ Output │ Cost      │
╞══════════════════════════╪═══════╪════════╪════════╪═══════════╡
│ claude-sonnet-4-20250514 │ 42    │ 120300 │ 30100  │ $0.812700 │
│ gpt-4o                   │ 18    │ 54000  │ 12600  │ $0.261000 │
│ TOTAL                    │ 60    │        │        │ $1.073700 │
└──────────────────────────┴───────┴────────┴────────┴───────────┘
```

| Provider | Models | Cache Pricing |
|---|---|---|
| Anthropic | claude-sonnet-4, claude-opus-4, claude-haiku-4.5 | read + write |
| OpenAI | gpt-4o, gpt-4.1, o1, o3, o4-mini, etc. | read |
| Google Gemini | gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash | reasoning |
| DeepSeek | deepseek-chat, deepseek-reasoner | read |
| Alibaba Qwen | qwen3-max, qwen-plus, qwen-turbo, qwen-flash, etc. | - |
| Zhipu GLM | glm-5, glm-4.7, glm-4.5, flash (free) | - |
| Moonshot Kimi | kimi-k2.5, kimi-k2, kimi-k2-thinking, etc. | read |
| ByteDance Doubao | doubao-2.0-pro, doubao-1.5-pro, doubao-lite, etc. | - |
| MiniMax | minimax-m2.5, minimax-m2, minimax-01 | - |
| StepFun | step-3, step-3.5-flash, step-2-mini | read |
| Baichuan | baichuan4, baichuan3-turbo | - |
| 01.AI | yi-lightning, yi-large | - |
| iFlytek Spark | spark-max, spark-lite (free) | - |
| Source | Coverage | Cache Strategy |
|---|---|---|
| LiteLLM | 1,200+ models across all providers | Disk cache, 7-day TTL, ETag/Last-Modified |
| OpenRouter | 800+ models | In-memory, requires API key |
tokencost normalizes 15+ field name variants across providers:
| Provider | Input | Output | Reasoning | Cache |
|---|---|---|---|---|
| OpenAI | prompt_tokens | completion_tokens | - | - |
| Anthropic | input_tokens | output_tokens | - | cache_creation_input_tokens, cache_read_input_tokens |
| Vercel AI SDK | inputTokens | outputTokens | reasoningTokens | - |
| | prompt_tokens_total | completion_tokens | - | - |
All normalized to a single TokenUsage struct.
| Feature | Default | Description |
|---|---|---|
| `cli` | yes | CLI binary (includes catalog) |
| `catalog` | yes | Network catalog fetching (LiteLLM, OpenRouter) |
| (none) | - | Pure computation library, zero network dependencies |
```toml
# Use as a pure library without network/CLI
[dependencies]
tokencost = { version = "0.1", default-features = false }
```

```bash
# From crates.io
cargo install tokencost

# From source
git clone https://github.com/justinhuangcode/tokencost
cd tokencost
cargo install --path .
```

MIT