Introduction
Compress LLM prompts to cut costs and improve accuracy.
What is The Token Company LLM input compression?
Compression middleware that removes the low-signal tokens bloating your LLM context before the prompt reaches the model.
Why compress?
- Fit more context
- Faster inference
- Improve long-context accuracy
- Cut costs by 10-40%
Quick example
from thetokencompany import TheTokenCompany
client = TheTokenCompany(api_key="ttc-...")
result = client.compress("Your long prompt text...")
print(result.output) # compressed text
print(result.tokens_saved) # tokens removed
print(result.compression_ratio) # e.g. 3.2x
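To relate the numbers printed above to a bill, the sketch below converts a compression ratio into the share of input tokens removed and turns a saved-token count into an approximate dollar figure. It assumes the ratio means original tokens divided by compressed tokens, which this page does not state. The helper names and the per-million-token price parameter are hypothetical and are not part of the thetokencompany package.

```python
def fraction_removed(compression_ratio: float) -> float:
    """Share of input tokens removed.

    ASSUMPTION: compression_ratio = original_tokens / compressed_tokens.
    """
    if compression_ratio < 1:
        raise ValueError("compression_ratio is expected to be >= 1")
    return 1.0 - 1.0 / compression_ratio


def input_cost_saved(tokens_saved: int, usd_per_million_input_tokens: float) -> float:
    """Approximate input-side savings in USD for a given number of saved tokens."""
    return tokens_saved / 1_000_000 * usd_per_million_input_tokens


if __name__ == "__main__":
    # Placeholder values; substitute result.compression_ratio, result.tokens_saved,
    # and your provider's actual input-token price.
    print(f"{fraction_removed(3.2):.1%} of input tokens removed")
    print(f"${input_cost_saved(500_000, 2.0):.2f} saved on input tokens")
```

Only input tokens are counted here; output tokens are billed separately by most providers, so the effect on a total bill will be smaller than the input-side figure.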
How it works
- Send your prompt to the TTC compression API
- Receive compressed text back
- Pass the compressed text to any LLM (OpenAI, Claude, Gemini, etc.)
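The three steps above can be sketched as a small provider-agnostic function. This is a minimal illustration, not the official integration: `compress_then_complete` and the two stand-in callables are hypothetical names. In real use, `compress` would wrap a call such as `client.compress(text).output` from the quick example, and `complete` would wrap whichever LLM SDK you already use.

```python
from typing import Callable


def compress_then_complete(
    prompt: str,
    compress: Callable[[str], str],
    complete: Callable[[str], str],
) -> str:
    """Compress the prompt, then hand the compressed text to an LLM."""
    compressed = compress(prompt)  # steps 1 and 2: send the prompt, receive compressed text
    return complete(compressed)    # step 3: pass the compressed text to any LLM


# Stand-ins so the sketch runs without network access or API keys.
def demo_compress(text: str) -> str:
    # Collapses whitespace only; this is NOT the TTC compression algorithm.
    return " ".join(text.split())


def demo_complete(text: str) -> str:
    # Placeholder for a real model call.
    return f"[model saw {len(text)} characters]"


if __name__ == "__main__":
    print(compress_then_complete("Your   long   prompt   text...", demo_compress, demo_complete))
```

Keeping the compressor and the model behind plain callables means you can swap either one without touching the rest of the pipeline.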
Next steps
- Quickstart - get running in 2 minutes
- OpenAI integration - automatic compression with OpenAI
- Anthropic integration - automatic compression with Claude
- Compression guide - fine-tune aggressiveness, gzip, and protected text