Collective Compress

The new standard of context compression.
Trim long prompts before inference: keep what matters and cut ~65% of tokens at a 35% budget.

Quick start · API · Website · Architecture

by Desenyon

Every oversized prompt burns GPU cycles, grid power, and cooling water on tokens that never needed to run. Collective Compress learns which lines to keep for the current question — under a fixed token budget — using a ~5K-parameter CPU policy. No PyTorch. No GPU. Sub-millisecond eviction before your LLM call.

At a glance

Metric	Collective Compress	Truncation / FIFO
KV savings @ 35% budget	~65%	~65%
Oracle recall	100%	~25%
Policy size	~5K params	rule-based
Runtime	Rust / CPU	CPU
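
The savings row follows directly from the budget: keeping 35% of tokens leaves roughly 65% that never reach the model. A minimal sketch of that arithmetic is below. The function names are hypothetical and the rounding rule (round to nearest) is an assumption, not something the crate documents here.

```rust
/// Number of tokens kept for a given budget ratio (hypothetical helper).
/// Assumption: the ratio is clamped to [0, 1] and the result rounds to nearest.
fn kept_tokens(total_tokens: usize, budget_ratio: f64) -> usize {
    let ratio = budget_ratio.clamp(0.0, 1.0);
    let kept = (total_tokens as f64 * ratio).round() as usize;
    kept.min(total_tokens)
}

/// Percentage of tokens (and so KV-cache entries) that are dropped before inference.
fn savings_pct(total_tokens: usize, kept: usize) -> f64 {
    if total_tokens == 0 {
        return 0.0;
    }
    100.0 * (1.0 - kept as f64 / total_tokens as f64)
}
```

Both columns of the table report the same savings because savings depend only on the budget; the policies differ in which tokens they keep, which is what the recall row measures.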

Quick start

Website (Vercel)

The marketing site and in-browser demo live in web/. Deploy to Vercel with the root vercel.json, or run locally:

cd web && npm install && npm run dev

Open http://localhost:3000 to see the WebGL shader hero, the live compression demo, and the API dashboard.

API server

export CC_ADMIN_TOKEN=your-secret-here
export CC_KEY_STORE_FILE=.data/api_keys.json
cargo run -p collective-compress-api

Endpoint	Description
`GET /docs`	Interactive OpenAPI (Swagger UI)
`POST /v1/compress`	Compress with `cc_live_…` API key
`POST /v1/compress/batch`	Batch compress (up to 32)
`POST /v1/compress/compare`	Compare all eviction policies
`POST /v1/admin/keys`	Create keys (admin token)
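
Because the batch endpoint accepts up to 32 items per call, a client with more contexts has to split them first. The sketch below shows one way to do that with the standard library; `into_batches` is a hypothetical helper, and how the server responds to an oversized batch is not stated here.

```rust
/// Maximum items per `POST /v1/compress/batch` call, per the endpoint table.
const MAX_BATCH: usize = 32;

/// Splits a list of contexts into consecutive batches of at most `MAX_BATCH` items.
/// The last batch may be shorter; order is preserved.
fn into_batches(contexts: &[String]) -> Vec<&[String]> {
    contexts.chunks(MAX_BATCH).collect()
}
```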

curl -X POST http://localhost:8080/v1/compress \
  -H "X-API-Key: cc_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"context":"long text…","query":"What matters?","budget_ratio":0.35}'
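
When calling the endpoint from Rust, the context usually contains quotes and newlines that must be escaped before they go into the JSON body. The following is a dependency-free sketch that builds the same three-field body as the curl example; `json_escape` and `compress_body` are hypothetical helpers, and in a real client a JSON library would normally do this work.

```rust
/// Escapes a string for embedding inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Builds the JSON body for `POST /v1/compress` with the fields shown in the curl example.
/// Assumption: `budget_ratio` is a finite number (NaN or infinity would not be valid JSON).
fn compress_body(context: &str, query: &str, budget_ratio: f64) -> String {
    format!(
        "{{\"context\":\"{}\",\"query\":\"{}\",\"budget_ratio\":{}}}",
        json_escape(context),
        json_escape(query),
        budget_ratio
    )
}
```

The resulting string is sent with the `X-API-Key` and `Content-Type: application/json` headers, exactly as in the curl call.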

CLI

cargo build --release
echo "def foo(): return 42" | ./target/release/collective-compress -q "foo function"
./target/release/collective-compress demo   # middle-truncation failure case

Library

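use collective_compress::compress_context;

// Wrapped in `main` so that `?` compiles. Assumes the crate's error type
// implements `std::error::Error`.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Arguments: context, query, budget ratio (keep 35% of tokens),
    // and an optional fourth argument left unset.
    let result = compress_context(
        "long agent context…",
        "What does fetch return when the row is missing?",
        0.35,
        None,
    )?;
    println!("{:.1}% KV saved", result.kv_savings_pct());
    Ok(())
}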

How it works

Long context + question
        ↓
  tokenize + 9-dim feature extraction (CPU)
        ↓
  ~5K-param MLP scores each token
        ↓
  eviction policy keeps budget + sinks + recent tail
        ↓
  compressed text → your LLM
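
The eviction step in the pipeline above can be sketched as follows. This is an illustration, not the crate's implementation: `select_kept` and its parameters are hypothetical, and it assumes that sink and tail tokens count toward the budget rather than being added on top of it.

```rust
/// Picks which token indices survive eviction.
/// Assumptions: the first `sinks` tokens and the last `tail` tokens are pinned,
/// pinned tokens count toward `budget`, and the remaining slots go to the
/// highest-scoring unpinned tokens.
fn select_kept(scores: &[f32], budget: usize, sinks: usize, tail: usize) -> Vec<usize> {
    let n = scores.len();
    let mut keep = vec![false; n];
    let mut used = 0;

    // Pin sink tokens at the start and the most recent tokens at the end.
    let pinned = (0..sinks.min(n)).chain(n.saturating_sub(tail)..n);
    for i in pinned {
        if used < budget && !keep[i] {
            keep[i] = true;
            used += 1;
        }
    }

    // Rank the unpinned tokens by score, highest first.
    let mut rest: Vec<usize> = (0..n).filter(|&i| !keep[i]).collect();
    rest.sort_by(|&a, &b| scores[b].total_cmp(&scores[a]));

    // Fill whatever budget is left with the best-scoring tokens.
    for i in rest.into_iter().take(budget - used) {
        keep[i] = true;
    }

    // Return indices in original order so the compressed text stays readable.
    (0..n).filter(|&i| keep[i]).collect()
}
```

Returning indices in their original order matters: the kept tokens are rejoined into text for the LLM, so reordering them by score would scramble the prompt.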

Weights ship as assets/checkpoints/model.json. The browser demo loads the same file from web/public/model.json.
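
To make the scoring step concrete, here is a small sketch of an MLP forward pass over a 9-dimensional feature vector. The `Dense` struct, the ReLU and sigmoid activations, and any particular layer sizes are assumptions for illustration; the actual layout is defined by `model.json` and may differ.

```rust
/// One dense layer: `out = W * x + b`, with one row of `weights` per output unit.
struct Dense {
    weights: Vec<Vec<f32>>,
    bias: Vec<f32>,
}

impl Dense {
    fn forward(&self, x: &[f32]) -> Vec<f32> {
        self.weights
            .iter()
            .zip(&self.bias)
            .map(|(row, b)| row.iter().zip(x).map(|(w, xi)| w * xi).sum::<f32>() + b)
            .collect()
    }
}

/// Scores one token from its 9-dimensional feature vector.
/// Assumption: ReLU between layers and a sigmoid on the single output logit.
fn score_token(layers: &[Dense], features: &[f32; 9]) -> f32 {
    let mut x = features.to_vec();
    for (i, layer) in layers.iter().enumerate() {
        x = layer.forward(&x);
        if i + 1 < layers.len() {
            // ReLU on every layer except the last.
            x.iter_mut().for_each(|v| *v = v.max(0.0));
        }
    }
    // Squash the output logit into (0, 1).
    1.0 / (1.0 + (-x[0]).exp())
}

/// Total number of weights and biases, useful for checking a layout against "~5K".
fn param_count(layers: &[Dense]) -> usize {
    layers
        .iter()
        .map(|l| l.weights.iter().map(Vec::len).sum::<usize>() + l.bias.len())
        .sum()
}
```

As a sanity check on scale, a hypothetical 9→64→64→1 layout has 4,865 parameters, which is one shape consistent with the "~5K" figure. A network this small needs only a few thousand multiply-adds per token, which is why it runs comfortably on a CPU.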

Project layout

Path	Description
`crates/collective-compress`	Core compression engine
`crates/collective-compress-api`	Axum HTTP API + OpenAPI
`crates/collective-compress-cli`	Command-line tool
`web/`	Next.js site for Vercel (shader hero, demo, dashboard)
`assets/checkpoints/`	Trained policy weights (`model.json`)

Configuration

Variable	Default	Description
`CC_ADMIN_TOKEN`	—	Admin secret for key management
`CC_KEY_STORE_FILE`	in-memory (not persisted)	Path to a JSON file where API keys are persisted
`CC_PORT`	`8080`	API listen port
`NEXT_PUBLIC_API_URL`	—	API URL for the web dashboard

No third-party credentials are bundled. Configure your own secrets via environment variables.
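
For readers embedding the API in their own service, the table above maps naturally onto a small config loader. The sketch below is an assumption about how such a loader might look, not the API crate's actual code; `Config`, `load_config`, and `load_config_from_env` are hypothetical names. `NEXT_PUBLIC_API_URL` is omitted because it is read by the web dashboard rather than the Rust server.

```rust
use std::env;

/// Settings mirrored from the configuration table (names are hypothetical).
#[derive(Debug, PartialEq)]
struct Config {
    /// `CC_ADMIN_TOKEN`: no default, so key management stays unconfigured when absent.
    admin_token: Option<String>,
    /// `CC_KEY_STORE_FILE`: `None` means keys are held in memory only.
    key_store_file: Option<String>,
    /// `CC_PORT`: defaults to 8080.
    port: u16,
}

/// Builds a `Config` from any variable lookup, which keeps the logic testable
/// without touching the real process environment.
fn load_config(get: impl Fn(&str) -> Option<String>) -> Result<Config, String> {
    let port = match get("CC_PORT") {
        Some(raw) => raw
            .parse::<u16>()
            .map_err(|e| format!("invalid CC_PORT {raw:?}: {e}"))?,
        None => 8080,
    };
    Ok(Config {
        admin_token: get("CC_ADMIN_TOKEN"),
        key_store_file: get("CC_KEY_STORE_FILE"),
        port,
    })
}

/// Convenience wrapper that reads from the process environment.
fn load_config_from_env() -> Result<Config, String> {
    load_config(|name| env::var(name).ok())
}
```

Failing fast on an unparsable `CC_PORT` is a design choice in this sketch; silently falling back to 8080 would hide a typo in deployment config.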

Development

cargo test                    # 15 Rust tests
cd web && npm run build       # Next.js production build
docker build -t collective-compress .

Author

Desenyon — Collective Compress

License

MIT — see LICENSE.
