dethan3/ClawCompress

ClawCompress

High-performance token compression for AI coding agents

ClawCompress is a Rust-based token optimization engine for AI coding agents. It compresses noisy command output (git diff, test logs, build errors, directory listings, and more) before that output is inserted into the LLM context.

Typical reduction: 60–95% of the tokens in a command's output.

Works transparently with OpenClaw and Claude Code.


The Problem

AI coding agents run shell commands constantly. Raw output is verbose:

git diff        →  4,200 tokens  (full patch text)
cargo test      →  3,800 tokens  (all test output)
docker logs app →  5,100 tokens  (repeated log lines)

What the agent actually needs:

3 files changed  auth.rs +32-10  db.rs +8-2         →  80 tokens
Tests: 128 passed, 1 failed — test_login: got 500    →  60 tokens
ERROR connection refused (repeated 124x)              →  30 tokens

Without compression, tool output fills the context window and crowds out code.

How It Works

ClawCompress sits between command execution and the LLM context:

command execution
        ↓
capture stdout / stderr
        ↓
command classifier
        ↓
specialized compressor
        ↓
structured text summary
        ↓
LLM context

The full raw output is optionally saved as a local artifact for recovery, but never injected into the prompt.
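
The classify-then-compress flow above can be sketched in a few lines of Rust. This is a minimal illustration, not the project's actual code: `CommandKind`, `classify`, and `compress` are hypothetical names, and the per-kind summaries are stand-ins for the specialized compressors.

```rust
/// Command kinds the classifier can detect (hypothetical subset).
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum CommandKind {
    GitStatus,
    GitDiff,
    Test,
    Generic,
}

/// Classify a command line by its leading words.
pub fn classify(command: &str) -> CommandKind {
    let words: Vec<&str> = command.split_whitespace().collect();
    match words.as_slice() {
        ["git", "status", ..] => CommandKind::GitStatus,
        ["git", "diff", ..] => CommandKind::GitDiff,
        ["cargo", "test", ..] | ["pytest", ..] => CommandKind::Test,
        _ => CommandKind::Generic,
    }
}

/// Dispatch to a compressor. Each arm stands in for a specialized module.
pub fn compress(command: &str, raw: &str) -> String {
    match classify(command) {
        // Fallback (assumed behavior): keep only the first 20 lines.
        CommandKind::Generic => raw.lines().take(20).collect::<Vec<_>>().join("\n"),
        // Placeholder summary for the specialized kinds.
        kind => format!(
            "[{:?}] {} line(s) of raw output summarized",
            kind,
            raw.lines().count()
        ),
    }
}
```

Keeping classification separate from compression means a new command type only needs a new enum variant and one new match arm.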

Installation

git clone https://github.com/dethan3/ClawCompress
cd ClawCompress
cargo build --release
cp target/release/clawcompress ~/.local/bin/

Proxy Mode

Use clawcompress as a drop-in prefix for any command. It executes the command and prints a compressed summary.

Git

clawcompress git status
# Modified: 3  Added: 1  Untracked: 2
# modified: src/auth.rs
# modified: src/db.rs

clawcompress git diff
clawcompress git diff HEAD~3
# 3 file(s) changed, +44 -12
# auth.rs  +32 -10
# db.rs    +8  -2

clawcompress git log -n 20
# 20 commit(s)
# abc1234  feat: add OAuth2 support  (2 days ago)  Alice
# def5678  fix: DB connection leak   (3 days ago)  Bob

clawcompress git add .
# ok  (3 file(s) staged)

clawcompress git commit -m "feat: new feature"
# ok  abc1234 feat: new feature

clawcompress git push
# ok  main -> origin/main

clawcompress git pull
# ok  (already up to date)

Supported subcommands: status, diff, log, add, commit, push, pull, fetch, stash, checkout, switch, merge, rebase, branch, reset, tag
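
As an illustration of how a diff summary like the one above could be produced, the sketch below parses the output of `git diff --numstat`, whose lines have the form `<added>\t<deleted>\t<path>`. Whether ClawCompress uses `--numstat` or parses the patch text directly is not stated here, so treat this as one possible approach; `summarize_numstat` is a hypothetical name.

```rust
/// Summarize `git diff --numstat` output. Each line is
/// "<added>\t<deleted>\t<path>"; binary files report "-" for both counts.
pub fn summarize_numstat(numstat: &str) -> String {
    let mut rows: Vec<String> = Vec::new();
    let (mut total_add, mut total_del) = (0u64, 0u64);

    for line in numstat.lines() {
        let mut parts = line.splitn(3, '\t');
        if let (Some(a), Some(d), Some(path)) = (parts.next(), parts.next(), parts.next()) {
            // "-" (binary file) fails to parse and is counted as zero lines.
            let added: u64 = a.parse().unwrap_or(0);
            let deleted: u64 = d.parse().unwrap_or(0);
            total_add += added;
            total_del += deleted;
            rows.push(format!("{}  +{} -{}", path, added, deleted));
        }
        // Lines without three tab-separated fields are skipped.
    }

    let mut out = format!(
        "{} file(s) changed, +{} -{}",
        rows.len(),
        total_add,
        total_del
    );
    for row in rows {
        out.push('\n');
        out.push_str(&row);
    }
    out
}
```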

Cargo / Rust

clawcompress cargo test
clawcompress cargo test --lib -- my_module
# Tests: 128 run | 127 passed | 1 FAILED
# FAILED test_login
#   expected status 200, got 500

clawcompress cargo build --release
clawcompress cargo clippy
# Build FAILED  2 error(s)  1 warning(s)
# error[E0382]: use of moved value `x`
#   --> src/main.rs:42:5
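
The test summary shown above can be derived from the standard `cargo test` output, which prints one `test <name> ... FAILED` line per failing test and a `test result:` line per test binary. The following is a rough sketch under that assumption; `summarize_cargo_test` is a hypothetical name and the real compressor may extract more detail, such as the assertion message.

```rust
/// Reduce `cargo test` output to totals plus the names of failing tests.
pub fn summarize_cargo_test(raw: &str) -> String {
    let (mut passed, mut failed) = (0u32, 0u32);
    let mut failures: Vec<&str> = Vec::new();

    for line in raw.lines().map(str::trim) {
        if let Some(rest) = line.strip_prefix("test result:") {
            // e.g. " FAILED. 127 passed; 1 failed; 0 ignored; ..."
            for field in rest.split(';') {
                // Read each field from the end: "<count> <label>".
                let mut words = field.split_whitespace().rev();
                let label = words.next();
                let count = words.next().and_then(|n| n.parse::<u32>().ok());
                match (label, count) {
                    (Some("passed"), Some(n)) => passed += n,
                    (Some("failed"), Some(n)) => failed += n,
                    _ => {}
                }
            }
        } else if let Some(name) = line
            .strip_prefix("test ")
            .and_then(|rest| rest.strip_suffix(" ... FAILED"))
        {
            failures.push(name);
        }
    }

    let mut out = format!(
        "Tests: {} run | {} passed | {} FAILED",
        passed + failed,
        passed,
        failed
    );
    for name in failures {
        out.push_str("\nFAILED ");
        out.push_str(name);
    }
    out
}
```

Counts are summed across `test result:` lines, so output from several test binaries in one run collapses into a single total.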

Test Runners

clawcompress test pytest -q
clawcompress test jest --ci
clawcompress test go test ./...

Search

clawcompress grep -r "TODO" src/
clawcompress rg "FIXME|HACK" --type rust
# 18 matches in 4 files
# src/auth.rs      12 matches  [line 23: // TODO: refresh token]
# src/db.rs         4 matches  [line 67: // TODO: pooling]
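
A grouped view like the one above can be built from `path:line:text` records, the format `grep -rn` and `rg -n` emit when output is not a terminal. The sketch below is an assumption about the approach rather than the project's code; `summarize_grep` is a hypothetical name, files are ordered by path instead of by match count, and paths that themselves contain `:` are not handled.

```rust
use std::collections::BTreeMap;

/// Group "path:line:text" match lines by file, keeping a count and the
/// first match of each file as a preview.
pub fn summarize_grep(raw: &str) -> String {
    // path -> (match count, preview of the first match)
    let mut by_file: BTreeMap<&str, (usize, String)> = BTreeMap::new();
    let mut total = 0usize;

    for line in raw.lines() {
        let mut parts = line.splitn(3, ':');
        if let (Some(path), Some(lineno), Some(text)) =
            (parts.next(), parts.next(), parts.next())
        {
            // Skip anything whose second field is not a line number.
            if lineno.parse::<u32>().is_err() {
                continue;
            }
            total += 1;
            let entry = by_file
                .entry(path)
                .or_insert_with(|| (0, format!("line {}: {}", lineno, text.trim())));
            entry.0 += 1;
        }
    }

    let mut out = format!("{} match(es) in {} file(s)", total, by_file.len());
    for (path, (count, preview)) in &by_file {
        out.push_str(&format!("\n{}  {} match(es)  [{}]", path, count, preview));
    }
    out
}
```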

Directory Listing

clawcompress ls
clawcompress ls -la src/
# 8 items (3 dirs, 5 files)
#   src/  tests/  benches/
#   Cargo.toml (2.1 KB)  README.md (8.4 KB)

File Reading

clawcompress read src/main.rs
clawcompress cat Cargo.toml
# Small files shown in full.
# Large files (>200 lines) shown as head + tail:
# [842 lines — showing first 100 and last 50]
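
The head + tail behavior described above could look roughly like this. The thresholds are passed as parameters so that the values from this README (200, 100, 50) are not hard-coded; `head_tail` and the exact marker text are hypothetical.

```rust
/// Return `content` unchanged if it has at most `max_lines` lines;
/// otherwise keep the first `head` and last `tail` lines with a marker.
pub fn head_tail(content: &str, max_lines: usize, head: usize, tail: usize) -> String {
    let lines: Vec<&str> = content.lines().collect();
    let total = lines.len();

    // Small files, or windows that would overlap, are shown in full.
    if total <= max_lines || head + tail >= total {
        return content.to_string();
    }

    let mut out: Vec<String> = Vec::with_capacity(head + tail + 2);
    out.push(format!(
        "[{} lines: showing first {} and last {}]",
        total, head, tail
    ));
    out.extend(lines[..head].iter().map(|l| l.to_string()));
    out.push(format!("... {} lines omitted ...", total - head - tail));
    out.extend(lines[total - tail..].iter().map(|l| l.to_string()));
    out.join("\n")
}
```

With the limits stated above, a call would be `head_tail(&text, 200, 100, 50)`.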

Docker

clawcompress docker ps
# 3 container(s)
#   myapp     nginx:latest   Up 2 hours
#   postgres  postgres:15    Up 2 hours
#   redis     redis:7        Exited 1 hour ago

clawcompress docker images
clawcompress docker compose ps

Generic / Passthrough

# Generic compression: truncate + dedup + keep error lines
clawcompress summary make build
clawcompress summary kubectl get pods

# Pass through raw output unchanged, but track token savings
clawcompress proxy some-command --with args

Claude Code Integration

ClawCompress installs a PreToolUse hook in ~/.claude/settings.json that transparently rewrites Bash commands before they execute. Claude sees only the compressed output — no prompt changes required.

Install

clawcompress init --global        # install hook + generate CLAWCOMPRESS.md
clawcompress init --global --hook-only  # hook only, skip CLAWCOMPRESS.md
clawcompress init --show          # check current status
clawcompress init --uninstall     # remove hook

This adds to ~/.claude/settings.json:

{
  "hooks": {
    "PreToolUse": [{
      "matcher": "Bash",
      "hooks": [{"type": "command", "command": "/path/to/clawcompress __hook"}]
    }]
  }
}

Rewrite Rules

Original command             Rewritten to
git status                   clawcompress git status
git diff / git log           clawcompress git diff/log
git push / git commit        clawcompress git push/commit
cargo test / cargo build     clawcompress cargo test/build
grep / rg                    clawcompress grep/rg
ls                           clawcompress ls
cat <file>                   clawcompress read <file>
docker ps / docker images    clawcompress docker ps/images
pytest / jest / vitest       clawcompress test <cmd>

Not rewritten: commands already prefixed with clawcompress, heredocs (<<), and simple builtins like echo, mkdir, cp.
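
The rewrite rules and exclusions above amount to a small pure function from a command string to an optional replacement. The sketch below covers only the rules listed in this section and is not the hook's actual implementation: `rewrite` is a hypothetical name, and compound commands (pipes, `&&`) are not considered.

```rust
/// Return the rewritten command, or `None` if it should run unchanged.
pub fn rewrite(command: &str) -> Option<String> {
    let trimmed = command.trim();

    // Exclusions: already-wrapped commands and heredocs.
    if trimmed.starts_with("clawcompress") || trimmed.contains("<<") {
        return None;
    }

    let mut words = trimmed.split_whitespace();
    let program = words.next()?;
    let sub = words.next();

    match (program, sub) {
        // Commands that are simply prefixed with `clawcompress`.
        ("git", Some("status" | "diff" | "log" | "push" | "commit"))
        | ("cargo", Some("test" | "build"))
        | ("docker", Some("ps" | "images"))
        | ("grep" | "rg" | "ls", _) => Some(format!("clawcompress {}", trimmed)),
        // `cat <file>` becomes `clawcompress read <file>`.
        ("cat", Some(_)) => {
            let rest = trimmed["cat".len()..].trim_start();
            Some(format!("clawcompress read {}", rest))
        }
        // Test runners are wrapped by the `test` subcommand.
        ("pytest" | "jest" | "vitest", _) => Some(format!("clawcompress test {}", trimmed)),
        // Builtins such as echo, mkdir, cp, and anything unknown.
        _ => None,
    }
}
```

Returning `None` for anything unrecognized keeps the hook conservative: a command is only touched when a rule explicitly matches.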

CLAWCOMPRESS.md

init generates CLAWCOMPRESS.md in the current directory to inform the agent that compression is active. Reference it from CLAUDE.md if desired.

OpenClaw Integration

ClawCompress integrates directly with OpenClaw's exec tool via the OpenClawAdapter:

OpenClaw exec
       ↓
capture stdout / stderr
       ↓
ClawCompress pipeline
       ↓
compressed result + metrics
       ↓
LLM context

# Execute and compress (with stats tracking)
clawcompress exec cargo test --stats-file ~/.clawcompress/stats.json

# Compress stdin input
git status | clawcompress compress git status

# Analyze an existing output file
clawcompress analyze output.log --command "cargo build"

Token Savings Analytics

clawcompress gain              # summary
clawcompress gain --graph      # ASCII bar chart by command type
clawcompress gain --json       # export as JSON

Example output:

ClawCompress Savings
====================
Runs:          247
Tokens saved:  184,320  (avg 70.5% reduction)
Raw total:     261,890
Compressed:     77,570

Per-command breakdown:
  git-status   ████████████████████  45 runs
  git-diff     ████████████████      38 runs
  test         ██████████████████    32 runs
  compiler     ████████████          28 runs
  grep         ██████████            22 runs

Stats are automatically saved to ~/.clawcompress/stats.json after every proxy run.

Tee Mode (Failure Recovery)

On a non-zero exit code, ClawCompress saves the full raw output to a local file and appends the path to the compressed output.

Default save location: ~/.local/share/clawcompress/tee/

clawcompress cargo test
# Tests: 3 run | 0 passed | 3 FAILED
# FAILED test_auth  ...
# [full output: ~/.local/share/clawcompress/tee/1234567890_cargo_test.log]

Override the tee directory per run:

clawcompress git push --tee /tmp/push-log/

Compressors

Kind        Triggered by                            What it does
git-status  git status                              Structured file list, strips all hint text
git-diff    git diff                                File list + line stats + first hunk preview
git-log     git log                                 One-line: hash · subject · relative time · author
git-simple  git add/commit/push/…                   Single-line "ok" or error summary
test        pytest / cargo test / go test / jest    Stats + failing tests only
compiler    cargo build / clippy / tsc / gcc        Errors and warnings, noise filtered
logs        docker logs / kubectl logs              Dedup + error extraction + last error block
grep        grep / rg / ag                          Grouped by file + match count + first line preview
json        Any JSON stdout                         Type + top-level keys + array lengths + depth
ls          ls / tree / exa / eza                   Dir/file counts + classified list
file-read   cat / head / tail / read                Full for small files; head+tail for large
docker      docker ps / images / compose            Compact container/image table
generic     Everything else                         Truncate + dedup + keep error lines

All compressors output plain text by default (LLM-friendly). Add --format json for structured JSON output.
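
Several compressors above rely on deduplication. A minimal version, which collapses runs of identical consecutive lines into one line with a repeat count, might look like the sketch below. `dedup_lines` is a hypothetical name, and real log lines usually carry timestamps that would need to be stripped or normalized first, which this sketch does not attempt.

```rust
/// Collapse runs of identical consecutive lines into
/// "<line> (repeated Nx)".
pub fn dedup_lines(raw: &str) -> String {
    fn render(line: &str, count: usize) -> String {
        if count > 1 {
            format!("{} (repeated {}x)", line, count)
        } else {
            line.to_string()
        }
    }

    let mut out: Vec<String> = Vec::new();
    let mut prev: Option<&str> = None;
    let mut count = 0usize;

    for line in raw.lines() {
        if prev == Some(line) {
            count += 1;
            continue;
        }
        // The run ended: flush it and start a new one.
        if let Some(p) = prev {
            out.push(render(p, count));
        }
        prev = Some(line);
        count = 1;
    }
    // Flush the final run.
    if let Some(p) = prev {
        out.push(render(p, count));
    }
    out.join("\n")
}
```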

Architecture

clawcompress
│
├── src/
│   ├── classifier/command_classifier.rs   command-type detection
│   ├── compressors/
│   │   ├── git_status.rs                  git status
│   │   ├── git_diff.rs                    git diff
│   │   ├── git_log.rs                     git log
│   │   ├── git_simple.rs                  git write ops
│   │   ├── test_runner.rs                 pytest / cargo test / jest
│   │   ├── compiler.rs                    rustc / clang / tsc
│   │   ├── logs.rs                        streaming log dedup
│   │   ├── grep.rs                        grep / rg
│   │   ├── json.rs                        JSON structure summary
│   │   ├── ls.rs                          directory listing
│   │   ├── file_reader.rs                 cat / head / tail
│   │   ├── docker.rs                      docker ps / images
│   │   └── generic.rs                     fallback
│   ├── pipeline/compression_pipeline.rs   classify → compress
│   ├── hook/mod.rs                        Claude Code hook install
│   ├── proxy/mod.rs                       execution + tee + output
│   ├── metrics/                           token estimation + stats
│   ├── artifact/raw_output_store.rs       raw output storage
│   ├── integration/openclaw_adapter.rs    OpenClaw adapter
│   ├── lib.rs                             core data structures
│   └── main.rs                            CLI entry point
│
├── tests/integration_test.rs              19 integration tests
└── benches/compression_benchmarks.rs      Criterion benchmarks

Token estimation uses a character-count heuristic rather than a real tokenizer: tokens ≈ characters / 4
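
The estimate above, and the reduction percentage reported elsewhere in this README, can be expressed as two small functions. This is a sketch with hypothetical names; rounding up and counting Unicode scalar values rather than bytes are assumptions, since only the characters / 4 ratio is given.

```rust
/// Estimate tokens as characters / 4, rounded up (rounding is assumed).
pub fn estimate_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Percentage of tokens removed by compression.
pub fn reduction_percent(raw_tokens: usize, compressed_tokens: usize) -> f64 {
    if raw_tokens == 0 {
        return 0.0; // nothing to compress, avoid dividing by zero
    }
    100.0 * (1.0 - compressed_tokens as f64 / raw_tokens as f64)
}
```

For example, 2,400 raw tokens compressed to 240 gives a 90% reduction, which matches the `git log -n 50` row in the benchmark table.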

Benchmarks

Command                     Raw Tokens   Compressed   Reduction
git diff (medium PR)        4,200        310          92%
cargo test (128 tests)      3,800        120          96%
docker logs (1000 lines)    5,100        180          96%
grep -r pattern src/        1,300        90           93%
git log -n 50               2,400        240          90%
ls -la (large dir)          800          60           92%
cargo build (with errors)   1,600        140          91%

Processing time: < 10ms for typical inputs (< 1 MB).

Design Principles

  • No LLM compression — all rules are deterministic algorithms
  • Semantics preserved — exit codes passed through unchanged, error signals retained
  • Fail-safe — falls back to raw output if compression fails
  • Fast — single-threaded < 10ms; rayon parallelism for large inputs
  • Extensible — each command type is an independent compressor module
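
The fail-safe principle above can be captured by a thin wrapper that returns the raw output whenever a compressor reports an error. The sketch below is illustrative only: `compress_or_raw` is a hypothetical name, and using `Result<String, String>` as the compressor's return type is an assumption.

```rust
/// Run `compressor` on `raw`; on any error, fall back to the raw output.
pub fn compress_or_raw<F>(raw: &str, compressor: F) -> String
where
    F: FnOnce(&str) -> Result<String, String>,
{
    match compressor(raw) {
        Ok(summary) => summary,
        // Compression failed: the agent still sees the original output.
        Err(_) => raw.to_string(),
    }
}
```

Because the rules are deterministic, the same input always takes the same branch, which makes fallback cases reproducible in tests.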

Contributing

Contributions welcome:

  • New compressors (kubectl, npm, make, terraform)
  • Compression rule improvements
  • New agent integrations (Cursor, Copilot, Gemini CLI)
  • Benchmark datasets

License

MIT License
