Releases · JuliusBrussee/caveman

@scottconverse

v1.7.0 — Stats receipts, smart installer, cavecrew, MCP-shrink

The biggest release since v1.0. Caveman now has measurement (real token receipts, not vibes), an installer that detects 30+ agents and runs each one's native install, three caveman-mode subagents that emit ~60% fewer handoff tokens than vanilla, an MCP middleware that shrinks tool descriptions in flight, and a one-command rule-file dropper for any repo.

Plus a critical macOS installer fix that was silently breaking detection for every compound-spec provider (cursor / windsurf / continue / 28 others).

🆕 New: `/caveman-stats` — real receipts, not vibes

Caveman Stats
──────────────────────────────────
Session:  ...projects/my-app/abc123.jsonl
Turns:    47
──────────────────────────────────
Output tokens:         3,210
Cache-read tokens:     128,400
──────────────────────────────────
Est. tokens saved:     5,961 (~65%)
Est. saved (USD):      ~$0.089
──────────────────────────────────
Memory compressed:     2 files, ~1,920 tokens saved per session start

Reads the active Claude Code session JSONL directly — no model-side guessing. Pricing matches by model-id prefix (claude-sonnet-4-* → $15/M, etc.) so it stays correct across point releases. Lifetime aggregation via --all, time window via --since 7d / --since 24h, tweetable one-liner via --share.

Statusline savings badge — on by default. After your first /caveman-stats run the bar shows [CAVEMAN] ⛏ 12.4k (lifetime tokens saved) and updates every run. Opt out with CAVEMAN_STATUSLINE_SAVINGS=0. Suffix file is absent until stats has run, so fresh installs render no fake number.

29 tests cover: pricing prefix-match, lifetime aggregation latest-per-session semantics, --since parsing, malformed-duration rejection, compressed-memory pair detection, statusline default-on / opt-out / fresh-install behaviors, ANSI-escape stripping in the suffix file, and symlink-safe history append.

🆕 New: `caveman-shrink` — MCP middleware (published to npm)

caveman-shrink@0.1.0 is a stdio proxy that wraps any MCP server, intercepts tools/list / prompts/list / resources/list responses, and compresses the description fields. Same boundaries as the parent skill — code, URLs, paths, and identifiers stay byte-for-byte identical.

{
  "mcpServers": {
    "fs-shrunk": {
      "command": "npx",
      "args": ["caveman-shrink", "npx", "@modelcontextprotocol/server-filesystem", "/path/to/dir"]
    }
  }
}

V1 deliberately does not touch tool-call response bodies or request payloads — only prose-y description fields. Configurable extra fields via CAVEMAN_SHRINK_FIELDS, debug deltas via CAVEMAN_SHRINK_DEBUG=1. Auto-registered by install.sh (use --minimal to skip). 12 tests cover: article/filler/hedge stripping, fenced code preservation, inline code preservation, URL preservation, filesystem-path preservation, identifier preservation, real MCP-style descriptions, nested-object walking.

🆕 New: `cavecrew` — three caveman subagents for Claude Code

Subagent tool-output gets injected back into the main thread context. Vanilla Explore / reviewer agents return verbose prose that eats main-context fast. Cavecrew agents emit caveman-ultra by default, ~60% fewer handoff tokens.

Subagent	Model	Job	Output
`cavecrew-investigator`	haiku	Read-only locator. "Where is X defined?" / "what calls Y?" / "map this dir"	`path:line — \`symbol` — note`, grouped headers when 3+
`cavecrew-builder`	sonnet	1-2 file surgical edits. Refuses 3+ file scope.	`path:line-range — change ≤10 words.` + `verified: re-read OK`
`cavecrew-reviewer`	haiku	Diff/branch/file review. One line per finding.	`path:line: 🔴 bug: problem. fix.` + severity totals

Builder has no Bash tool — can't shell out, can't push, can't delete. Refusals are terminal one-liners (too-big. split: ..., needs-confirm. op: ..., regressed. revert path:line).

🆕 New: smart multi-agent installer

# macOS / Linux / WSL
curl -fsSL https://raw.githubusercontent.com/JuliusBrussee/caveman/main/install.sh | bash

# Windows
irm https://raw.githubusercontent.com/JuliusBrussee/caveman/main/install.ps1 | iex

Detects which AI coding agents are on your machine and runs each one's native install (plugin / extension / skill / rule file). Skips what you don't have. Safe to re-run.

Coverage: 33 agents. Native: Claude Code, Gemini CLI, Codex. IDE/VS Code-family: Cursor, Windsurf, Cline, Copilot, Continue, Kilo, Roo, Augment. CLI: Aider Desk, Amp, Bob, Crush, Devin, Droid, ForgeCode, Goose, iFlow, JetBrains Junie, Kiro CLI, Mistral Vibe, OpenHands, opencode, Qwen Code, Qoder, Rovo Dev, Tabnine, Trae, Warp, Replit Agent, Antigravity. Plus universal AGENTS.md / IDE rule files for everything else via npx skills.

Flag	What
`--all`	Plugin + hooks + statusline + MCP shrink + per-repo rule files. The full ride.
`--minimal`	Plugin/extension only.
`--with-hooks`	Standalone hooks + statusline + stats badge. On by default.
`--with-mcp-shrink`	Register caveman-shrink MCP proxy. On by default.
`--with-init`	Drop per-repo rule files for Cursor/Windsurf/Cline/Copilot/AGENTS.md.
`--list`	Print full agent matrix and exit.
`--only <agent>`	One target only (repeatable).
`--dry-run`	Preview, write nothing.

install.sh --help for full reference.

🆕 New: `caveman-init` (cavepack) — drop rule files into any repo

node tools/caveman-init.js              # writes rule files for all targets
node tools/caveman-init.js --dry-run    # preview
node tools/caveman-init.js --only cline # one target only

Idempotent — uses a sentinel string to detect prior installs and skip them. Existing rule files are preserved unless --force is passed. Appendable targets (AGENTS.md, .github/copilot-instructions.md) get the caveman block appended below your existing content; replace-mode targets (.cursor/rules/caveman.mdc, .windsurf/rules/caveman.md, .clinerules/caveman.md) get fresh files. 8 tests cover: greenfield create, idempotent skip, append-vs-replace semantics, sentinel detection, --force, --dry-run, --only.

🛠 Critical fixes

macOS installer detection was silently broken. detect_match() used awk -v RS='||' to split compound detection specs. BSD awk on macOS rejects || as a regex (illegal primary in regular expression), the awk silently failed, the fallback re-ran the whole spec as a single clause, and every compound-spec provider (cursor, windsurf, continue, kilo, roo, augment, aider-desk, bob, crush, devin, droid, forgecode, goose, iflow, kiro, mistral, openhands, opencode, qwen, rovodev, tabnine, trae, warp, replit) was undetectable on macOS. Replaced with bash parameter expansion (${rest%%||*} / ${rest#*||}) — works on bash 3.2+, no awk dependency. Verified with reproducer: compound match, fallback match, all-miss compound, single-clause all behave correctly.
--with-mcp-shrink registered a config that 404'd on first spawn. Pre-publish, claude mcp add caveman-shrink -- npx -y caveman-shrink would store an entry that failed every time Claude tried to spawn it. Now the installer probes npm view caveman-shrink first — package missing or registry unreachable degrades to a clean manual-config skip with a print-the-snippet fallback. Default-on restored after caveman-shrink@0.1.0 shipped to npm.
Windows install.ps1 syntax error from node -e. Powershell tokenizer truncated the embedded JS at double-quotes, breaking standalone hook install. Fix writes the script to a temp file, runs node script.js, and removes it. (Thanks @scottconverse — #250, fixes #249/#199/#72)
/caveman argument whitelist + symlinked-parent ~/.claude support. ~/.claude symlinked to another drive (legitimate pattern) was previously refused as a "symlink attack." Now: lstat parent, resolve, verify uid match (Unix) or home-dir prefix (Windows), then allow. The flag file itself still must never be a symlink — that's the actual clobber vector. Plus argument validation: only known mode tokens reach the flag write.
caveman-compress reliability. UTF-8 stdout enforced (Windows console no longer corrupts compressed output), empty/identical-input guards prevent silently writing zero-byte files, inline-code validation now matches the fenced-code rule, frontmatter cleanup preserves YAML.

🛡 Security

safeWriteFlag() extended to all flag-file writes. Adds: safeWriteFlag for the active-mode flag, appendFlag (with O_APPEND) for the lifetime stats log, and readFlag (with size cap + whitelist) for the per-turn reinforcement read. The whitelist (VALID_MODES) is the load-bearing part — without it, a flag pointing at ~/.ssh/id_rsa could exfiltrate via stdout.
Statusline scripts no longer trust the flag file blindly. Both caveman-statusline.sh and .ps1 now refuse symlinked flag/suffix files and strip control bytes (tr -d '\000-\037') so a hostile flag can't render terminal escape sequences in your status bar.

📊 Skill changes

Ultra-mode code-symbol guard. Previously the rule "abbreviate everything" was over-aggressive — symbols like useMemo or getUserById would get truncated. Now: ultra abbreviates prose, never code/symbols/error strings.
Auto-clarity expanded. Drop caveman for: security warnings, irreversible action confirmations, multi-step sequences where fragment ambiguity risks misread, compression itself creating ambiguity, user repeating a question.
Typst + LaTeX added to the protected-content list — math/markup blocks pass through untouched.

📚 Docs

README cleanup: install table now shows the smart installer first, manual install second, "What You Get" matrix collapsed to one table with footnotes, no more 4 nested <details> walls. V...

@malakhov-dmitrii

v1.6.0 — Hardening release

11 community PRs merged plus a hardened security model for the flag file. This release fixes two real crash bugs that were silently breaking installs, two local-file-clobber vulnerabilities, and several portability gaps.

Critical fixes

Hooks no longer crash when an ancestor package.json declares "type": "module". Before this fix, any user with ~/.claude/package.json set to ESM (common with several Claude Code plugins) hit ReferenceError: require is not defined in ES module scope on every session. SessionStart and UserPromptSubmit hooks both silently failed, the flag file was never written, and the statusline badge disappeared. Fixed by shipping hooks/package.json with {"type": "commonjs"} so the hooks pin themselves to CJS regardless of ancestor settings. (Thanks @malakhov-dmitrii — #174)
${CLAUDE_PLUGIN_ROOT} is now quoted in plugin.json. Plugin install paths containing spaces (e.g. ~/Library/Application Support/... on macOS, C:\Users\Some Name\... on Windows) used to break the node invocation entirely. (Thanks @MukundaKatta — #171)
Codex hook config shape updated to the current spec. The old shape silently no-op'd on Codex with hooks enabled. New shape uses the matcher + nested hooks array. Repo also now ships .codex/config.toml with codex_hooks = true so auto-activation actually works inside this repo on macOS/Linux. (Thanks @davidbits — #148)

Security fixes

safeWriteFlag() helper hardens all flag file writes against symlink attacks. The flag file at ~/.claude/.caveman-active is at a predictable path. Before this fix, a local attacker (or any sibling process running as the same user but with a smaller blast radius — sandboxed extension, container with a ~/.claude bind mount, etc.) could replace the flag with a symlink to any user-writable file (~/.ssh/authorized_keys, ~/.bashrc) and the next hook write would clobber the symlink target. The new helper, defined once in hooks/caveman-config.js and used by every write site:
- Refuses if the flag's parent directory is itself a symlink
- Refuses if the existing flag target is a symlink
- Opens with O_NOFOLLOW where supported (Linux/macOS)
- Writes atomically via temp file + rename()
- Sets 0600 permissions on creation
Verified against a real symlink-redirect attack: the target file is left untouched. (Thanks @tuanaiseo — #70 + #71, consolidated into one shared helper)

Quality of life

CLAUDE_CONFIG_DIR env var is now respected across all hooks, statusline scripts, and install/uninstall scripts. Users with non-default Claude Code config locations no longer have to symlink things into ~/.claude/. (Thanks @BrendanIzu — #146)
Natural-language activation actually works now. The README has long claimed talk like caveman and caveman mode activate caveman, and stop caveman / normal mode deactivate it. The mode tracker hook only ever matched /caveman slash commands, so the natural-language phrasings were a lie. Now they're not. (Thanks @hummat — #120)
Per-turn caveman reinforcement. SessionStart injects the full ruleset once, but other plugins inject competing style instructions every turn. The mode tracker now emits a small hookSpecificOutput reminder on every UserPromptSubmit so the model keeps caveman style even after context drift. Skipped for independent modes (commit, review, compress) which have their own behavior. (Thanks @hummat — #119)
skills/compress/scripts/ is now real Python files instead of broken Git symlinks. Previously symlinks inside the skills directory broke on Windows and on platforms that don't follow them. The compress sub-skill now ships as proper modules: compress.py, detect.py, validate.py, benchmark.py, plus cli.py and __main__.py. (Thanks @gladkia — #153)
README link fixes — several broken anchors corrected. (Thanks @nervana21 — #169)

Internal

Consolidated all flag file writes through safeWriteFlag(). Direct fs.writeFileSync on predictable user-owned paths is now an anti-pattern documented in CLAUDE.md.
CLAUDE.md updated to document the new helper, CLAUDE_CONFIG_DIR support, the hooks/package.json CJS marker, natural-language activation, and per-turn reinforcement.

Known gaps (tracked for v1.6.1)

The flag file read path in the new per-turn reinforcement still trusts whatever's in the file, so a sibling process that can write the flag could inject arbitrary content into the model's context as a "mode name." Real exploit precondition is narrow (attacker needs write access to ~/.claude/ but no read on other user files — i.e. a sandboxed sibling process), but the fix is cheap and will land in a follow-up patch: lstat + length cap + VALID_MODES whitelist on the read.

The statusline scripts (caveman-statusline.sh/.ps1) read the flag without validation either, which on a hostile flag could render terminal escape sequences. Same fix pattern, same patch.

Upgrade

Plugin users:

claude plugin update caveman

Standalone hook users:

bash hooks/install.sh --force

Other agents (Cursor/Windsurf/Cline/Copilot/etc.):

npx skills add JuliusBrussee/caveman

Runtime SKILL.md loading

Activation hook now reads skills/caveman/SKILL.md at runtime instead of hardcoding rules inline. Edits to the source of truth propagate automatically — no duplication to go stale.

Plugin installs resolve SKILL.md relative to the plugin root
Standalone installs (hooks only, no skills dir) fall back to a built-in minimal ruleset
commit/review/compress modes skip SKILL.md machinery entirely — they have their own independent skill files

Docs

Consolidate install details — Cursor, Windsurf, Cline, and Copilot install instructions merged into one block (#106)
Fix bash statusline badge example — use ANSI-C quoting so escape sequences render correctly (#57)

Bug fixes

Fix /caveman with off default writing "off" to flag file — mode tracker now guards against off and deletes the flag file, matching the activate hook behavior

Configurable default mode

Default mode is now configurable instead of always starting at full. Resolution order:

CAVEMAN_DEFAULT_MODE environment variable (highest priority)
Config file at ~/.config/caveman/config.json (XDG-compliant, cross-platform)
'full' (unchanged default — fully backward compatible)

export CAVEMAN_DEFAULT_MODE=ultra

{ "defaultMode": "lite" }

All install/uninstall scripts (bash + PowerShell) updated. Invalid modes silently fall through to default.

Closes #86

"off" mode — disable auto-activation

Set CAVEMAN_DEFAULT_MODE=off or {"defaultMode": "off"} in config to skip session-start activation entirely. No flag file written, no rules injected. User can still manually activate with /caveman during the session.

`/caveman-help` quick-reference card

New skill — type /caveman-help to display a terse reference card covering all modes, skills, triggers, configuration, and deactivation. One-shot display, no mode change.

Works in: Claude Code, Gemini CLI, Cursor, Windsurf, Cline, Copilot, and all agents via npx skills.

Bug fixes

Fix /caveman with off default writing "off" to flag file — mode tracker now guards against off and deletes the flag file, matching the activate hook behavior
Add test coverage for off mode — both activate hook and mode tracker paths now tested
Fix swapped step comments in uninstall.sh — steps 3 and 4 were numbered out of order
Update stale CLAUDE.md — /caveman default now references configurable default instead of hardcoded full

Fix: Codex plugin compress skill broken on Windows

The compress skill in the Codex plugin shipped as symlinks (plugins/caveman/skills/compress/SKILL.md and scripts). On Windows and any git setup with core.symlinks=false, these checked out as plain text files containing the target path — causing Codex to reject the skill with "missing YAML frontmatter."

Changes

Replaced symlinks with real file copies (scripts identical to source, SKILL.md adapted for plugin context)
Added CI sync step so future edits to caveman-compress/ auto-propagate to the plugin copy

Closes #92

Feature: Configurable default caveman mode

Default mode is now configurable instead of always starting at full. Resolution order:

CAVEMAN_DEFAULT_MODE environment variable (highest priority)
Config file at ~/.config/caveman/config.json (XDG-compliant, cross-platform)
'full' (unchanged default — fully backward compatible)

Example — env var:

export CAVEMAN_DEFAULT_MODE=ultra

Example — config file:

{ "defaultMode": "lite" }

All install/uninstall scripts (bash + PowerShell) updated. Invalid modes silently fall through to default.

Closes #86

Feature: `/caveman-help` quick-reference card

New skill — type /caveman-help to display a terse reference card covering all modes, skills, triggers, configuration, and deactivation. One-shot display, no mode change.

Works in: Claude Code, Gemini CLI, Cursor, Windsurf, Cline, Copilot, and all agents via npx skills.

Highlights

Add Claude Code statusline badge support, including shell and PowerShell badge scripts.
Add standalone Claude hook install/uninstall flows for macOS/Linux and Windows, with safer settings merge behavior and better reinstall checks.
Add always-on rule/instruction files plus sync workflow for Cursor, Windsurf, Cline, Copilot, Codex, Gemini, and repo agent docs.
Add local verification coverage for hook install/uninstall flows, synced artifacts, manifests, syntax, and caveman-compress fixtures.
Improve caveman-compress validation and wrapper stripping, plus benchmark/token-count handling updates.
Expand docs across README, hooks docs, and repo guidance, including clearer install behavior and statusline setup expectations.

Full Changelog: v1.3.5...v1.4.0

What's new

Plugin-bundled hooks — SessionStart and UserPromptSubmit hooks now ship in plugin.json. Install as a plugin and they auto-activate — no install.sh needed.
Mode-aware statusline badge — flag file at ~/.claude/.caveman-active now stores the active mode (full, lite, ultra, wenyan, commit, review). Statusline scripts can show [CAVEMAN:ULTRA] etc.
UserPromptSubmit mode tracker — detects /caveman ultra, /caveman-commit, etc. and updates the flag file in real time.
Best practices alignment — hooks use explicit timeout: 5, statusMessage, and follow official Claude Code hook docs.
Cleaner README — install section restructured: plugin install (recommended, includes hooks) vs npx skills (skills only).

Install

claude plugin marketplace add JuliusBrussee/caveman
claude plugin install caveman@caveman

Or for any agent: npx skills add JuliusBrussee/caveman

What's New

📜 文言文 (Wenyan) Mode

Classical Chinese literary compression — same technical accuracy, different era, fewer tokens. Three levels: wenyan-lite, wenyan-full, wenyan-ultra.

English:  "Your component re-renders because you create a new object reference each render."
Caveman:  "New object ref each render. Wrap in useMemo."
Wenyan:   "物出新參照，致重繪。useMemo Wrap之。"

🛠️ New Skills

caveman-commit — terse commit messages in Conventional Commits format. /caveman-commit
caveman-review — one-line PR review comments: location, problem, fix. /caveman-review

📊 Eval Harness

Three-arm eval methodology that measures real token compression honestly — skill vs terse control, not skill vs verbose baseline. Run it yourself:

uv run python evals/llm_run.py
uv run --with tiktoken python evals/measure.py

🔧 Fixes & Improvements

caveman-compress: Anthropic SDK direct call support (bypasses subprocess when ANTHROPIC_API_KEY is set)
caveman-compress: 500KB file size limit, path resolution, configurable model via CAVEMAN_MODEL env var
caveman-compress: SECURITY.md addressing Snyk false positive
caveman-compress: Backup overwrite protection — won't silently destroy existing .original.md
caveman-compress: Fixed PATH_REGEX that was matching every English word as a file path
Codex plugin: Windows install support, SVG icons, OpenAI agent config
Evals: Fixed inverted sign in fmt_pct output

🧹 Housekeeping

Removed language-specific skill translations (caveman-cn, caveman-es, caveman-pt) — Claude already responds in the user's language
Closed Abathur persona PRs — out of scope
README restructured: hook → pitch → install → skills → benchmarks → evals

What's new

Intensity levels — choose lite, full (default), or ultra caveman compression. Fine-grained control over how aggressive the token savings are.
Auto-Clarity — caveman mode automatically drops compression for security warnings and destructive operations. Safety-critical output stays crystal clear.
caveman-compress skill — new standalone skill that compresses natural language memory files (CLAUDE.md, todos, preferences) into caveman format. Preserves all technical substance while cutting tokens.
GitHub Action for SKILL.md sync — auto-syncs SKILL.md copies on push to main, keeping all skill entry points consistent.
GitHub Pages landing site — project now has a proper landing page.
SKILL.md prompt compression — ~40% fewer tokens in the skill prompt while keeping all behavioral anchors.

Upgrade

Update your skill installation to get intensity levels and auto-clarity automatically.

Full Changelog: v1.1.0...v1.2.0

What new

Reproducible benchmark system — benchmarks/run.py call Claude API, measure real output token counts normal vs caveman, auto-update README table. No more fake numbers.
Real benchmark data — 10 coding prompts, actual API measurements. Average 65% token savings (range 22%–87%).
Codex plugin support — caveman now work in OpenAI Codex too.
Contributing guide + issue templates for bug reports and feature requests.

Run benchmarks yourself

cd benchmarks
pip install -r requirements.txt
python run.py --dry-run          # preview, no API calls
python run.py --update-readme    # run + update README table

Full Changelog: v1.0.0...v1.1.0

Uh oh!

Releases: JuliusBrussee/caveman

v1.7.0 — Stats receipts, smart installer, cavecrew, MCP-shrink

v1.7.0 — Stats receipts, smart installer, cavecrew, MCP-shrink

🆕 New: /caveman-stats — real receipts, not vibes

🆕 New: caveman-shrink — MCP middleware (published to npm)

🆕 New: cavecrew — three caveman subagents for Claude Code

🆕 New: smart multi-agent installer

🆕 New: caveman-init (cavepack) — drop rule files into any repo

🛠 Critical fixes

🛡 Security

📊 Skill changes

📚 Docs

Contributors

Uh oh!

v1.6.0 — Hardening release: hook crash fixes + symlink-safe flag writes

v1.6.0 — Hardening release

Critical fixes

Security fixes

Quality of life

Internal

Known gaps (tracked for v1.6.1)

Upgrade

Contributors

Uh oh!

v1.5.1

Runtime SKILL.md loading

Docs

Bug fixes

Uh oh!

v1.5.0

Configurable default mode

"off" mode — disable auto-activation

/caveman-help quick-reference card

Bug fixes

Uh oh!

v1.4.1

Fix: Codex plugin compress skill broken on Windows

Changes

Feature: Configurable default caveman mode

Feature: /caveman-help quick-reference card

Uh oh!

v1.4.0

Highlights

Uh oh!

v1.3.5 — Plugin hooks & mode-aware statusline

What's new

Install

Uh oh!

v1.3.0 — 文言文, Skills, Evals & Community Fixes

What's New

📜 文言文 (Wenyan) Mode

🛠️ New Skills

📊 Eval Harness

🔧 Fixes & Improvements

🧹 Housekeeping

Uh oh!

v1.2.0 — Intensity Levels, Auto-Clarity & Caveman-Compress

What's new

Upgrade

Uh oh!

v1.1.0 — Real Benchmarks

What new

Run benchmarks yourself

Uh oh!

🆕 New: `/caveman-stats` — real receipts, not vibes

🆕 New: `caveman-shrink` — MCP middleware (published to npm)

🆕 New: `cavecrew` — three caveman subagents for Claude Code

🆕 New: `caveman-init` (cavepack) — drop rule files into any repo

`/caveman-help` quick-reference card

Feature: `/caveman-help` quick-reference card