bench: v0.2.5815 cross-corpus results — codedb vs codegraph vs lean-ctx by justrach · Pull Request #483 · justrach/codedb

justrach · 2026-05-21T02:59:31Z

Summary

Adds 4 raw run reports (+ aggregated log) from the v0.2.5815 search-shootout against three corpora — react, rust-lang/regex, pallets/flask — testing the released binary at /opt/homebrew/bin/codedb (SHA 51164cf9…e687d25f).

Headline:

codedb cold build (react, 6,620 files): 12.1 s → 1.18 s (~10× faster than prior RESULTS.md baseline)
xyzzy_react negative query: 113 ms → 0.07 ms (~1,600×)
flushPassiveEffects rare-camelcase: 167 ms → 0.15 ms (~1,100×)
codedb wins 13/15 warm queries vs codegraph on react

The two former slow-path outliers from the prior eval are gone on this binary.
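The rounded multipliers in the headline follow from dividing each old timing by the new one. A minimal sketch of that arithmetic, using only the figures quoted above; the `speedup` helper and the `HEADLINE` list are illustrative names, not part of the benchmark harness:

```python
def speedup(before: float, after: float) -> float:
    """Return how many times faster `after` is than `before` (same unit)."""
    if after <= 0:
        raise ValueError("after must be positive")
    return before / after


# (label, before, after), copied from the headline list in this PR.
HEADLINE = [
    ("cold build, react (s)", 12.1, 1.18),
    ("xyzzy_react negative query (ms)", 113.0, 0.07),
    ("flushPassiveEffects rare-camelcase (ms)", 167.0, 0.15),
]

for label, before, after in HEADLINE:
    # Print the unrounded ratio so it can be compared with the ~10x,
    # ~1,600x and ~1,100x figures quoted in the summary.
    print(f"{label}: {speedup(before, after):,.1f}x")
```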

Sonnet 4.6 agentic eval (same React-reconciler task across three backends):

| backend | calls | wall s | found `getNextLanes`? |
|---|---|---|---|
| codegraph (`query`) | 4 | 29 | ✅ |
| lean-ctx (`grep`+`read`) | 9 | 51 | ✅ |
| codedb (`search`+`find`) | 22 | 114 | ✅, but had to reconstruct via 20+ searches because the codedb CLI lacks a `read` subcommand |
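One way to read the table is to divide wall time by call count. The sketch below does only that, with the three rows copied from the table. `seconds_per_call` is an illustrative helper, and the result is a rough indicator at best, since wall time presumably includes model time as well as tool latency:

```python
# backend -> (calls, wall seconds), copied from the agentic eval table.
RUNS = {
    "codegraph": (4, 29),
    "lean-ctx": (9, 51),
    "codedb": (22, 114),
}


def seconds_per_call(calls: int, wall_s: float) -> float:
    """Average wall-clock seconds spent per tool call."""
    return wall_s / calls


for backend, (calls, wall_s) in RUNS.items():
    print(f"{backend}: {seconds_per_call(calls, wall_s):.2f} s/call over {calls} calls")
```

Under this reading the codedb run is not slower per call; its total is driven by the number of calls, which is consistent with the note about the missing `read` subcommand.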

Full breakdown lives in the v0.2.5815 release notes:
https://github.com/justrach/codedb/releases/tag/v0.2.5815

Files

benchmarks/search-shootout/results/2026-05-21/react-run1.md — primary run, 5 backends
benchmarks/search-shootout/results/2026-05-21/react-run2.md — stability check
benchmarks/search-shootout/results/2026-05-21/regex-run1.md
benchmarks/search-shootout/results/2026-05-21/flask-run1.md
benchmarks/search-shootout/results/2026-05-21/run.log — combined stdout

Follow-up (not in this PR)

The codegraph numbers in these reports were collected via a sibling harness; the codegraph backend isn't wired into shootout.py's multi-session launcher yet. A follow-up PR will add the CodegraphMCP class and the --skip-codegraph / --codegraph-bin args to the main script. The data files in this PR are self-contained.

Test plan

codedb 0.2.5815 binary verified via codedb --version + shasum -a 256
All 4 reports regenerated cleanly on a clean machine (snapshot wiped each run)
Numbers spot-checked against release-notes claims (cold build, outlier speedups)
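The checksum step in the test plan can be reproduced without `shasum`. A minimal sketch, assuming only Python's standard `hashlib`: because the SHA in this PR is elided in the middle, the hypothetical helper `matches_elided` compares only the visible prefix and suffix, which is weaker than a full-digest comparison.

```python
import hashlib
from pathlib import Path


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_elided(digest: str, prefix: str, suffix: str) -> bool:
    """Check a full hex digest against an elided 'prefix…suffix' form."""
    d = digest.lower()
    return len(d) == 64 and d.startswith(prefix.lower()) and d.endswith(suffix.lower())


# Path and elided SHA as quoted in the PR summary; skipped if the binary is absent.
binary = Path("/opt/homebrew/bin/codedb")
if binary.exists():
    print(matches_elided(sha256_of(str(binary)), "51164cf9", "e687d25f"))
```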

🤖 Generated with Claude Code

…h vs lean-ctx (2026-05-21)

Per-corpus search-latency runs against the released v0.2.5815 binary (/opt/homebrew/bin/codedb, SHA 51164cf9…e687d25f) on three corpora:
- react (6,620 files), runs 1 and 2 for stability
- regex (285 files)
- flask (127 files)

Backends compared (default tools):
- codedb_search (MCP)
- codegraph_search (codegraph 0.7.10 MCP, `codegraph serve --mcp`)
- lean-ctx grep (lean-ctx 3.6.9 CLI, per-call spawn)
- SQLite FTS5 trigram + unicode61 (inverted-index baselines)

Two outliers from prior RESULTS.md are gone on this binary, and the cold build is faster:
- xyzzy_react_does_not_exist (negative) 113 ms → 0.07 ms (~1,600×)
- flushPassiveEffects (rare camelcase) 167 ms → 0.15 ms (~1,100×)
- cold build (react, 6,620 files) 12.1 s → 1.18 s (~10×)

codedb wins 13/15 react warm queries vs codegraph. codegraph wins on the two highest-frequency stress queries (`function`, `set`) where codedb falls back to a slower path on >5k hits.

Headline numbers and the per-task Sonnet 4.6 agentic eval are now in the v0.2.5815 release notes: https://github.com/justrach/codedb/releases/tag/v0.2.5815

Follow-up: wire codegraph backend into shootout.py multi-session launcher (currently runs only codedb / fts5 / lean-ctx; codegraph results in this commit were collected via a sibling harness).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-05-21T05:21:44Z

Benchmark Regression Report

Thresholds: 10.00% and 50,000 ns absolute delta

NOISE means the percentage threshold was exceeded, but the absolute delta was too small to fail CI.

| Tool | Base (ns) | Head (ns) | Delta | Abs Delta (ns) | Status |
|---|---|---|---|---|---|
| `codedb_bundle` | 569442 | 559162 | -1.81% | -10280 | OK |
| `codedb_changes` | 58844 | 59967 | +1.91% | +1123 | OK |
| `codedb_deps` | 11284 | 11544 | +2.30% | +260 | OK |
| `codedb_edit` | 7108 | 6822 | -4.02% | -286 | OK |
| `codedb_find` | 67657 | 65549 | -3.12% | -2108 | OK |
| `codedb_hot` | 111318 | 116402 | +4.57% | +5084 | OK |
| `codedb_outline` | 340446 | 340942 | +0.15% | +496 | OK |
| `codedb_read` | 110378 | 107755 | -2.38% | -2623 | OK |
| `codedb_search` | 160158 | 163086 | +1.83% | +2928 | OK |
| `codedb_snapshot` | 306256 | 304493 | -0.58% | -1763 | OK |
| `codedb_status` | 14712 | 18389 | +24.99% | +3677 | NOISE |
| `codedb_symbol` | 64101 | 64695 | +0.93% | +594 | OK |
| `codedb_tree` | 85536 | 87783 | +2.63% | +2247 | OK |
| `codedb_word` | 94670 | 93577 | -1.15% | -1093 | OK |
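The two-threshold rule described in the report can be sketched as follows. This is an illustration under stated assumptions, not the CI script itself: it assumes only slowdowns (positive deltas) count, that a row exceeding the absolute threshold alone stays `OK`, and that the failing status is called `FAIL`, a name the report does not show.

```python
PCT_THRESHOLD = 10.0       # percent, from the report header
ABS_THRESHOLD_NS = 50_000  # nanoseconds, from the report header


def classify(base_ns: int, head_ns: int) -> str:
    """Classify one benchmark row as OK, NOISE, or FAIL (hypothetical name)."""
    delta_ns = head_ns - base_ns
    pct = 100.0 * delta_ns / base_ns
    over_pct = pct > PCT_THRESHOLD         # assumption: regressions only
    over_abs = delta_ns > ABS_THRESHOLD_NS
    if over_pct and over_abs:
        return "FAIL"
    if over_pct:
        # Percentage exceeded, but the absolute delta is too small to fail CI.
        return "NOISE"
    return "OK"


# Two rows from the table above: the NOISE row and a plain OK row.
print(classify(14712, 18389))    # codedb_status
print(classify(569442, 559162))  # codedb_bundle
```

For `codedb_status`, the delta is 3,677 ns on a 14,712 ns base, which is about 25% but far below 50,000 ns, so the row is reported as NOISE rather than a failure.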

… security

Bumps semver to 0.2.5817. Bundles the v0.2.5816 perf+security release (PRs #484, #485, #483, #486, #487) with the experiment/reader-md feature that auto-prepends a hash-verified codebase map to codedb_context.

Highlights vs v0.2.5815:

Performance (PR #485, deterministic microbenchmarks):
- Suspense regex p50: 2.82 ms → 0.18 ms (15.6× faster)
- useState regex p99: 16.57 ms → 2.04 ms (8.1× p99 reduction)

CLI surface (PR #484):
+ codedb read <path> [-L FROM-TO] [--compact]
+ path-safety + sensitive-file guards
+ project-root anchoring (uses configured root, not cwd)

codedb_context (NEW in 0.2.5817):
+ auto-prepends .codedb/reader.md when source_hash matches
+ inline ~6 lines of body for ≤3 symbol_definitions
+ new "## Callers" section pre-surfaces execution sites
+ skip-on-short-task gate (≤80 chars) to avoid overhead on narrow lookups

reader.md security (this branch):
+ path-traversal blocked (no absolute / .. in source_files)
+ source_files capped at 20 (DoS guard)
+ loc_actual capped at 240 (body bloat guard)
+ golden blake2b roundtrip test

Eval (Sonnet 4.6, n=3 per task, vs v0.2.5815 main lineage):
- T1 flask median: 5 → 4 (-1)
- T2 regex median: 13 → 7 (-6)
- T3 react median: 13 → 10 (-3)

All 9 runs across the matrix returned correct answers. Branch wins on median, mode, and best-case for every task.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
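The reader.md guards listed in this commit message can be illustrated with a short sketch. It is written in Python for consistency with the benchmark harness and is not codedb's implementation. All function names are hypothetical, and it assumes that "capped" means over-limit values are reported as problems (they could equally be truncated) and that paths are POSIX-style.

```python
from pathlib import PurePosixPath

MAX_SOURCE_FILES = 20   # "source_files capped at 20"
MAX_LOC_ACTUAL = 240    # "loc_actual capped at 240"
SHORT_TASK_CHARS = 80   # "skip-on-short-task gate (≤80 chars)"


def is_safe_relative(path: str) -> bool:
    """True if the path is non-empty, relative, and has no '..' component."""
    p = PurePosixPath(path)
    return bool(path) and not p.is_absolute() and ".." not in p.parts


def manifest_problems(source_files: list[str], loc_actual: int) -> list[str]:
    """Collect guard violations for a reader.md-style manifest (assumed shape)."""
    problems = []
    if len(source_files) > MAX_SOURCE_FILES:
        problems.append(f"source_files has {len(source_files)} entries (limit {MAX_SOURCE_FILES})")
    for path in source_files:
        if not is_safe_relative(path):
            problems.append(f"unsafe path: {path!r}")
    if loc_actual > MAX_LOC_ACTUAL:
        problems.append(f"loc_actual is {loc_actual} (limit {MAX_LOC_ACTUAL})")
    return problems


def skip_reader_for(task: str) -> bool:
    """Short tasks skip the auto-prepend to avoid overhead on narrow lookups."""
    return len(task) <= SHORT_TASK_CHARS


print(manifest_problems(["src/index.js", "../secrets.env"], 300))
```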

Wires the codegraph 0.7.10 backend into the single-session + multi-session launcher alongside codedb / fts5_tri / fts5_uni / lean-ctx. Uses `codegraph serve --mcp` as a long-lived stdio child and invokes `codegraph_search` as the default symbol-lookup tool, apples-to-apples with codedb_search.

New CLI flags:
- --codegraph-bin <path> (default: $(which codegraph))
- --skip-codegraph (skip the backend entirely)
- --clean-codegraph (wipe matching .codegraph/ before indexing)

Cold-index helper `codegraph_cold_index` invokes `codegraph init` then `codegraph index` and measures wall-clock + .codegraph/ on-disk size.

Smoke-tested codegraph-only on flask:
- cold build: 0.57 s, ~3.7 MB
- warm queries: 0.2–2 ms p50 (matches the bench numbers from the v0.2.5815 cross-corpus run committed in PR justrach#483)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
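The flags and the cold-index measurement described in this commit message could look roughly like the sketch below. It is not the code from shootout.py: `build_parser`, `dir_size_bytes`, and `cold_index_sketch` are hypothetical names, and running `codegraph init` and `codegraph index` with no further arguments from inside the corpus directory is an assumption about that CLI.

```python
import argparse
import shutil
import subprocess
import time
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Parser exposing the three codegraph flags named in the commit message."""
    p = argparse.ArgumentParser(description="search-shootout sketch")
    p.add_argument("--codegraph-bin", metavar="PATH",
                   default=shutil.which("codegraph"),  # mirrors $(which codegraph)
                   help="path to the codegraph binary")
    p.add_argument("--skip-codegraph", action="store_true",
                   help="skip the codegraph backend entirely")
    p.add_argument("--clean-codegraph", action="store_true",
                   help="wipe .codegraph/ before indexing")
    return p


def dir_size_bytes(root: Path) -> int:
    """Total size of all regular files under root."""
    return sum(f.stat().st_size for f in root.rglob("*") if f.is_file())


def cold_index_sketch(codegraph_bin: str, corpus: Path, clean: bool = False):
    """Time `init` + `index` and report the .codegraph/ size (assumed invocation)."""
    index_dir = corpus / ".codegraph"
    if clean and index_dir.exists():
        shutil.rmtree(index_dir)
    start = time.perf_counter()
    for subcommand in ("init", "index"):
        # Assumption: both subcommands operate on the current directory.
        subprocess.run([codegraph_bin, subcommand], cwd=corpus, check=True)
    elapsed_s = time.perf_counter() - start
    size = dir_size_bytes(index_dir) if index_dir.exists() else 0
    return elapsed_s, size


args = build_parser().parse_args(["--skip-codegraph"])
print(args.skip_codegraph, args.clean_codegraph)
```

Using `time.perf_counter()` keeps the measurement monotonic, and timing both subcommands together matches the description of a single wall-clock figure for the cold build.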

justrach mentioned this pull request May 21, 2026

bench(shootout): add codegraph backend to shootout.py #487

Merged

3 tasks

justrach added a commit that referenced this pull request May 21, 2026

Merge PR #483: v0.2.5815 cross-corpus bench results

0f3574c

justrach mentioned this pull request May 21, 2026

release: v0.2.5816 — read CLI + Tier 5 fix + bench data + ACE spec + shootout codegraph #488

Closed

6 tasks

justrach mentioned this pull request May 21, 2026

release: v0.2.5817 — reader.md auto-prepend + perf + security #490

Merged

7 tasks

justrach merged commit 9358fe8 into main May 21, 2026
1 of 2 checks passed

justrach deleted the feat/bench-codegraph-v0.2.5815 branch May 21, 2026 06:31

justrach merged 1 commit into main from feat/bench-codegraph-v0.2.5815
