bench: v0.2.5815 cross-corpus results — codedb vs codegraph vs lean-ctx#483
Merged
…h vs lean-ctx (2026-05-21)

Per-corpus search-latency runs against the released v0.2.5815 binary (/opt/homebrew/bin/codedb, SHA 51164cf9…e687d25f) on three corpora:
- react (6,620 files), runs 1 and 2 for stability
- regex (285 files)
- flask (127 files)

Backends compared (default tools):
- codedb_search (MCP)
- codegraph_search (codegraph 0.7.10 MCP, `codegraph serve --mcp`)
- lean-ctx grep (lean-ctx 3.6.9 CLI, per-call spawn)
- SQLite FTS5 trigram + unicode61 (inverted-index baselines)

Two outliers from prior RESULTS.md are gone on this binary, and the cold build is faster:
- xyzzy_react_does_not_exist (negative) 113 ms → 0.07 ms (~1,600×)
- flushPassiveEffects (rare camelcase) 167 ms → 0.15 ms (~1,100×)
- cold build (react, 6,620 files) 12.1 s → 1.18 s (~10×)

codedb wins 13/15 react warm queries vs codegraph. codegraph wins on the two highest-frequency stress queries (`function`, `set`), where codedb falls back to a slower path on >5k hits.

Headline numbers and the per-task Sonnet 4.6 agentic eval are now in the v0.2.5815 release notes: https://github.com/justrach/codedb/releases/tag/v0.2.5815

Follow-up: wire codegraph backend into shootout.py multi-session launcher (currently runs only codedb / fts5 / lean-ctx; codegraph results in this commit were collected via a sibling harness).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
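For readers unfamiliar with the inverted-index baselines, the following is a minimal sketch of an SQLite FTS5 baseline with a p50 latency measurement. It is not the harness code from `shootout.py`: the table name `files`, the helper names, and the two-document corpus are all hypothetical, and it assumes the local SQLite build has FTS5 enabled. Only the `unicode61` tokenizer is exercised here; the `trigram` tokenizer can be passed instead on SQLite 3.34 or newer.

```python
import sqlite3
import time


def build_index(docs, tokenizer="unicode61"):
    """Create an in-memory FTS5 table over (path, body) pairs (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE files USING fts5("
        f"path UNINDEXED, body, tokenize='{tokenizer}')"
    )
    conn.executemany("INSERT INTO files(path, body) VALUES (?, ?)", docs)
    conn.commit()
    return conn


def search(conn, term):
    """Return the paths whose body matches the term, sorted by path."""
    # Quote the term so FTS5 treats it as a literal string, not query syntax.
    quoted = '"' + term.replace('"', '""') + '"'
    rows = conn.execute(
        "SELECT path FROM files WHERE files MATCH ? ORDER BY path", (quoted,)
    )
    return [row[0] for row in rows]


def p50_ms(conn, term, repeats=50):
    """Median wall-clock latency of a warm query, in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        search(conn, term)
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return samples[len(samples) // 2]


# Toy corpus standing in for a real checkout (illustrative only).
DOCS = [
    ("a.js", "function flushPassiveEffects() { return 1; }"),
    ("b.js", "const set = new Set(); function noop() {}"),
]

conn = build_index(DOCS)
hits = search(conn, "flushPassiveEffects")           # rare camelcase query
misses = search(conn, "xyzzy_react_does_not_exist")  # negative query
```

The negative query and the rare camelcase query mirror the two outlier cases listed above; a high-frequency term such as `function` matches every document and exercises the opposite end of the hit-count range.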
Benchmark Regression Report
Thresholds: 10.00% and 50,000 ns absolute delta
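The two thresholds can be read as a combined gate. The sketch below assumes a benchmark is flagged only when it exceeds both the 10.00% relative threshold and the 50,000 ns absolute threshold; the report does not say whether the two are combined with AND or OR, so that choice, along with the function and constant names, is an assumption.

```python
REL_THRESHOLD = 0.10        # 10.00% relative slowdown
ABS_THRESHOLD_NS = 50_000   # 50,000 ns absolute slowdown


def is_regression(baseline_ns, current_ns):
    """Flag a slowdown that exceeds BOTH thresholds (assumed AND semantics)."""
    if baseline_ns <= 0:
        return False  # no meaningful baseline to compare against
    delta_ns = current_ns - baseline_ns
    relative = delta_ns / baseline_ns
    return delta_ns > ABS_THRESHOLD_NS and relative > REL_THRESHOLD


slow_and_large = is_regression(1_000_000, 1_200_000)   # +20%, +200,000 ns
noisy_but_tiny = is_regression(100_000, 140_000)       # +40%, +40,000 ns
large_but_flat = is_regression(10_000_000, 10_100_000) # +1%, +100,000 ns
```

Requiring both conditions keeps sub-microsecond benchmarks from tripping on noise while still ignoring small relative drift on long-running ones.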
justrach added a commit that referenced this pull request on May 21, 2026
… security

Bumps semver to 0.2.5817. Bundles the v0.2.5816 perf+security release (PRs #484, #485, #483, #486, #487) with the experiment/reader-md feature that auto-prepends a hash-verified codebase map to codedb_context.

Highlights vs v0.2.5815:

Performance (PR #485, deterministic microbenchmarks):
  Suspense regex p50: 2.82 ms → 0.18 ms (15.6× faster)
  useState regex p99: 16.57 ms → 2.04 ms (8.1× p99 reduction)

CLI surface (PR #484):
  + codedb read <path> [-L FROM-TO] [--compact]
  + path-safety + sensitive-file guards
  + project-root anchoring (uses configured root, not cwd)

codedb_context (NEW in 0.2.5817):
  + auto-prepends .codedb/reader.md when source_hash matches
  + inline ~6 lines of body for ≤3 symbol_definitions
  + new "## Callers" section pre-surfaces execution sites
  + skip-on-short-task gate (≤80 chars) to avoid overhead on narrow lookups

reader.md security (this branch):
  + path-traversal blocked (no absolute / .. in source_files)
  + source_files capped at 20 (DoS guard)
  + loc_actual capped at 240 (body bloat guard)
  + golden blake2b roundtrip test

Eval (Sonnet 4.6, n=3 per task, vs v0.2.5815 main lineage):
  T1 flask median: 5 → 4 (-1)
  T2 regex median: 13 → 7 (-6)
  T3 react median: 13 → 10 (-3)

All 9 runs across the matrix returned correct answers. Branch wins on median, mode, and best-case for every task.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
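The reader.md guards listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the codedb implementation: the function names are hypothetical, over-limit values are rejected here although the real code may clamp them instead, paths are treated as POSIX-style, and the commit does not say which bytes feed the blake2b `source_hash`, so hashing the concatenated file contents is a guess.

```python
import hashlib
from pathlib import PurePosixPath

MAX_SOURCE_FILES = 20   # DoS guard from the commit message
MAX_LOC_ACTUAL = 240    # body bloat guard from the commit message
SHORT_TASK_CHARS = 80   # skip-on-short-task gate


def is_safe_relative(path):
    """Reject empty, absolute, or '..'-containing paths (POSIX-style only)."""
    parts = PurePosixPath(path)
    return bool(path) and not parts.is_absolute() and ".." not in parts.parts


def validate_reader(source_files, loc_actual):
    """Return a list of violations; an empty list means the metadata passes."""
    errors = []
    if len(source_files) > MAX_SOURCE_FILES:
        errors.append("too many source_files")
    if loc_actual > MAX_LOC_ACTUAL:
        errors.append("loc_actual over cap")  # assumption: reject, not clamp
    for path in source_files:
        if not is_safe_relative(path):
            errors.append(f"unsafe path: {path}")
    return errors


def source_hash(contents):
    """blake2b over concatenated file contents (assumed hashing scheme)."""
    digest = hashlib.blake2b()
    for chunk in contents:
        digest.update(chunk)
    return digest.hexdigest()


def should_skip_prepend(task):
    """Short tasks skip the reader.md prepend to avoid overhead."""
    return len(task) <= SHORT_TASK_CHARS


ok = validate_reader(["src/app.py"], 120)
bad = validate_reader(["/etc/passwd", "../secrets"], 300)
```

A roundtrip test in this style would compute `source_hash` once, store it, and assert that recomputing over the same bytes yields the stored value while any changed byte does not.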
shabarkin pushed a commit to shabarkin/codedb that referenced this pull request on May 22, 2026
Wires the codegraph 0.7.10 backend into the single-session + multi-session launcher alongside codedb / fts5_tri / fts5_uni / lean-ctx. Uses `codegraph serve --mcp` as a long-lived stdio child and invokes `codegraph_search` as the default symbol-lookup tool, apples-to-apples with codedb_search.

New CLI flags:
  --codegraph-bin <path>   default: $(which codegraph)
  --skip-codegraph         skip the backend entirely
  --clean-codegraph        wipe matching .codegraph/ before indexing

Cold-index helper `codegraph_cold_index` invokes `codegraph init` then `codegraph index` and measures wall-clock + .codegraph/ on-disk size.

Smoke-tested codegraph-only on flask:
  cold build: 0.57 s, ~3.7 MB
  warm queries: 0.2–2 ms p50 (matches the bench numbers from the v0.2.5815 cross-corpus run committed in PR justrach#483)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
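The flags and the cold-index helper described in this commit might look roughly like the sketch below. The flag names, the `codegraph init` / `codegraph index` subcommands, and the `.codegraph/` directory come from the commit message; everything else is an assumption, including the helper signatures, the idea that both subcommands operate on the current working directory without extra arguments, and the argument parser layout.

```python
import argparse
import shutil
import subprocess
import time
from pathlib import Path


def build_parser():
    """Argument parser carrying the three codegraph flags (layout assumed)."""
    parser = argparse.ArgumentParser(description="search shootout (sketch)")
    parser.add_argument(
        "--codegraph-bin",
        default=shutil.which("codegraph"),  # None when codegraph is not on PATH
        help="path to the codegraph binary",
    )
    parser.add_argument("--skip-codegraph", action="store_true",
                        help="skip the codegraph backend entirely")
    parser.add_argument("--clean-codegraph", action="store_true",
                        help="wipe .codegraph/ before indexing")
    return parser


def dir_size_bytes(root):
    """Total size of all regular files under root."""
    return sum(p.stat().st_size for p in Path(root).rglob("*") if p.is_file())


def timed_run(cmd, cwd):
    """Run a command to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, cwd=cwd, check=True)
    return time.perf_counter() - start


def codegraph_cold_index(codegraph_bin, repo, clean=False):
    """Time `codegraph init` + `codegraph index`; return (seconds, bytes)."""
    index_dir = Path(repo) / ".codegraph"
    if clean and index_dir.exists():
        shutil.rmtree(index_dir)
    # Assumption: both subcommands act on the working directory.
    seconds = timed_run([codegraph_bin, "init"], repo)
    seconds += timed_run([codegraph_bin, "index"], repo)
    return seconds, dir_size_bytes(index_dir)


args = build_parser().parse_args(["--skip-codegraph"])
```

Defaulting `--codegraph-bin` to `shutil.which("codegraph")` mirrors the `$(which codegraph)` default in the commit message and leaves the value as `None` when the binary is missing, which a caller can treat the same as `--skip-codegraph`.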
Summary
Adds 4 raw run reports (+ aggregated log) from the v0.2.5815 search-shootout against three corpora (react, rust-lang/regex, pallets/flask), testing the released binary at `/opt/homebrew/bin/codedb` (SHA `51164cf9…e687d25f`).

Headline:
- `xyzzy_react` negative query: 113 ms → 0.07 ms (~1,600×)
- `flushPassiveEffects` rare-camelcase: 167 ms → 0.15 ms (~1,100×)

The two former slow-path outliers from the prior eval are gone on this binary.
Sonnet 4.6 agentic eval (same React-reconciler task across three backends):
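[Results table not recoverable from the page extraction. Surviving fragments: a `getNextLanes?` query, `grep` + `read`, `search` + `find`, and the `read` subcommand.]

Full breakdown lives in the v0.2.5815 release notes: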
https://github.com/justrach/codedb/releases/tag/v0.2.5815
Files
- `benchmarks/search-shootout/results/2026-05-21/react-run1.md`: primary run, 5 backends
- `benchmarks/search-shootout/results/2026-05-21/react-run2.md`: stability check
- `benchmarks/search-shootout/results/2026-05-21/regex-run1.md`
- `benchmarks/search-shootout/results/2026-05-21/flask-run1.md`
- `benchmarks/search-shootout/results/2026-05-21/run.log`: combined stdout

Follow-up (not in this PR)
The codegraph numbers in these reports were collected via a sibling harness; the codegraph backend isn't wired into `shootout.py`'s multi-session launcher yet. A follow-up PR will add the `CodegraphMCP` class and the `--skip-codegraph` / `--codegraph-bin` args to the main script. The data files in this PR are self-contained.

Test plan
- `codedb --version` + `shasum -a 256`

🤖 Generated with Claude Code