bench: v0.2.5815 cross-corpus results — codedb vs codegraph vs lean-ctx #483

Merged
justrach merged 1 commit into main from feat/bench-codegraph-v0.2.5815
May 21, 2026
Conversation

@justrach

Owner

Summary

Adds 4 raw run reports (+ aggregated log) from the v0.2.5815 search-shootout against three corpora — react, rust-lang/regex, pallets/flask — testing the released binary at /opt/homebrew/bin/codedb (SHA 51164cf9…e687d25f).

Headline:

  • codedb cold build (react, 6,620 files): 12.1 s → 1.18 s (~10× faster than prior RESULTS.md baseline)
  • xyzzy_react negative query: 113 ms → 0.07 ms (~1,600×)
  • flushPassiveEffects rare-camelcase: 167 ms → 0.15 ms (~1,100×)
  • codedb wins 13/15 warm queries vs codegraph on react

The two former slow-path outliers from the prior eval are gone on this binary.
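
The speedup factors in the headline follow directly from the before/after figures. A minimal sketch of the arithmetic, using only the numbers quoted above (the `speedup` helper and the `HEADLINE` dictionary are illustrative names, not part of the benchmark harness):

```python
def speedup(before, after):
    """Return how many times faster `after` is than `before` (same unit)."""
    return before / after


# (before, after) pairs copied from the headline list above.
HEADLINE = {
    "cold build, react (s)": (12.1, 1.18),
    "xyzzy_react negative query (ms)": (113, 0.07),
    "flushPassiveEffects (ms)": (167, 0.15),
}

for name, (before, after) in HEADLINE.items():
    print(f"{name}: {speedup(before, after):,.1f}x")
```

The quoted ~10×, ~1,600× and ~1,100× are these ratios rounded down to a convenient figure.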

Sonnet 4.6 agentic eval (same React-reconciler task across three backends):

| backend | calls | wall (s) | found getNextLanes? |
| --- | --- | --- | --- |
| codegraph (query) | 4 | 29 | |
| lean-ctx (grep+read) | 9 | 51 | |
| codedb (search+find) | 22 | 114 | ✅ (had to reconstruct via 20+ searches because the codedb CLI lacks a read subcommand) |

Full breakdown lives in the v0.2.5815 release notes:
https://github.com/justrach/codedb/releases/tag/v0.2.5815

Files

  • benchmarks/search-shootout/results/2026-05-21/react-run1.md — primary run, 5 backends
  • benchmarks/search-shootout/results/2026-05-21/react-run2.md — stability check
  • benchmarks/search-shootout/results/2026-05-21/regex-run1.md
  • benchmarks/search-shootout/results/2026-05-21/flask-run1.md
  • benchmarks/search-shootout/results/2026-05-21/run.log — combined stdout

Follow-up (not in this PR)

The codegraph numbers in these reports were collected via a sibling harness; the codegraph backend isn't wired into shootout.py's multi-session launcher yet. A follow-up PR will add the CodegraphMCP class and the --skip-codegraph / --codegraph-bin args to the main script. The data files in this PR are self-contained.

Test plan

  • codedb 0.2.5815 binary verified via codedb --version + shasum -a 256
  • All 4 reports regenerated cleanly on a clean machine (snapshot wiped each run)
  • Numbers spot-checked against release-notes claims (cold build, outlier speedups)
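
The binary check in the first item can also be scripted. The sketch below hashes a file with SHA-256 and compares the result against an abbreviated digest such as `51164cf9…e687d25f`, matching only the visible prefix and suffix. The helper names are hypothetical and this is not part of the shootout harness:

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_abbreviated(digest, prefix, suffix):
    """Check a full hex digest against the visible ends of an elided one."""
    return digest.startswith(prefix) and digest.endswith(suffix)
```

For the binary named in this PR, the call would be `matches_abbreviated(sha256_of("/opt/homebrew/bin/codedb"), "51164cf9", "e687d25f")`. A prefix/suffix match is weaker than comparing the full digest, so the full value from the release notes is preferable when available.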

🤖 Generated with Claude Code

…h vs lean-ctx (2026-05-21)

Per-corpus search-latency runs against the released v0.2.5815 binary
(/opt/homebrew/bin/codedb, SHA 51164cf9…e687d25f) on three corpora:

  - react (6,620 files)   — runs 1 and 2 for stability
  - regex (285 files)
  - flask (127 files)

Backends compared (default tools):
  - codedb_search (MCP)
  - codegraph_search (codegraph 0.7.10 MCP, `codegraph serve --mcp`)
  - lean-ctx grep (lean-ctx 3.6.9 CLI, per-call spawn)
  - SQLite FTS5 trigram + unicode61 (inverted-index baselines)
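
For readers unfamiliar with the two FTS5 baselines, here is a rough sketch of how a trigram index and a unicode61 index differ on the same query. It assumes a Python build whose SQLite has FTS5 and the trigram tokenizer (SQLite 3.34 or later). The table layout and helper names are invented for illustration and are not the schema shootout.py uses:

```python
import sqlite3

# Tiny stand-in corpus; real runs index whole repositories.
DOCS = [
    ("a.js", "function flushPassiveEffects() {}"),
    ("b.js", "const set = new Set()"),
]


def build_index(tokenizer, docs):
    """Create an in-memory FTS5 table using the given tokenizer name."""
    con = sqlite3.connect(":memory:")
    con.execute(
        f"CREATE VIRTUAL TABLE idx USING fts5(path, body, tokenize='{tokenizer}')"
    )
    con.executemany("INSERT INTO idx(path, body) VALUES (?, ?)", docs)
    return con


def search(con, term):
    """Return matching paths; the term is quoted so it is treated as a phrase."""
    quoted = '"' + term.replace('"', '""') + '"'
    rows = con.execute(
        "SELECT path FROM idx WHERE idx MATCH ? ORDER BY path", (quoted,)
    )
    return [row[0] for row in rows]
```

With these definitions, `search(build_index("trigram", DOCS), "PassiveEff")` finds `a.js` because trigram indexes support substring matches, while the unicode61 index only matches whole tokens such as `flushPassiveEffects`.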

Two outliers from the prior RESULTS.md are gone on this binary, and the cold build is about 10× faster:

  - xyzzy_react_does_not_exist (negative)   113 ms → 0.07 ms (~1,600×)
  - flushPassiveEffects (rare camelcase)    167 ms → 0.15 ms (~1,100×)
  - cold build (react, 6,620 files)         12.1 s → 1.18 s (~10×)

codedb wins 13/15 react warm queries vs codegraph. codegraph wins on the
two highest-frequency stress queries (`function`, `set`) where codedb
falls back to a slower path on >5k hits.

Headline numbers and the per-task Sonnet 4.6 agentic eval are now in
the v0.2.5815 release notes:
  https://github.com/justrach/codedb/releases/tag/v0.2.5815

Follow-up: wire codegraph backend into shootout.py multi-session
launcher (currently runs only codedb / fts5 / lean-ctx; codegraph
results in this commit were collected via a sibling harness).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions


Benchmark Regression Report

Thresholds: 10.00% and 50,000 ns absolute delta

NOISE means the percentage threshold was exceeded, but the absolute delta was too small to fail CI.

| Tool | Base (ns) | Head (ns) | Delta | Abs Delta (ns) | Status |
| --- | --- | --- | --- | --- | --- |
| codedb_bundle | 569442 | 559162 | -1.81% | -10280 | OK |
| codedb_changes | 58844 | 59967 | +1.91% | +1123 | OK |
| codedb_deps | 11284 | 11544 | +2.30% | +260 | OK |
| codedb_edit | 7108 | 6822 | -4.02% | -286 | OK |
| codedb_find | 67657 | 65549 | -3.12% | -2108 | OK |
| codedb_hot | 111318 | 116402 | +4.57% | +5084 | OK |
| codedb_outline | 340446 | 340942 | +0.15% | +496 | OK |
| codedb_read | 110378 | 107755 | -2.38% | -2623 | OK |
| codedb_search | 160158 | 163086 | +1.83% | +2928 | OK |
| codedb_snapshot | 306256 | 304493 | -0.58% | -1763 | OK |
| codedb_status | 14712 | 18389 | +24.99% | +3677 | NOISE |
| codedb_symbol | 64101 | 64695 | +0.93% | +594 | OK |
| codedb_tree | 85536 | 87783 | +2.63% | +2247 | OK |
| codedb_word | 94670 | 93577 | -1.15% | -1093 | OK |
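
The two-threshold rule described in the report can be sketched as follows. This is a guess at the logic, not the CI script itself. The assumptions: only slowdowns count, both comparisons are strict on the failing side, and the failing status is called `FAIL` (the report above never shows one).

```python
PCT_THRESHOLD = 10.0        # percent, from the report header
ABS_THRESHOLD_NS = 50_000   # nanoseconds, from the report header


def classify(base_ns, head_ns):
    """Classify one benchmark row as OK, NOISE, or FAIL (label assumed)."""
    delta_ns = head_ns - base_ns
    delta_pct = 100.0 * delta_ns / base_ns
    if delta_pct <= PCT_THRESHOLD:
        return "OK"       # within the percentage budget (or faster)
    if delta_ns < ABS_THRESHOLD_NS:
        return "NOISE"    # large percentage, but tiny in absolute terms
    return "FAIL"         # assumed name for a real regression
```

Applied to the `codedb_status` row, `classify(14712, 18389)` yields `NOISE`: the slowdown is +24.99%, but only 3,677 ns in absolute terms.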

@justrach justrach merged commit 9358fe8 into main May 21, 2026
1 of 2 checks passed
justrach added a commit that referenced this pull request May 21, 2026
… security

Bumps semver to 0.2.5817. Bundles the v0.2.5816 perf+security release
(PRs #484, #485, #483, #486, #487) with the experiment/reader-md feature
that auto-prepends a hash-verified codebase map to codedb_context.

Highlights vs v0.2.5815:

  Performance (PR #485, deterministic microbenchmarks):
    Suspense regex p50:    2.82 ms → 0.18 ms  (15.6× faster)
    useState regex p99:   16.57 ms → 2.04 ms  (8.1× p99 reduction)

  CLI surface (PR #484):
    + codedb read <path> [-L FROM-TO] [--compact]
    + path-safety + sensitive-file guards
    + project-root anchoring (uses configured root, not cwd)

  codedb_context (NEW in 0.2.5817):
    + auto-prepends .codedb/reader.md when source_hash matches
    + inline ~6 lines of body for ≤3 symbol_definitions
    + new "## Callers" section pre-surfaces execution sites
    + skip-on-short-task gate (≤80 chars) to avoid overhead on narrow lookups

  reader.md security (this branch):
    + path-traversal blocked (no absolute / .. in source_files)
    + source_files capped at 20 (DoS guard)
    + loc_actual capped at 240 (body bloat guard)
    + golden blake2b roundtrip test
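
As an illustration of the reader.md guards listed above, the sketch below rejects absolute or `..` paths and enforces the two caps. It is written in Python for readability and is not codedb's implementation. Whether "capped" means rejecting or truncating is not stated, so rejection here is an assumption, as are all function and constant names:

```python
from pathlib import PurePosixPath

MAX_SOURCE_FILES = 20   # cap quoted in the commit message
MAX_LOC_ACTUAL = 240    # cap quoted in the commit message


def is_safe_relative(path):
    """True if the path is relative and contains no '..' component."""
    parsed = PurePosixPath(path)
    return not parsed.is_absolute() and ".." not in parsed.parts


def validate_reader_meta(source_files, loc_actual):
    """Raise ValueError if any guard is violated (rejection is an assumption)."""
    if len(source_files) > MAX_SOURCE_FILES:
        raise ValueError("too many source_files")
    unsafe = [p for p in source_files if not is_safe_relative(p)]
    if unsafe:
        raise ValueError(f"unsafe path(s): {unsafe}")
    if loc_actual > MAX_LOC_ACTUAL:
        raise ValueError("loc_actual exceeds cap")
    return True
```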

Eval (Sonnet 4.6, n=3 per task, vs v0.2.5815 main lineage):
  T1 flask median:   5 → 4  (-1)
  T2 regex median:  13 → 7  (-6)
  T3 react median:  13 → 10 (-3)

All 9 runs across the matrix returned correct answers. Branch wins on
median, mode, and best-case for every task.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@justrach justrach deleted the feat/bench-codegraph-v0.2.5815 branch May 21, 2026 06:31
shabarkin pushed a commit to shabarkin/codedb that referenced this pull request May 22, 2026
Wires the codegraph 0.7.10 backend into the single-session + multi-session
launcher alongside codedb / fts5_tri / fts5_uni / lean-ctx. Uses
`codegraph serve --mcp` as a long-lived stdio child and invokes
`codegraph_search` as the default symbol-lookup tool — apples-to-apples
with codedb_search.
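
A long-lived stdio child of this kind can be sketched as below. The class sends newline-delimited JSON-RPC requests to a subprocess and reads one response line per request. It is not the harness's CodegraphMCP class: the class name is hypothetical, the MCP initialize handshake is omitted, and a small Python echo process stands in for `codegraph serve --mcp` so the example runs anywhere.

```python
import json
import subprocess
import sys


class StdioJsonRpcChild:
    """Keep one child process alive and exchange JSON-RPC lines over stdio."""

    def __init__(self, argv):
        self.proc = subprocess.Popen(
            argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
            text=True, bufsize=1,
        )
        self._next_id = 0

    def request(self, method, params):
        """Send one request and block until the child answers with one line."""
        self._next_id += 1
        message = {"jsonrpc": "2.0", "id": self._next_id,
                   "method": method, "params": params}
        self.proc.stdin.write(json.dumps(message) + "\n")
        self.proc.stdin.flush()
        line = self.proc.stdout.readline()
        if not line:
            raise RuntimeError("child closed stdout")
        return json.loads(line)

    def close(self):
        """Close stdin so the child exits, then reap it."""
        self.proc.stdin.close()
        self.proc.wait(timeout=5)
        self.proc.stdout.close()


# Stand-in child: echoes each request's method back as the result.
ECHO_CHILD = (
    "import sys, json\n"
    "for line in sys.stdin:\n"
    "    req = json.loads(line)\n"
    "    reply = {'jsonrpc': '2.0', 'id': req['id'], 'result': req['method']}\n"
    "    print(json.dumps(reply), flush=True)\n"
)
```

Reusing one child across queries keeps process start-up out of the warm-query timings, which is what makes the comparison with codedb_search like-for-like. The per-call spawn used for lean-ctx pays that cost on every query.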

New CLI flags:
  --codegraph-bin <path>   default: $(which codegraph)
  --skip-codegraph         skip the backend entirely
  --clean-codegraph        wipe matching .codegraph/ before indexing
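
One plausible way to declare these three flags with `argparse` is sketched here. It assumes `shutil.which("codegraph")` is an acceptable stand-in for `$(which codegraph)`. The parser description and help strings are illustrative, not copied from shootout.py:

```python
import argparse
import shutil


def build_parser():
    """Build a parser exposing only the three codegraph-related flags."""
    parser = argparse.ArgumentParser(description="search-shootout (sketch)")
    parser.add_argument(
        "--codegraph-bin",
        default=shutil.which("codegraph"),  # None if codegraph is not on PATH
        help="path to the codegraph binary",
    )
    parser.add_argument(
        "--skip-codegraph", action="store_true",
        help="skip the codegraph backend entirely",
    )
    parser.add_argument(
        "--clean-codegraph", action="store_true",
        help="wipe the matching .codegraph/ directory before indexing",
    )
    return parser
```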

Cold-index helper `codegraph_cold_index` invokes `codegraph init` then
`codegraph index` and measures wall-clock + .codegraph/ on-disk size.
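
The measurement itself can be approximated with a generic helper that times a sequence of commands and then sums the size of the resulting index directory. This is a sketch rather than the real `codegraph_cold_index`: the function names and signature are assumptions, and any extra arguments the codegraph CLI needs are not shown.

```python
import os
import subprocess
import time


def dir_size_bytes(root):
    """Sum the sizes of all regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def timed_cold_index(commands, cwd, index_dir):
    """Run each command in cwd; return (wall seconds, index size in bytes)."""
    start = time.perf_counter()
    for argv in commands:
        subprocess.run(argv, cwd=cwd, check=True)
    elapsed = time.perf_counter() - start
    return elapsed, dir_size_bytes(os.path.join(cwd, index_dir))
```

For codegraph the call would look like `timed_cold_index([["codegraph", "init"], ["codegraph", "index"]], repo_path, ".codegraph")`, where `repo_path` is the corpus checkout.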

Smoke-tested codegraph-only on flask:
  cold build: 0.57 s, ~3.7 MB
  warm queries: 0.2–2 ms p50 (matches the bench numbers from the
  v0.2.5815 cross-corpus run committed in PR justrach#483)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>