bench: add MS-MARCO v2 full benchmark and profiling scripts#271

Merged
tjgreen42 merged 2 commits into main from bench/msmarco-v2-benchmark-scripts
Mar 11, 2026

@tjgreen42
Collaborator

Summary

  • Add run_full_benchmark.sh end-to-end orchestrator for reproducible pg_textsearch vs ParadeDB benchmarks on 138M passages
  • Add power test scripts (setup_power_test.sql, power_tapir.sql, power_systemx.sql) for concurrent throughput measurement via pgbench
  • Add profiling tools (profile_build.sh, profile_queries.sql) for perf/flamegraph and per-query latency analysis

Details

run_full_benchmark.sh supports step-based execution:

./run_full_benchmark.sh <step>
  env            - capture machine specs, PG config, extensions
  build-tapir    - build pg_textsearch BM25 index
  build-systemx  - build ParadeDB BM25 index
  query-tapir    - single-client latency benchmarks
  query-systemx  - single-client latency benchmarks
  power-tapir    - concurrent throughput (pgbench)
  power-systemx  - concurrent throughput (pgbench)
  summary        - side-by-side comparison table
  all            - run everything in sequence
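
The step-based interface above could be wired up with a plain `case` dispatcher. The following is a minimal sketch, not the actual contents of run_full_benchmark.sh: the `STEPS` array, `run_step`, and `dispatch` are hypothetical names, and `run_step` only logs where the real script would do the work.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of step-based dispatch; not the real run_full_benchmark.sh.
set -euo pipefail

# Step names copied from the usage text, in the order "all" would run them.
STEPS=(env build-tapir build-systemx query-tapir query-systemx
       power-tapir power-systemx summary)

# Placeholder for the real work of a step; here it only logs the step name.
run_step() {
  echo "running step: $1"
}

# Run a single named step, or every step in sequence for "all".
dispatch() {
  local step="${1:-}"
  local s
  case "$step" in
    all)
      for s in "${STEPS[@]}"; do
        run_step "$s"
      done
      ;;
    env|build-tapir|build-systemx|query-tapir|query-systemx|power-tapir|power-systemx|summary)
      run_step "$step"
      ;;
    *)
      # Unknown or missing step: print usage and signal failure.
      echo "usage: run_full_benchmark.sh <step>" >&2
      return 1
      ;;
  esac
}

# Example invocation of one step.
dispatch summary
```

Keeping each step independently runnable means a long index build does not have to be repeated just to rerun a query or power step.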

Test plan

  • Verify run_full_benchmark.sh env captures system info
  • Run power-tapir step end-to-end on a loaded corpus
  • Run profile_queries.sql and verify latency breakdown output

Add end-to-end benchmark orchestrator (run_full_benchmark.sh) that
runs build, single-client latency, and concurrent throughput (power
test) for both pg_textsearch and ParadeDB on the 138M passage corpus.

Scripts added:
- run_full_benchmark.sh: orchestrator with step-based execution
- setup_power_test.sql: creates pgbench query table with dense IDs
- power_tapir.sql: pgbench script for pg_textsearch throughput
- power_systemx.sql: pgbench script for ParadeDB throughput
- profile_build.sh: perf/flamegraph profiling for index builds
- profile_queries.sql: per-query latency profiling by token bucket
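
Dense IDs presumably matter because pgbench's `random(lo, hi)` draws a uniform integer, so a gap-free range 1..N guarantees every draw matches exactly one row. A minimal sketch of generating such a pgbench script follows; the table `bench_queries`, its columns `id` and `query_text`, and the function `write_pgbench_script` are assumptions for illustration, not names taken from the PR, and the real power scripts would run a BM25 search rather than a plain lookup.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; table, column, and function names are assumptions.
set -euo pipefail

# Write a pgbench custom script that draws one query ID uniformly from 1..n.
# With dense IDs (no gaps), every drawn ID matches exactly one row.
write_pgbench_script() {
  local out="$1" n="$2"
  # In an unquoted here-document, "\\" is written to the file as a single "\".
  cat > "$out" <<EOF
\\set qid random(1, ${n})
SELECT query_text FROM bench_queries WHERE id = :qid;
EOF
}

# Example: generate a script for a pool of 1000 queries and show it.
script="$(mktemp)"
write_pgbench_script "$script" 1000
cat "$script"
```

A file like this could then be passed to pgbench with `-f`, using `-c` for the client count, `-j` for worker threads, and `-T` for the run duration in seconds.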
tjgreen42 merged commit 272c023 into main on Mar 11, 2026
1 check passed
tjgreen42 deleted the bench/msmarco-v2-benchmark-scripts branch on March 11, 2026 at 18:52
tjgreen42 added a commit that referenced this pull request Mar 27, 2026
…DB (#297)

## Summary
- Add weighted query pool setup and pgbench scripts for concurrent
throughput benchmarking with MS-MARCO v1 lexeme distribution
- Refresh comparison.html with latest benchmark numbers (commit f31d1af)
- Cherry-pick MS-MARCO v2 full benchmark and profiling scripts from
release branch (#271)
- Rename all `systemx` references to `paradedb` across benchmarks, CI
workflow, and docs (from release branch)

## Test plan
- [ ] Run MS-MARCO v2 benchmark locally to verify numbers are comparable
to published results
- [ ] Verify pgbench concurrent throughput scripts work end-to-end