bench: add VACUUM step to ParadeDB benchmarks for segment compaction #233

Merged
tjgreen42 merged 1 commit into main from bench/paradedb-vacuum-compaction
Feb 19, 2026
@tjgreen42 tjgreen42 commented Feb 19, 2026

Summary

  • Add VACUUM after CREATE INDEX in ParadeDB (System X) MS MARCO v1 and v2 load scripts to trigger segment compaction
  • Include VACUUM time in the reported index_build_time_ms metric via an INDEX_VACUUM: marker parsed by extract_metrics.sh
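
For reviewers who want a concrete picture of the summing step, here is a minimal sketch of how the two timings could be combined into `index_build_time_ms`. It is not the actual `extract_metrics.sh` code, which is not shown here. The `INDEX_BUILD:` marker, the `<marker> <milliseconds>` line format, and the function name `sum_index_build_time_ms` are all assumptions made for illustration; only the `INDEX_VACUUM:` marker name comes from this PR.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, NOT the real extract_metrics.sh.
# Assumed log format (an assumption, not confirmed by this PR):
#   INDEX_BUILD: <milliseconds>    written after CREATE INDEX finishes
#   INDEX_VACUUM: <milliseconds>   written after VACUUM finishes

sum_index_build_time_ms() {
  # Read a load log on stdin and print CREATE INDEX time plus VACUUM time.
  # Lines that do not start with one of the two markers are ignored.
  awk '
    $1 == "INDEX_BUILD:"  { total += $2 }
    $1 == "INDEX_VACUUM:" { total += $2 }
    END { printf "%d\n", total }
  '
}
```

Under these assumptions, `printf 'INDEX_BUILD: 1200\nINDEX_VACUUM: 300\n' | sum_index_build_time_ms` prints `1500`. A log with no `INDEX_VACUUM:` line still yields the CREATE INDEX time alone, so systems that skip the VACUUM step would report the same metric as before.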

ParadeDB merges index segments during VACUUM, so without this step we may be benchmarking it in a suboptimal state with uncompacted segments.

Testing

  • Run MS MARCO v1 benchmark with ParadeDB and compare index size + query latency before/after this change

ParadeDB merges index segments during VACUUM, so running VACUUM after
CREATE INDEX ensures the index is fully compacted before querying.
Without this, we may be benchmarking ParadeDB in a suboptimal state.

The VACUUM time is included in the reported index_build_time_ms via an
INDEX_VACUUM: marker that extract_metrics.sh sums with CREATE INDEX time.
@tjgreen42 tjgreen42 marked this pull request as ready for review February 19, 2026 17:02
@tjgreen42 tjgreen42 merged commit a35abfe into main Feb 19, 2026
5 checks passed
@tjgreen42 tjgreen42 deleted the bench/paradedb-vacuum-compaction branch February 19, 2026 17:02