bench: add insert benchmarks; fix insert performance regression#242
Merged
bench: add insert benchmarks; fix insert performance regression#242
Conversation
Add INSERT_TIME and CONCURRENT_INSERT_TIME marker extraction to extract_metrics.sh. Add corresponding fields to format_for_action.sh for dashboard publishing.
Add two new jobs and extend system-x-benchmark: - insert-benchmark: pg_textsearch single-txn inserts and concurrent pgbench inserts across all three datasets, with validation and dashboard publishing for 6 metric prefixes - baseline-benchmark: GIN+tsvector using built-in Postgres (no extensions), covering CREATE INDEX on populated table, single-txn insert, and concurrent insert for all datasets with 9 metric prefixes - system-x-benchmark: extended with insert-load and concurrent pgbench insert steps for all datasets, plus format/publish steps for the new insert and concurrent metric variants
Three bugs fixed: - Section headers for concurrent inserts lacked "Benchmark" keyword, making them invisible to the awk section filter in extract_metrics.sh - Single-dataset runs used lowercase dataset slug in section matching (e.g. "cranfield Insert" vs "Cranfield Insert") causing case-sensitive mismatch - Cranfield insert timing captured only one INSERT's time (~15ms) instead of total; now uses clock_timestamp() bookends around all 1400 inserts - Added bm25_spill_index() before index size measurement in insert benchmarks so pg_relation_size() reflects actual data (not just the memtable metapage)
tp_calculate_idf_sum() was called on every tp_insert(), scanning the entire term hash table to recompute a sum that was never read. Profiling showed this consumed 94% of insert time: with 100K+ terms in the memtable, each row insert triggered a full sequential scan of the dshash, making insert performance O(docs * terms). The idf_sum field in TpSharedIndexState and total_terms in TpMemtable were dead code — written but never read by any query or scoring path. Remove the function, both fields, and all call sites (insert path, build finalization, recovery). The on-disk metapage total_terms field is left in place to preserve the page layout for existing indexes.
fbbc46b to
baa7e90
Compare
- Mark total_terms in metapage as "unused, retained for on-disk compat" - Increase INSERT_TIME grep window from -A 5 to -A 10 for robustness
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
complementing the existing CREATE INDEX benchmarks
tp_calculate_idf_sum()function that was called per-insertand scanning the entire term dictionary — caused inserts to be O(docs * terms)
and MS MARCO insert to time out at 6+ hours
load patterns:
Dead code removed
tp_calculate_idf_sum()build.cidf_sumTpSharedIndexStatetotal_termsTpMemtabletotal_termsis retained in the on-disk metapage struct for layout compatibility.New workflow jobs
insert-benchmarkbaseline-benchmarksystem-x-benchmark(extended)Benchmark results (MS MARCO, 8.8M docs)
Testing