perf: use pointer indirection for BMW term state ordering#249

Merged
tjgreen42 merged 2 commits into main from optimize/pointer-indirection-bmw
Mar 3, 2026

Conversation

@tjgreen42
Collaborator

Summary

  • Changes `TpTermState *terms` (contiguous struct array) to `TpTermState **terms` (array of pointers) in the BMW scoring engine
  • `restore_ordering` now moves 8-byte pointers via memmove instead of ~200-byte `TpTermState` structs (~25x reduction in bytes moved)
  • All 13 internal functions updated with mechanical `&terms[i]` → `terms[i]` and `.field` → `->field` changes
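The shape of the change can be sketched as follows. This is an illustrative mock-up, not the actual pg_textsearch definitions: the field names, struct contents, and `BmwState` container are invented stand-ins to show the indirection pattern.

```c
#include <stdlib.h>

/* Illustrative stand-in for the embedded iterator that makes the
 * real TpTermState ~200 bytes (actual fields differ). */
typedef struct { char payload[192]; } TpSegmentPostingIterator;

typedef struct {
    TpSegmentPostingIterator iter;  /* large embedded state */
    float max_score;
} TpTermState;

typedef struct {
    /* Before: TpTermState *terms;   contiguous struct array */
    TpTermState **terms;          /* After: array of pointers */
    int nterms;
} BmwState;

/* Allocate the pointer array plus one backing allocation per term,
 * so reordering later only touches the 8-byte pointers. */
static void bmw_init(BmwState *s, int nterms)
{
    s->nterms = nterms;
    s->terms = malloc(nterms * sizeof(TpTermState *));
    for (int i = 0; i < nterms; i++)
        s->terms[i] = calloc(1, sizeof(TpTermState));
}

/* Call sites then change mechanically:
 *   &state->terms[i]       becomes  state->terms[i]
 *   state->terms[i].field  becomes  state->terms[i]->field
 */
```

The trade-off is one extra pointer dereference on every field access in exchange for cheap reordering; the PR's profiling numbers suggest the reordering cost dominated.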

Motivation

Profiling multi-token queries (5-8 tokens) on 138M MS-MARCO v2 passages showed restore_ordering consuming 21.7% of CPU time. The TpTermState struct is ~200 bytes due to the embedded TpSegmentPostingIterator (which contains TpDictEntry, TpSkipEntry, TpSegmentDirectAccess, etc.). Every time a term advances in the WAND traversal, the sorted order is restored by memmove-ing these large structs.

Test plan

  • All 48 regression tests pass (make installcheck)
  • CI passes (compile, format, sanitizer)
  • Benchmark on MS-MARCO v2 to measure latency improvement on 5-8 token queries

Change the TpTermState array from contiguous structs to an array of
pointers. This makes restore_ordering swap 8-byte pointers via memmove
instead of ~200-byte TpTermState structs, reducing CPU overhead for
the sorted-order maintenance in the WAND traversal hot loop.

Profiling on 138M MS-MARCO v2 passages showed restore_ordering at
21.7% of CPU time for multi-token queries (5-8 tokens). The large
struct size (~200 bytes due to embedded TpSegmentPostingIterator)
made each memmove expensive.
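The hot-loop pattern the commit describes can be sketched like this. It is a minimal mock-up under assumed names (`cur_doc`, the two-field `TpTermState`), not the real `restore_ordering` implementation: after the head term advances, its new slot in the docid order is found and the intervening entries are shifted with a single memmove that now moves pointers rather than structs.

```c
#include <string.h>

typedef struct {
    long cur_doc;     /* illustrative fields, not the real layout */
    float max_score;
} TpTermState;

/* Re-insert terms[0] (the term that just advanced) into an array kept
 * sorted ascending by cur_doc. The memmove shifts
 * pos * sizeof(TpTermState *) bytes, versus pos * sizeof(TpTermState)
 * bytes in the old contiguous-struct layout. */
static void restore_ordering(TpTermState **terms, int nterms)
{
    TpTermState *moved = terms[0];
    int pos = 0;

    /* Find the first slot whose successor is not behind the moved term. */
    while (pos + 1 < nterms && terms[pos + 1]->cur_doc < moved->cur_doc)
        pos++;

    /* Shift the pointers in between down one slot, then drop the
     * moved term's pointer into its new position. */
    memmove(&terms[0], &terms[1], pos * sizeof(TpTermState *));
    terms[pos] = moved;
}
```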
@tjgreen42 tjgreen42 merged commit 01a7044 into main Mar 3, 2026
15 checks passed
@tjgreen42 tjgreen42 deleted the optimize/pointer-indirection-bmw branch March 3, 2026 18:44
tjgreen42 added a commit that referenced this pull request Mar 3, 2026
## Summary
- Update comparison page with results from benchmark run
[22642807624](https://github.com/timescale/pg_textsearch/actions/runs/22642807624)
- Overall throughput improved from 2.8x to 3.2x faster than System X
- Build time gap narrowed from 2.0x to 1.6x (270s → 234s)
- Key improvements since Feb 9: SIMD bitpack decoding (#250),
stack-allocated decode buffers (#253), BMW term state pointer
indirection (#249), arena allocator rewrite (#231), leader-only merge
(#244)

## Testing
- Numbers extracted from benchmark run on commit 1b09cc9
- gh-pages branch also needs updating (will push after merge)