Current Status
pg_textsearch today
- 3.1x faster overall query throughput
- Faster on all query lengths (1-8+ tokens)
- Smaller index (no positions stored)*
- Parallel index build (4 workers)
- Native Postgres integration
ParadeDB v0.21.6
- Faster index build (1.7x)
- Phrase queries supported
- Larger feature set (facets, etc.)
Recent Improvements
- BMW cache optimizations - Cached skip entries and reusable decompression buffers reduce per-block overhead by 20–25% (PR #274)
- SIMD-accelerated decoding - Bitpack decoding with SIMD intrinsics (PR #250)
- Stack-allocated decode buffers - Reduced allocation overhead (PR #253)
- BMW term state optimization - Pointer indirection for ordering (PR #249)
- Arena allocator - Rewritten index build with parallel page pool (PR #231)
- Overall throughput - pg_textsearch now 3.1x faster than ParadeDB on 8.8M dataset (up from 2.8x in February)
Index Size & Build Time
| Metric | pg_textsearch | ParadeDB | Difference |
|---|---|---|---|
| Index Size | 1,215 MB | 1,503 MB | -19% |
| Build Time | 233.5 sec | 140.1 sec | +67% |
| Documents | 8,841,823 | - | - |
*pg_textsearch does not store term positions, so it cannot serve phrase queries such as
"quick brown fox". ParadeDB stores positions by default, which adds significant
overhead but enables phrase search. This accounts for most of the index size
difference: it's a feature tradeoff, not a compression advantage.
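To make the tradeoff concrete, here is a toy sketch (not either extension's on-disk format) of why positions matter: without per-document position lists, an index can answer "documents containing quick AND brown" but cannot check that the terms are adjacent.

```python
def phrase_match(pos_a, pos_b):
    """True if some occurrence of term B immediately follows term A."""
    next_positions = {p + 1 for p in pos_a}
    return any(p in next_positions for p in pos_b)

# Toy positional index: term -> {doc_id: [positions]}.
index = {
    "quick": {1: [0, 7], 2: [4]},
    "brown": {1: [1], 2: [9]},
}

# Both documents contain both terms, but only doc 1 contains the phrase.
assert phrase_match(index["quick"][1], index["brown"][1])      # adjacent
assert not phrase_match(index["quick"][2], index["brown"][2])  # not adjacent
```

Dropping the position lists shrinks every posting to little more than a doc ID, which is where the size advantage comes from.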
Query Latency (p50)
Median latency in milliseconds. Lower is better.
| Query Tokens | pg_textsearch | ParadeDB | Difference |
|---|---|---|---|
| 1 token | 0.70 ms | 18.84 ms | -96% |
| 2 tokens | 1.53 ms | 18.40 ms | -92% |
| 3 tokens | 3.00 ms | 24.74 ms | -88% |
| 4 tokens | 4.52 ms | 26.71 ms | -83% |
| 5 tokens | 7.63 ms | 29.16 ms | -74% |
| 6 tokens | 11.17 ms | 35.57 ms | -69% |
| 7 tokens | 16.74 ms | 36.68 ms | -54% |
| 8+ tokens | 26.32 ms | 44.87 ms | -41% |
Query Latency (p95)
95th percentile latency in milliseconds. Lower is better.
| Query Tokens | pg_textsearch | ParadeDB | Difference |
|---|---|---|---|
| 1 token | 1.56 ms | 27.41 ms | -94% |
| 2 tokens | 4.63 ms | 30.99 ms | -85% |
| 3 tokens | 9.22 ms | 37.48 ms | -75% |
| 4 tokens | 14.85 ms | 36.61 ms | -59% |
| 5 tokens | 22.78 ms | 39.81 ms | -43% |
| 6 tokens | 27.15 ms | 58.66 ms | -54% |
| 7 tokens | 42.51 ms | 67.78 ms | -37% |
| 8+ tokens | 57.00 ms | 70.37 ms | -19% |
Throughput
Total time to execute 800 test queries sequentially.
| Metric | pg_textsearch | ParadeDB | Difference |
|---|---|---|---|
| Total time | 8.03 sec | 25.07 sec | -68% |
| Avg ms/query | 10.04 ms | 31.33 ms | -68% |
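The two rows are consistent by construction: average ms/query is simply total time divided by the 800-query set. A quick check:

```python
QUERIES = 800  # size of the test query set

def avg_ms_per_query(total_seconds: float, queries: int = QUERIES) -> float:
    """Average per-query latency implied by a sequential total runtime."""
    return total_seconds * 1000 / queries

print(f"pg_textsearch: {avg_ms_per_query(8.03):.2f} ms/query")   # ≈ 10.04
print(f"ParadeDB:      {avg_ms_per_query(25.07):.2f} ms/query")
```

Any last-digit disagreement with the table comes from rounding in the reported totals.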
Analysis
Query latency: pg_textsearch faster across all token counts
pg_textsearch is faster on all 8 token buckets at p50, ranging from 27x faster on single-token queries to 1.7x faster on 8+ token queries. SIMD-accelerated bitpack decoding (PR #250) and stack-allocated decode buffers (PR #253) improved segment read performance, while BMW term state pointer indirection (PR #249) reduced overhead in the query scoring path.
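The SIMD and buffer PRs target the postings decode loop. As an illustrative model only (plain Python, not the extension's Rust): a postings block is typically delta-encoded and then bitpacked at a fixed width, and decoding it back is the tight loop that SIMD intrinsics and stack-allocated buffers accelerate.

```python
def pack(values, width):
    """Pack small ints into one integer buffer at a fixed bit width."""
    buf = 0
    for i, v in enumerate(values):
        assert v < (1 << width)
        buf |= v << (i * width)
    return buf

def unpack(buf, width, count):
    """Decode `count` ints of `width` bits each (the hot loop SIMD speeds up)."""
    mask = (1 << width) - 1
    return [(buf >> (i * width)) & mask for i in range(count)]

# Delta-encode a sorted postings block, then bitpack the deltas.
doc_ids = [3, 7, 8, 15, 21]
deltas = [doc_ids[0]] + [b - a for a, b in zip(doc_ids, doc_ids[1:])]
width = max(deltas).bit_length()  # bits needed for the largest delta
packed = pack(deltas, width)

# Decoding reverses both steps: unpack, then prefix-sum back to doc IDs.
decoded, acc = [], 0
for d in unpack(packed, width, len(deltas)):
    acc += d
    decoded.append(acc)
assert decoded == doc_ids
```

A real decoder unpacks many values per instruction; the Python loop only shows the data layout.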
Overall throughput: pg_textsearch 3.1x faster
pg_textsearch completes 800 queries in 8.0s vs 25.1s for ParadeDB, a 3.1x throughput advantage. This is up from 2.8x in the February 9 comparison, driven by segment decoding and scoring path optimizations.
Index build: gap narrowing
ParadeDB builds its index in 140s vs 234s for pg_textsearch (1.7x faster). The arena allocator rewrite (PR #231) and leader-only merge (PR #244) cut build time from 270s to 234s, narrowing the gap from 2.0x to 1.7x.
Methodology
Both extensions benchmarked on identical GitHub Actions runners with the same Postgres configuration. See full methodology for details.
MS-MARCO v2 — 138M Passages
Large-Scale Benchmark
Environment
| Component | Specification |
|---|---|
| CPU | Intel Xeon Platinum 8375C @ 2.90 GHz, 8 cores / 16 threads |
| RAM | 123 GB |
| Storage | NVMe SSD (885 GB) |
| Postgres | 17.7, shared_buffers = 31 GB, data on NVMe |
| Table size | 47 GB (87 GB with TOAST) |
Current Status (138M)
pg_textsearch
- 2.3x faster weighted p50 query latency
- 4.7x higher concurrent throughput (16 clients)
- Faster on all 8 token buckets at p50
- 26% smaller index on disk
- Block-Max WAND with cached skip entries
- SIMD-accelerated bitpack decoding
ParadeDB v0.21.6
- 1.9x faster index build
- Phrase queries supported
- Larger feature set (facets, etc.)
Index Build (138M)
| Metric | pg_textsearch | ParadeDB | Difference |
|---|---|---|---|
| Build time | 17 min 37 s | 8 min 55 s | 1.9x slower |
| Parallel workers | 15 | 14 | - |
| Index size | 17 GB | 23 GB | -26% |
| Documents | 138,364,158 | - | - |
| Unique terms | 17,373,764 | - | - |
Single-Client Query Latency (138M)
Top-10 results (LIMIT 10), BMW optimization enabled.
691 queries sampled across 8 token-count buckets.
Median Latency (p50)
| Query Tokens | pg_textsearch | ParadeDB | Speedup |
|---|---|---|---|
| 1 token | 5.11 ms | 59.83 ms | 11.7x |
| 2 tokens | 9.14 ms | 59.65 ms | 6.5x |
| 3 tokens | 20.04 ms | 77.62 ms | 3.9x |
| 4 tokens | 41.92 ms | 98.89 ms | 2.4x |
| 5 tokens | 67.76 ms | 125.38 ms | 1.9x |
| 6 tokens | 102.82 ms | 148.78 ms | 1.4x |
| 7 tokens | 159.37 ms | 169.65 ms | 1.1x |
| 8+ tokens | 177.95 ms | 190.47 ms | 1.1x |
95th Percentile Latency (p95)
| Query Tokens | pg_textsearch | ParadeDB | Speedup |
|---|---|---|---|
| 1 token | 6.43 ms | 68.34 ms | 10.6x |
| 2 tokens | 32.63 ms | 103.17 ms | 3.2x |
| 3 tokens | 51.51 ms | 114.79 ms | 2.2x |
| 4 tokens | 124.17 ms | 147.32 ms | 1.2x |
| 5 tokens | 167.05 ms | 190.07 ms | 1.1x |
| 6 tokens | 262.07 ms | 201.76 ms | 0.77x |
| 7 tokens | 311.58 ms | 291.09 ms | 0.94x |
| 8+ tokens | 404.95 ms | 310.68 ms | 0.77x |
Weighted-Average Latency
Weighted by observed query-length distribution from 1,010,916 MS-MARCO v1 Bing queries after English stopword removal and stemming (mean 3.7 lexemes, mode 3).
Query length distribution
MS-MARCO Query Lexeme Count Distribution (1,010,916 queries)
Lexemes = distinct stems after English stopword removal
lexemes queries % distribution
─────── ──────── ───── ──────────────────────────────────────────────────
0 11 0.0% ▏
1 35,638 3.5% ▓▓▓▓▓
2 165,033 16.3% ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓
3 304,887 30.2% ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓
4 264,177 26.1% ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓
5 143,765 14.2% ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓
6 59,558 5.9% ▓▓▓▓▓▓▓▓▓
7 22,595 2.2% ▓▓▓
8 8,627 0.9% ▓
9 3,395 0.3% ▏
10 1,555 0.2% ▏
11 721 0.1% ▏
12 402 0.0% ▏
13 235 0.0% ▏
14 123 0.0% ▏
15+ 193 0.0% ▏
Total: 1,010,916 queries
Mean: 3.7 lexemes
Mode: 3 lexemes (30.2%)
72.6% of queries have 2-4 lexemes.
96.2% of queries have 1-6 lexemes.
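The summary lines above can be re-derived from the histogram counts (treating the 15+ row as exactly 15 lexemes, an approximation that barely moves the mean):

```python
# (lexemes, queries) pairs copied from the histogram; 15 stands in for 15+.
hist = [(0, 11), (1, 35_638), (2, 165_033), (3, 304_887), (4, 264_177),
        (5, 143_765), (6, 59_558), (7, 22_595), (8, 8_627), (9, 3_395),
        (10, 1_555), (11, 721), (12, 402), (13, 235), (14, 123), (15, 193)]

total = sum(n for _, n in hist)
mean = sum(k * n for k, n in hist) / total
share_2_4 = sum(n for k, n in hist if 2 <= k <= 4) / total

assert round(mean, 1) == 3.7          # matches the stated mean
assert round(share_2_4 * 100, 1) == 72.6  # matches the stated 2-4 share
```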
Benchmark buckets 1–7 contain 100 queries each; bucket 8+ contains 38 queries covering all lengths ≥8. Weights applied to each bucket match the distribution above.
| Metric | pg_textsearch | ParadeDB | Speedup |
|---|---|---|---|
| Weighted p50 | 40.61 ms | 94.36 ms | 2.3x |
| Weighted avg | 46.69 ms | 101.66 ms | 2.2x |
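As a cross-check, the weighted p50 can be recomputed from the per-bucket medians and the distribution above. The weights here are read off the rounded percentages (with the ≥8 tail lumped into the 8+ bucket, ~1.6%), so expect small rounding error against the reported figures:

```python
# Bucket weights from the lexeme distribution (1..7 tokens, then the 8+ tail).
weights = [0.035, 0.163, 0.302, 0.261, 0.142, 0.059, 0.022, 0.016]

# Per-bucket p50 latencies (ms) from the median-latency table above.
pg_p50 = [5.11, 9.14, 20.04, 41.92, 67.76, 102.82, 159.37, 177.95]
pdb_p50 = [59.83, 59.65, 77.62, 98.89, 125.38, 148.78, 169.65, 190.47]

def weighted(latencies, w=weights):
    """Query-length-distribution-weighted average latency."""
    return sum(l * x for l, x in zip(latencies, w)) / sum(w)

print(round(weighted(pg_p50), 1), round(weighted(pdb_p50), 1))  # ≈ 40.7 94.4
```

The small gap to the reported 40.61 / 94.36 ms comes from rounding the weights to three decimal places; the 2.3x ratio is unchanged.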
Throughput (138M)
Single-Client Sequential
691 queries run 3 times; median iteration reported.
| Metric | pg_textsearch | ParadeDB | Speedup |
|---|---|---|---|
| Avg ms/query | 62.92 ms | 106.53 ms | 1.7x |
| Total (691 queries) | 43.5 s | 73.6 s | 1.7x |
Concurrent (pgbench, 16 clients, 60 s)
| Metric | pg_textsearch | ParadeDB | Ratio |
|---|---|---|---|
| Transactions/sec (TPS) | 91.4 | 19.4 | 4.7x |
| Avg latency | 175 ms | 823 ms | 4.7x |
| Transactions (60 s) | 5,526 | 1,180 | 4.7x |
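The three rows are mutually consistent through Little's Law: in a closed benchmark loop of 16 clients with negligible think time, TPS ≈ clients / average latency, and transactions ≈ TPS × 60 s. A quick check:

```python
CLIENTS = 16

def tps_from_latency(avg_latency_ms: float) -> float:
    """Little's Law for a closed loop: throughput = concurrency / latency."""
    return CLIENTS / (avg_latency_ms / 1000)

print(round(tps_from_latency(175), 1))  # ≈ 91.4 (pg_textsearch)
print(round(tps_from_latency(823), 1))  # ≈ 19.4 (ParadeDB)
```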
Analysis (138M)
Query latency: pg_textsearch faster across all token counts
pg_textsearch is faster on all 8 token buckets at p50, ranging from 11.7x faster on single-token queries to 1.1x on 8+ token queries. Cached skip entries and reusable decompression buffers (PR #274) reduced per-block overhead in the WAND inner loop by 20–25%, closing the gap on high-token queries. The weighted p50 advantage is 2.3x.
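Block-Max WAND in full is involved, but the skip decision that PR #274 makes cheaper can be sketched simply: each postings block carries a precomputed maximum score (the cached skip entry), and a block is decompressed only if that maximum could beat the current threshold. An illustrative model, not the extension's code:

```python
from dataclasses import dataclass

@dataclass
class Block:
    doc_ids: list       # compressed in a real index; plain lists here
    scores: list
    max_score: float    # "block max", the cached skip entry

def top_score_above(blocks, threshold):
    """Scan blocks, decoding only those whose block max can beat `threshold`."""
    best, decoded = threshold, 0
    for b in blocks:
        if b.max_score <= best:
            continue        # skip: no document in this block can win
        decoded += 1        # this stands in for the expensive decompression
        best = max(best, max(b.scores))
    return best, decoded

blocks = [
    Block([1, 2], [0.4, 0.9], 0.9),
    Block([5, 8], [0.2, 0.3], 0.3),   # skipped: block max below threshold
    Block([9, 12], [1.1, 0.7], 1.1),
]
best, decoded = top_score_above(blocks, threshold=0.5)
assert (best, decoded) == (1.1, 2)    # only 2 of 3 blocks decoded
```

The per-block savings compound on long queries, where many postings blocks per term are visited.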
Tail latency: improved but mixed at p95
pg_textsearch has tighter tail latency on 1–5 token queries at p95. On 6–8+ token queries, ParadeDB still has tighter tails. The p95 gap narrowed significantly with the cache optimizations (e.g., 5-token p95 went from 200ms to 167ms, now faster than ParadeDB's 190ms). Further tail latency optimization on long queries remains an active area of work.
Concurrent throughput: pg_textsearch 4.7x higher TPS
Under 16-client concurrent load, pg_textsearch achieves 91.4 TPS vs 19.4 TPS for ParadeDB — a 4.7x advantage. This is significantly wider than the 1.7x single-client gap, indicating that pg_textsearch scales much better under concurrency. pg_textsearch uses native Postgres buffer management and shared memory, avoiding the external process coordination overhead present in ParadeDB's architecture.
Index build: ParadeDB 1.9x faster
ParadeDB builds its index in 8 min 55 s vs 17 min 37 s for pg_textsearch (1.9x faster). pg_textsearch's parallel build uses 15 workers for the scan phase, but the subsequent merge phase is single-threaded and I/O-bound, accounting for the majority of the build time. Despite the slower build, pg_textsearch produces a 26% smaller index (17 GB vs 23 GB).
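The build shape described above (parallel scan producing sorted runs, then a single-threaded merge) is the classic external-sort pattern. A toy model with `heapq.merge`, not the extension's implementation:

```python
import heapq

def build_runs(postings, workers=4):
    """Phase 1 (parallel in the real build): each worker sorts its slice."""
    chunk = (len(postings) + workers - 1) // workers
    return [sorted(postings[i:i + chunk]) for i in range(0, len(postings), chunk)]

def merge_runs(runs):
    """Phase 2 (single-threaded): k-way merge of the sorted runs."""
    return list(heapq.merge(*runs))

postings = [("fox", 9), ("quick", 3), ("fox", 2), ("brown", 7), ("quick", 1)]
merged = merge_runs(build_runs(postings, workers=2))
assert merged == sorted(postings)
```

Because phase 2 runs on one core and is I/O-bound, adding scan workers past a point stops helping, which matches the build-time gap described above.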
Methodology (138M)
Both extensions benchmarked on the same dedicated EC2 instance
(c6i.4xlarge), same Postgres 17.7 installation, same dataset. The
table was loaded once; each extension built its index from scratch with
page cache dropped before each build. Query benchmarks include warmup
passes. The pgbench power test uses -M prepared mode with
random query selection from 691 benchmark queries.