Releases · timescale/pg_textsearch

v1.0.0

@tjgreen42

What's Changed

Add missing upgrade paths: 0.5.1→0.6.0 and 0.6.1→1.0.0-dev by @tjgreen42 in #282
fix: quoted identifiers with uppercase letters cause index lookup failures by @tjgreen42 in #286
fix: address security review findings for 1.0 GA by @tjgreen42 in #270
fix: attnum drift causes index mismatch with inheritance/hypertables (#288) by @tjgreen42 in #289
docs: expand README limitations and fix stale docs by @tjgreen42 in #287
Remove dead Python validation dependencies by @tjgreen42 in #295
fix: crash recovery fails with "Invalid docid page magic" (#291) by @tjgreen42 in #292
bench: add pgbench concurrent throughput and rename SystemX to ParadeDB by @tjgreen42 in #297
Release v1.0.0 by @tjgreen42 in #296

Full Changelog: v0.6.1...v1.0.0

v0.6.1

Bug Fixes

Fix crash on ROLLBACK after error in transaction block (#279) — Any server with pg_textsearch in shared_preload_libraries could crash when running ROLLBACK after an error within a transaction block, even in databases without the extension installed. The planner hook attempted catalog lookups in an aborted transaction state, triggering ResourceOwnerEnlarge failures on release builds and assertion failures on debug builds.
VACUUM correctly removes dead index entries (#267)
Widen TpDictEntry.block_count from uint16 to uint32 (#266) — Prevents overflow for indexes with more than 65,535 blocks per term.
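
To make the overflow concrete, here is a minimal Python sketch of what happens when a count above 65,535 is stored in a 16-bit unsigned field. The helper name `store_block_count` and the value 70,000 are made up for illustration; this is not the extension's C code.

```python
import ctypes

UINT16_MAX = 2**16 - 1  # 65,535, the largest value a uint16 can hold


def store_block_count(n, ctype):
    """Return the value actually held after storing n in a fixed-width C integer.

    ctypes integer types do no overflow checking, so the value wraps
    modulo 2**bits, the same way an unsigned C field would.
    """
    return ctype(n).value


# Hypothetical term whose posting list spans 70,000 blocks.
narrow = store_block_count(70_000, ctypes.c_uint16)  # wraps around
wide = store_block_count(70_000, ctypes.c_uint32)    # stored intact
```

With the 16-bit field the stored count silently wraps to a much smaller number, while the 32-bit field keeps the true value.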

Performance

Cache skip entries and compressed buffer in BMW inner loop (#274) — Reduces repeated allocations during top-k query execution.

Testing

Add transaction abort test suite — 8 scenarios covering ROLLBACK, SAVEPOINT, and error recovery with and without BM25 indexes.
Add physical and logical replication tests (#263)
Add MS-MARCO v2 ground truth validation (#268)

Upgrade

Supported upgrade paths: 0.2.0 → 0.6.1, 0.3.0 → 0.6.1, 0.4.x → 0.6.1, 0.5.x → 0.6.1, 0.6.0 → 0.6.1.

ALTER EXTENSION pg_textsearch UPDATE TO '0.6.1';

A server restart is required after installing the new binary.

Release 0.6.0

@tjgreen42

What's Changed

Numerous performance improvements and stability fixes addressing workloads at scale
- On MS-MARCO v2 (133M documents, 47GB base data), query throughput is now 3.5x higher than the leading Postgres-based BM25 extension. Indexing performance is also greatly improved, with much lower RAM consumption
- See overview at https://timescale.github.io/pg_textsearch/benchmarks/comparison.html
pg_textsearch now requires an entry in shared_preload_libraries
- This addresses a variety of stability issues arising from mismatched library versions during upgrade

Gory Details

bench: add MS MARCO v2 dataset and weighted-average latency metric by @tjgreen42 in #226
feat: widen segment offsets from uint32 to uint64 (V4 format) by @tjgreen42 in #220
chore: improve code coverage toward 90% by @tjgreen42 in #222
fix: use min fieldnorm for BMW skip entries in parallel build by @tjgreen42 in #230
bench: add VACUUM step to ParadeDB benchmarks for segment compaction by @tjgreen42 in #233
feat: version the shared library filename by @tjgreen42 in #232
feat: reduce build chatter for partitioned tables by @tjgreen42 in #214
fix: resolve test failures on PG17 and /tmp environments by @tjgreen42 in #234
Revert versioned shared library filename by @tjgreen42 in #238
fix: TOCTOU race in parallel build loses documents by @tjgreen42 in #240
fix: run BM25 validation in benchmark CI workflow by @tjgreen42 in #239
feat: require shared_preload_libraries for pg_textsearch by @tjgreen42 in #235
feat: detect stale binary after upgrade via library version check by @tjgreen42 in #241
feat: rewrite index build with arena allocator and parallel page pool by @tjgreen42 in #231
fix: add coverage gate to block PRs on coverage reduction by @tjgreen42 in #245
feat: leader-only merge for parallel index build by @tjgreen42 in #244
bench: add insert benchmarks; fix insert performance regression by @tjgreen42 in #242
fix: resolve all compiler warnings in extension source by @tjgreen42 in #246
fix: crash when creating BM25 index on temp table by @tjgreen42 in #248
perf: use pointer indirection for BMW term state ordering by @tjgreen42 in #249
perf: SIMD-accelerated bitpack decoding by @tjgreen42 in #250
fix: security hardening for user-input-facing code paths by @tjgreen42 in #251
perf: skip empty memtable during query scoring by @tjgreen42 in #252
perf: stack-allocate decode buffers in tp_decompress_block by @tjgreen42 in #253
Release v0.6.0 by @tjgreen42 in #254

Full Changelog: v0.5.1...v0.6.0

v0.5.1

Highlights

CREATE INDEX CONCURRENTLY: BM25 indexes now support concurrent index builds, allowing index creation without blocking writes.
WAND pivot selection: Multi-term queries use improved pivot selection for faster top-k retrieval, complementing the existing Block-Max WAND optimization.
Iterative index scans: Queries without an explicit LIMIT now use exponential backoff to avoid scanning the entire index upfront.
Progress reporting: CREATE INDEX now reports progress via pg_stat_progress_create_index, visible in standard Postgres monitoring tools.
Index scan statistics: BM25 index scans now report tuple and page counts via pg_stat counters (visible in pg_stat_user_indexes and EXPLAIN (ANALYZE, BUFFERS)).
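
The iterative scan idea above can be sketched as a generator that asks for a small ranked batch first and grows the batch size geometrically only if the consumer keeps reading. This is a rough illustration, not the extension's implementation: `fetch_top_k`, the initial limit of 64, and the growth factor of 2 are all assumptions, and the sketch re-fetches from the top on each round for simplicity.

```python
def iterative_scan(fetch_top_k, initial_limit=64, growth=2):
    """Yield ranked results, fetching in exponentially growing batches.

    fetch_top_k(limit) is a hypothetical callback that returns the best
    `limit` results in rank order (fewer if the index has fewer matches).
    """
    emitted = 0
    limit = initial_limit
    while True:
        batch = fetch_top_k(limit)
        # Only hand out rows the consumer has not seen yet.
        for row in batch[emitted:]:
            yield row
        emitted = len(batch)
        if len(batch) < limit:
            return  # fewer rows than requested: nothing left to scan
        limit *= growth


# Toy "index": 1000 results already in rank order.
ranked = list(range(1000))
requested_limits = []


def fetch(limit):
    requested_limits.append(limit)
    return ranked[:limit]
```

A consumer that stops after 100 rows triggers only two fetches (limits 64 and 128) instead of a full scan of all 1000 rows.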

Bug Fixes

Fixed BM25 scoring accuracy: block-max upper bounds now use minimum fieldnorm instead of average, preventing valid matches from being incorrectly skipped by BMW optimization.
Fixed implicit text <@> text operator resolution in DML subqueries (INSERT...SELECT, UPDATE...FROM, etc.).
Fixed to_bm25query() to reject index names that don't refer to a BM25 index.
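
The scoring fix can be illustrated with the standard BM25 term formula. The sketch below treats the fieldnorm as a plain document length, which is a simplifying assumption, and the block contents, `k1`, `b`, and average length are invented for the example. It shows why a bound computed from the average length in a block can fall below a real score, while a bound from the minimum length cannot.

```python
def bm25_term_score(tf, doc_len, avg_doc_len, idf=1.0, k1=1.2, b=0.75):
    """Standard BM25 contribution of one term to one document's score."""
    norm = 1.0 - b + b * doc_len / avg_doc_len
    return idf * tf * (k1 + 1.0) / (tf + k1 * norm)


# (term frequency, document length) for the postings in one hypothetical block.
block = [(3, 20), (3, 400)]
AVG_DOC_LEN = 100.0  # corpus-wide average length, made up for the example

max_tf = max(tf for tf, _ in block)
min_len = min(dl for _, dl in block)
mean_len = sum(dl for _, dl in block) / len(block)

# Highest score any document in the block really achieves.
best_actual = max(bm25_term_score(tf, dl, AVG_DOC_LEN) for tf, dl in block)

# Candidate upper bounds for the block.
bound_from_mean = bm25_term_score(max_tf, mean_len, AVG_DOC_LEN)  # too low
bound_from_min = bm25_term_score(max_tf, min_len, AVG_DOC_LEN)    # safe
```

Because the BM25 score rises as a document gets shorter, only the shortest length in the block gives a bound that no document can exceed. With the mean-based bound, the short document here would be skipped even though it scores higher than the bound.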

Full Changelog: v0.5.0...v0.5.1

v0.5.0

Highlights

Parallel index builds: CREATE INDEX now uses multiple workers for faster indexing of large tables. Postgres automatically allocates workers based on table size and the max_parallel_maintenance_workers setting.
Improvements for BM25 indexes on hypertables
Stability fixes

Notes

This is expected to be the last pre-release before GA (v1.0.0).

Full Changelog: v0.4.0...v0.5.0

v0.4.2

@tjgreen42

What's Changed

Backport #168: Fix zero scores when querying hypertables by @tjgreen42 in #175

v0.4.1

@tjgreen42

Fix 'too many LWLocks taken' error when scanning hypertables by @tjgreen42 in #167

v0.4.0 Release Notes

Highlights

Posting List Compression — Indexes are now 41% smaller thanks to delta encoding and bitpacking. Compression is enabled by default (pg_textsearch.compress_segments = on) and also improves query performance by 10-20% on short queries due to better cache efficiency.
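
As a rough picture of how delta encoding and bitpacking shrink a posting list, the Python sketch below stores the first document ID as a base, turns the rest into gaps, and packs each gap into the minimum number of bits. The layout and function names are assumptions for illustration only and do not describe the extension's on-disk segment format.

```python
def delta_encode(doc_ids):
    """Split a sorted, non-empty ID list into (first_id, gaps between neighbours)."""
    gaps = [b - a for a, b in zip(doc_ids, doc_ids[1:])]
    return doc_ids[0], gaps


def bitpack(values):
    """Pack non-negative ints using the smallest bit width that fits them all."""
    width = max(1, max(values, default=0).bit_length())
    packed = 0
    for i, v in enumerate(values):
        packed |= v << (i * width)
    nbytes = (len(values) * width + 7) // 8
    return width, packed.to_bytes(nbytes, "little")


def bitunpack(width, data, count):
    """Inverse of bitpack: read `count` values of `width` bits each."""
    packed = int.from_bytes(data, "little")
    mask = (1 << width) - 1
    return [(packed >> (i * width)) & mask for i in range(count)]


def delta_decode(first, gaps):
    """Rebuild the original sorted ID list from its base and gaps."""
    out = [first]
    for g in gaps:
        out.append(out[-1] + g)
    return out


doc_ids = [1000, 1003, 1004, 1010, 1025]
first, gaps = delta_encode(doc_ids)   # gaps are small: [3, 1, 6, 15]
width, packed = bitpack(gaps)         # 4 bits per gap
restored = delta_decode(first, bitunpack(width, packed, len(gaps)))
```

In this toy case the four gaps fit in 2 bytes instead of the 16 bytes that four 32-bit integers would take, and decoding restores the original list exactly.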

Other Changes

Fix for partitioned tables: Resolved "too many LWLocks taken" error when querying partitioned tables with many partitions (#135)
Better planner integration: Implement AMPROP_DISTANCE_ORDERABLE so the planner knows BM25 indexes support ordered scans (#133)
Static analysis: Added Coverity scanning to CI and fixed all reported issues (#128, #130)

Compatibility

⚠️ Breaking change: Segment format v3 (compression) is incompatible with v0.3.0 and earlier. Indexes created with older versions must be dropped and recreated after upgrading:

DROP INDEX my_bm25_idx;
CREATE INDEX my_bm25_idx ON my_table USING bm25(content) WITH (text_config='english');

v0.3.0

@TheAifam5

What's New

Query Performance

Block-Max WAND optimization delivers up to 4x faster top-k queries by skipping blocks that can't contribute to results. This is the same algorithm used by Lucene and other production search engines.

Single-term and multi-term queries benefit from block-level score upper bounds
New GUC pg_textsearch.enable_bmw (default: on) controls the optimization
New GUC pg_textsearch.log_bmw_stats for debugging block skip statistics
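
The block-skipping idea can be sketched for the single-term case: each block carries a precomputed score upper bound, and a block is skipped once that bound cannot beat the current k-th best score. The data layout and function name below are hypothetical, and real Block-Max WAND also handles multi-term pivoting, which this sketch leaves out.

```python
import heapq


def top_k_with_block_skipping(blocks, k):
    """Return (top-k results, number of blocks skipped).

    blocks: list of (block_max_score, postings), where postings is a list
    of (doc_id, score) and block_max_score is an upper bound on any score
    in that block (precomputed at index time in a real system).
    """
    heap = []  # min-heap of (score, doc_id); heap[0] is the current k-th best
    skipped = 0
    for block_max, postings in blocks:
        if len(heap) == k and block_max <= heap[0][0]:
            skipped += 1  # no document here can enter the top k
            continue
        for doc_id, score in postings:
            if len(heap) < k:
                heapq.heappush(heap, (score, doc_id))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, doc_id))
    return sorted(heap, reverse=True), skipped


# Three toy blocks with made-up scores.
blocks = [
    (5.0, [(1, 5.0), (2, 4.0)]),
    (1.0, [(3, 1.0), (4, 0.5)]),
    (6.0, [(5, 6.0), (6, 0.2)]),
]
results, skipped = top_k_with_block_skipping(blocks, k=2)
```

Here the second block is never decoded because its upper bound of 1.0 cannot displace either of the two results already held.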

Scalability

Fixed memory leaks during index builds by using private DSA allocations
Reduced default memtable spill threshold for more predictable memory usage

Bug Fixes

Fixed bm25query binary I/O and scoring with detoasted varlena values
Fixed missing math.h include that caused build failures on some platforms
Fixed compilation warnings

Under the Hood

Added competitive benchmark suite comparing against the leading Postgres-based text search extension
Added nightly stress tests with memory leak detection
Added segment integrity tests
Added code coverage reporting via Codecov
Code cleanup: removed dead code, consolidated duplicates

New Contributors

@TheAifam5 made their first contribution

Full Changelog: v0.2.0...v0.3.0

v0.2.0

@tjgreen42

Highlights

Automated benchmark infrastructure
Lays groundwork for storage and query optimizations in upcoming releases
Numerous bug fixes

Gory Details

Add public benchmark suite with MS MARCO and Wikipedia by @tjgreen42 in #66
Fix excessive memory allocation in document scoring by @tjgreen42 in #68
Run benchmarks on-demand and weekly (not on every PR) by @tjgreen42 in #69
Pin Python to 3.10 for wikiextractor compatibility by @tjgreen42 in #70
Add path filters to CI workflows by @tjgreen42 in #72
Fix benchmark dataset labeling by @tjgreen42 in #71
Fix JSON generation in extract_metrics.sh by @tjgreen42 in #73
Run benchmark queries repeatedly for stable measurements by @tjgreen42 in #74
Extract and publish metrics per-dataset when running all benchmarks by @tjgreen42 in #75
Improve benchmark configuration and add index size tracking by @tjgreen42 in #76
Improve benchmark dashboard: dataset sizes and compact layout by @tjgreen42 in #77
Reclaim pages after segment compaction by @tjgreen42 in #78
Add storage and query optimization roadmap by @tjgreen42 in #79
Implement V2 segment format with block storage for BMW optimization by @tjgreen42 in #81
Style benchmark graph points by branch type by @tjgreen42 in #84
Fix V2 segment query performance regression by @tjgreen42 in #87
Replace fixed-size registry with dshash for unlimited indexes by @tjgreen42 in #85
fix: buildempty() should write init fork by @SteveLauC in #89
Update v1.0.0 target date to Feb 2026 by @tjgreen42 in #97
chore: fix make test and override pgxs installcheck by @SteveLauC in #91
Refactor codebase to better reflect architectural structure by @tjgreen42 in #99
Release v0.2.0 by @tjgreen42 in #100

New Contributors

@SteveLauC made their first contribution in #89

Full Changelog: v0.1.0...v0.2.0

Releases listed above, newest first: v1.0.0, v0.6.1, 0.6.0, v0.5.1, v0.5.0, v0.4.2, v0.4.1, v0.4.0, v0.3.0, v0.2.0