Releases: timescale/pg_textsearch
v1.0.0
What's Changed
- Add missing upgrade paths: 0.5.1→0.6.0 and 0.6.1→1.0.0-dev by @tjgreen42 in #282
- fix: quoted identifiers with uppercase letters cause index lookup failures by @tjgreen42 in #286
- fix: address security review findings for 1.0 GA by @tjgreen42 in #270
- fix: attnum drift causes index mismatch with inheritance/hypertables (#288) by @tjgreen42 in #289
- docs: expand README limitations and fix stale docs by @tjgreen42 in #287
- Remove dead Python validation dependencies by @tjgreen42 in #295
- fix: crash recovery fails with "Invalid docid page magic" (#291) by @tjgreen42 in #292
- bench: add pgbench concurrent throughput and rename SystemX to ParadeDB by @tjgreen42 in #297
- Release v1.0.0 by @tjgreen42 in #296
Full Changelog: v0.6.1...v1.0.0
v0.6.1
Bug Fixes
- Fix crash on ROLLBACK after error in transaction block (#279): Any server with pg_textsearch in shared_preload_libraries could crash when running ROLLBACK after an error within a transaction block, even in databases without the extension installed. The planner hook attempted catalog lookups in an aborted transaction state, triggering ResourceOwnerEnlarge failures on release builds and assertion failures on debug builds.
- VACUUM correctly removes dead index entries (#267)
- Widen TpDictEntry.block_count from uint16 to uint32 (#266): Prevents overflow for indexes with more than 65,535 blocks per term.
Performance
- Cache skip entries and compressed buffer in BMW inner loop (#274) — Reduces repeated allocations during top-k query execution.
Testing
- Add transaction abort test suite — 8 scenarios covering ROLLBACK, SAVEPOINT, and error recovery with and without BM25 indexes.
- Add physical and logical replication tests (#263)
- Add MS-MARCO v2 ground truth validation (#268)
Upgrade
Supported upgrade paths: 0.2.0 → 0.6.1, 0.3.0 → 0.6.1, 0.4.x → 0.6.1, 0.5.x → 0.6.1, 0.6.0 → 0.6.1.
ALTER EXTENSION pg_textsearch UPDATE TO '0.6.1';
A server restart is required after installing the new binary.
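To check that an upgrade path exists and that the update took effect, the standard Postgres extension catalogs can be queried. This is a minimal sketch that uses only built-in catalog views and functions; nothing in it is specific to pg_textsearch beyond the extension name.

```sql
-- Version currently installed in this database.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'pg_textsearch';

-- Versions that the installed files make available on this server.
SELECT name, version, installed
FROM pg_available_extension_versions
WHERE name = 'pg_textsearch';

-- Upgrade paths ending at 0.6.1; a NULL path means no path exists.
SELECT source, target, path
FROM pg_extension_update_paths('pg_textsearch')
WHERE target = '0.6.1' AND path IS NOT NULL;
```

Running the first query before and after ALTER EXTENSION shows whether the catalog version changed.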
Release 0.6.0
What's Changed
- Numerous performance improvements and stability fixes addressing workloads at scale
- On MS-MARCO v2 (133M documents, 47GB base data), query throughput is now 3.5X higher than the leading Postgres-based BM25 extension, while indexing performance is also greatly improved, with much smaller RAM consumption
- See overview at https://timescale.github.io/pg_textsearch/benchmarks/comparison.html
- pg_textsearch now requires an entry in shared_preload_libraries
- This addresses a variety of stability issues arising from mismatched library versions during upgrade
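One way to add the required entry is sketched below. It assumes no other libraries are currently preloaded; if some are (for example timescaledb), they must be kept in the comma-separated list.

```sql
-- Assumption: the list is otherwise empty. ALTER SYSTEM replaces the whole
-- value, so include any existing entries alongside pg_textsearch.
ALTER SYSTEM SET shared_preload_libraries = 'pg_textsearch';

-- shared_preload_libraries is only read at server start, so restart
-- Postgres, then confirm the value:
SHOW shared_preload_libraries;
```

The same setting can instead be edited directly in postgresql.conf. Either way, a reload is not enough and a full restart is needed.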
Gory Details
- bench: add MS MARCO v2 dataset and weighted-average latency metric by @tjgreen42 in #226
- feat: widen segment offsets from uint32 to uint64 (V4 format) by @tjgreen42 in #220
- chore: improve code coverage toward 90% by @tjgreen42 in #222
- fix: use min fieldnorm for BMW skip entries in parallel build by @tjgreen42 in #230
- bench: add VACUUM step to ParadeDB benchmarks for segment compaction by @tjgreen42 in #233
- feat: version the shared library filename by @tjgreen42 in #232
- feat: reduce build chatter for partitioned tables by @tjgreen42 in #214
- fix: resolve test failures on PG17 and /tmp environments by @tjgreen42 in #234
- Revert versioned shared library filename by @tjgreen42 in #238
- fix: TOCTOU race in parallel build loses documents by @tjgreen42 in #240
- fix: run BM25 validation in benchmark CI workflow by @tjgreen42 in #239
- feat: require shared_preload_libraries for pg_textsearch by @tjgreen42 in #235
- feat: detect stale binary after upgrade via library version check by @tjgreen42 in #241
- feat: rewrite index build with arena allocator and parallel page pool by @tjgreen42 in #231
- fix: add coverage gate to block PRs on coverage reduction by @tjgreen42 in #245
- feat: leader-only merge for parallel index build by @tjgreen42 in #244
- bench: add insert benchmarks; fix insert performance regression by @tjgreen42 in #242
- fix: resolve all compiler warnings in extension source by @tjgreen42 in #246
- fix: crash when creating BM25 index on temp table by @tjgreen42 in #248
- perf: use pointer indirection for BMW term state ordering by @tjgreen42 in #249
- perf: SIMD-accelerated bitpack decoding by @tjgreen42 in #250
- fix: security hardening for user-input-facing code paths by @tjgreen42 in #251
- perf: skip empty memtable during query scoring by @tjgreen42 in #252
- perf: stack-allocate decode buffers in tp_decompress_block by @tjgreen42 in #253
- Release v0.6.0 by @tjgreen42 in #254
Full Changelog: v0.5.1...v0.6.0
v0.5.1
Highlights
- CREATE INDEX CONCURRENTLY: BM25 indexes now support concurrent index builds, allowing index creation without blocking writes.
- WAND pivot selection: Multi-term queries use improved pivot selection for faster top-k retrieval, complementing the existing Block-Max WAND optimization.
- Iterative index scans: Queries without an explicit LIMIT now use exponential backoff to avoid scanning the entire index upfront.
- Progress reporting: CREATE INDEX now reports progress via pg_stat_progress_create_index, visible in standard Postgres monitoring tools.
- Index scan statistics: BM25 index scans now report tuple and page counts via pg_stat counters (visible in pg_stat_user_indexes and EXPLAIN (ANALYZE, BUFFERS)).
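The concurrent build, progress view, and scan counters can be exercised roughly as follows. This is a sketch: the table, column, and index names are placeholders, and the index options mirror the CREATE INDEX example in the v0.4.0 notes.

```sql
-- Placeholder names. CREATE INDEX CONCURRENTLY cannot run inside a
-- transaction block.
CREATE INDEX CONCURRENTLY my_bm25_idx
    ON my_table USING bm25 (content) WITH (text_config = 'english');

-- From a second session, while the build is running:
SELECT pid, phase, blocks_done, blocks_total, tuples_done, tuples_total
FROM pg_stat_progress_create_index;

-- After some queries have used the index:
SELECT indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
WHERE indexrelname = 'my_bm25_idx';
```

Both views are standard Postgres. Which phases and counters a BM25 build actually populates is not stated in these notes.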
Bug Fixes
- Fixed BM25 scoring accuracy: block-max upper bounds now use minimum fieldnorm instead of average, preventing valid matches from being incorrectly skipped by BMW optimization.
- Fixed implicit text <@> text operator resolution in DML subqueries (INSERT...SELECT, UPDATE...FROM, etc.).
- Fixed to_bm25query() to reject index names that don't refer to a BM25 index.
Full Changelog: v0.5.0...v0.5.1
v0.5.0
Highlights
- Parallel index builds: CREATE INDEX now uses multiple workers for faster indexing of large tables. Postgres automatically allocates workers based on table size and the max_parallel_maintenance_workers setting.
- Improvements for bm25 indexes on hypertables
- Stability fixes
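As a rough illustration of the parallel build setting mentioned above, the worker cap can be raised for a single session before building. The value 4 and the object names are placeholders, not recommendations.

```sql
-- Upper bound on parallel workers for maintenance commands in this session.
-- Postgres may still pick fewer workers based on table size, and the total
-- is also limited by max_parallel_workers and max_worker_processes.
SET max_parallel_maintenance_workers = 4;

-- Placeholder table, column, and index names.
CREATE INDEX my_bm25_idx
    ON my_table USING bm25 (content) WITH (text_config = 'english');

-- Return to the server default afterwards.
RESET max_parallel_maintenance_workers;
```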
Notes
This is expected to be the last pre-release before GA (v1.0.0).
Full Changelog: v0.4.0...v0.5.0
v0.4.2
v0.4.1
v0.4.0
v0.4.0 Release Notes
Highlights
Posting List Compression — Indexes are now 41% smaller thanks to delta encoding and bitpacking. Compression is enabled by default (pg_textsearch.compress_segments = on) and also improves query performance by 10-20% on short queries due to better cache efficiency.
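To see the current compression setting and measure its effect on a particular index, something like the following may help. The index name is a placeholder; pg_relation_size and pg_size_pretty are standard Postgres functions.

```sql
-- Current value of the compression setting named above.
SHOW pg_textsearch.compress_segments;

-- On-disk size of one BM25 index (placeholder name). Comparing this before
-- and after a rebuild gives the size change for your own data.
SELECT pg_size_pretty(pg_relation_size('my_bm25_idx')) AS index_size;
```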
Other Changes
- Fix for partitioned tables: Resolved "too many LWLocks taken" error when querying partitioned tables with many partitions (#135)
- Better planner integration: Implement AMPROP_DISTANCE_ORDERABLE so the planner knows BM25 indexes support ordered scans (#133)
- Static analysis: Added Coverity scanning to CI and fixed all reported issues (#128, #130)
Compatibility
DROP INDEX my_bm25_idx;
CREATE INDEX my_bm25_idx ON my_table USING bm25(content) WITH (text_config='english');
v0.3.0
What's New
Query Performance
Block-Max WAND optimization delivers up to 4x faster top-k queries by skipping blocks that can't contribute to results. This is the same algorithm used by Lucene and other production search engines.
- Single-term and multi-term queries benefit from block-level score upper bounds
- New GUC pg_textsearch.enable_bmw (default: on) controls the optimization
- New GUC pg_textsearch.log_bmw_stats for debugging block skip statistics
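A hedged sketch of comparing a top-k query with and without the optimization follows. It assumes both GUCs are booleans that can be changed per session with SET, and it uses the text <@> text operator form mentioned in the v0.5.1 notes. The table, columns, and query text are placeholders, so check the README for the exact query syntax.

```sql
-- Assumption: log_bmw_stats is a boolean settable per session.
SET pg_textsearch.log_bmw_stats = on;

-- Top-k query with BMW enabled (the default). Placeholder names and text.
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, content
FROM my_table
ORDER BY content <@> 'full text search'
LIMIT 10;

-- Same query with BMW disabled, for comparison.
SET pg_textsearch.enable_bmw = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, content
FROM my_table
ORDER BY content <@> 'full text search'
LIMIT 10;

-- Restore the defaults.
RESET pg_textsearch.enable_bmw;
RESET pg_textsearch.log_bmw_stats;
```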
Scalability
- Fixed memory leaks during index builds by using private DSA allocations
- Reduced default memtable spill threshold for more predictable memory usage
Bug Fixes
- Fixed bm25query binary I/O and scoring with detoasted varlena values
- Fixed missing math.h include that caused build failures on some platforms
- Fixed compilation warnings
Under the Hood
- Added competitive benchmark suite comparing against leading Postgres-based text search extension
- Added nightly stress tests with memory leak detection
- Added segment integrity tests
- Added code coverage reporting via Codecov
- Code cleanup: removed dead code, consolidated duplicates
New Contributors
- @TheAifam5 made their first contribution
Full Changelog: v0.2.0...v0.3.0
v0.2.0
Highlights
- Automated benchmark infrastructure
- Lays groundwork for storage and query optimizations in upcoming releases
- Numerous bugfixes
Gory Details
- Add public benchmark suite with MS MARCO and Wikipedia by @tjgreen42 in #66
- Fix excessive memory allocation in document scoring by @tjgreen42 in #68
- Run benchmarks on-demand and weekly (not on every PR) by @tjgreen42 in #69
- Pin Python to 3.10 for wikiextractor compatibility by @tjgreen42 in #70
- Add path filters to CI workflows by @tjgreen42 in #72
- Fix benchmark dataset labeling by @tjgreen42 in #71
- Fix JSON generation in extract_metrics.sh by @tjgreen42 in #73
- Run benchmark queries repeatedly for stable measurements by @tjgreen42 in #74
- Extract and publish metrics per-dataset when running all benchmarks by @tjgreen42 in #75
- Improve benchmark configuration and add index size tracking by @tjgreen42 in #76
- Improve benchmark dashboard: dataset sizes and compact layout by @tjgreen42 in #77
- Reclaim pages after segment compaction by @tjgreen42 in #78
- Add storage and query optimization roadmap by @tjgreen42 in #79
- Implement V2 segment format with block storage for BMW optimization by @tjgreen42 in #81
- Style benchmark graph points by branch type by @tjgreen42 in #84
- Fix V2 segment query performance regression by @tjgreen42 in #87
- Replace fixed-size registry with dshash for unlimited indexes by @tjgreen42 in #85
- fix: buildempty() should write init fork by @SteveLauC in #89
- Update v1.0.0 target date to Feb 2026 by @tjgreen42 in #97
- chore: fix make test and override pgxs installcheck by @SteveLauC in #91
- Refactor codebase to better reflect architectural structure by @tjgreen42 in #99
- Release v0.2.0 by @tjgreen42 in #100
New Contributors
- @SteveLauC made their first contribution in #89
Full Changelog: v0.1.0...v0.2.0