Fix zero scores when querying hypertables with BM25 index #168
Merged
The planner hook's plan_has_bm25_indexscan() and replace_scores_in_plan() functions didn't handle CustomScan nodes (like TimescaleDB's ConstraintAwareAppend). This caused the hook to miss BM25 index scans nested inside custom scans, so score expressions weren't replaced with stub functions that retrieve cached scores from the index scan. As a result, standalone scoring was used instead, which looked up the parent hypertable index (which has total_docs = 0), producing zero scores for all results. The fix adds T_CustomScan cases that recurse into cscan->custom_plans to properly detect and process BM25 index scans.
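The detection problem described above can be illustrated with a toy model. This is a minimal sketch in Python, not the extension's C code: `PlanNode`, its fields, the `"bm25"` access-method label, and the `recurse_custom` switch are all hypothetical stand-ins, and the plan shape (a custom scan wrapping an Append over per-chunk index scans) is an assumption about what such a plan might look like.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlanNode:
    """Toy stand-in for a plan node (hypothetical, not the PostgreSQL struct)."""
    tag: str                                  # e.g. "IndexScan", "Append", "CustomScan"
    index_am: Optional[str] = None            # access method label, set on index scans
    lefttree: Optional["PlanNode"] = None
    righttree: Optional["PlanNode"] = None
    subplans: List["PlanNode"] = field(default_factory=list)      # Append/MergeAppend children
    custom_plans: List["PlanNode"] = field(default_factory=list)  # CustomScan children


def plan_has_bm25_indexscan(node: Optional[PlanNode], recurse_custom: bool = True) -> bool:
    """Return True if a BM25 index scan appears anywhere under `node`.

    recurse_custom=False imitates the behavior before the fix, where
    children of a custom scan were never visited.
    """
    if node is None:
        return False
    if node.tag == "IndexScan" and node.index_am == "bm25":
        return True
    if node.tag in ("Append", "MergeAppend"):
        if any(plan_has_bm25_indexscan(c, recurse_custom) for c in node.subplans):
            return True
    if node.tag == "CustomScan" and recurse_custom:
        # The added case: descend into the custom scan's child plans.
        if any(plan_has_bm25_indexscan(c, recurse_custom) for c in node.custom_plans):
            return True
    return (plan_has_bm25_indexscan(node.lefttree, recurse_custom)
            or plan_has_bm25_indexscan(node.righttree, recurse_custom))


# Assumed plan shape: Limit -> CustomScan -> Append -> per-chunk BM25 index scans.
chunk_scans = [PlanNode("IndexScan", index_am="bm25") for _ in range(2)]
plan = PlanNode(
    "Limit",
    lefttree=PlanNode("CustomScan", custom_plans=[PlanNode("Append", subplans=chunk_scans)]),
)

found_before_fix = plan_has_bm25_indexscan(plan, recurse_custom=False)
found_after_fix = plan_has_bm25_indexscan(plan, recurse_custom=True)
```

In this model the walker only reports the nested index scans once the custom scan's children are visited. The same traversal gap would apply to any function that rewrites expressions while walking the plan, which is why the PR touches both `plan_has_bm25_indexscan()` and `replace_scores_in_plan()`.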
When using standalone BM25 scoring on a hypertable with the parent index
name (e.g., `content <@> to_bm25query('database', 'parent_idx')`), the
code was falling back to child index stats (total_docs, avg_doc_len) but
NOT switching to the child's index relation and segment metadata. This
caused IDF calculation to fail because it was looking up document
frequencies in the parent index's segments (which are empty).
The fix switches to the child index for segment access when falling back
to a child index's state. This ensures that both memtable and segment
lookups use the correct child index.
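The mismatch between where statistics come from and where document frequencies are looked up can be sketched as follows. This is a toy Python model under stated assumptions, not the extension's implementation: `IndexState`, `resolve_for_scoring`, and the `switch_segments` flag are hypothetical names, picking the first child is an arbitrary simplification, and treating a term with no postings as contributing zero is an assumption about how the failure shows up. The IDF expression is the common BM25 form.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class IndexState:
    """Toy index: corpus stats plus a dict standing in for memtable/segment lookups."""
    name: str
    total_docs: int = 0
    avg_doc_len: float = 0.0
    doc_freq: Dict[str, int] = field(default_factory=dict)
    children: List["IndexState"] = field(default_factory=list)


def idf(total_docs: int, df: int) -> float:
    """Common BM25 IDF form; a term with no postings contributes nothing (assumption)."""
    if df == 0:
        return 0.0
    return math.log(1.0 + (total_docs - df + 0.5) / (df + 0.5))


def resolve_for_scoring(index: IndexState, switch_segments: bool = True) -> Tuple[IndexState, IndexState]:
    """Return (stats_source, lookup_source) for standalone scoring.

    switch_segments=False imitates the bug: stats come from a child,
    but document frequencies are still read from the empty parent.
    """
    if index.total_docs > 0 or not index.children:
        return index, index
    child = index.children[0]  # toy choice; the source does not say how a child is picked
    return child, (child if switch_segments else index)


def term_idf(index: IndexState, term: str, switch_segments: bool = True) -> float:
    stats, lookup = resolve_for_scoring(index, switch_segments)
    return idf(stats.total_docs, lookup.doc_freq.get(term, 0))


child_idx = IndexState("chunk_idx", total_docs=100, avg_doc_len=12.0, doc_freq={"database": 10})
parent_idx = IndexState("parent_idx", children=[child_idx])  # parent holds no documents

idf_buggy = term_idf(parent_idx, "database", switch_segments=False)
idf_fixed = term_idf(parent_idx, "database", switch_segments=True)
```

With the lookup left on the parent, the term is never found and its IDF collapses to zero even though the child's `total_docs` is non-zero. Switching both the statistics and the lookup target to the same child keeps the two inputs to the IDF calculation consistent.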
Also adds:
- Test 5 in partitioned.sql for MergeAppend score expression replacement
- test/scripts/hypertable.sh for optional TimescaleDB integration testing
- Install TimescaleDB for PG 17 and 18 in CI
- Configure shared_preload_libraries for both system and test instances
- Add hypertable.sh to shell-based tests

The test gracefully skips if TimescaleDB installation fails for a specific PG version.
Add WHERE id <= 100 filter to the top-k query. With 150K documents, many have identical scores (same i % 15 pattern), making the result order non-deterministic. Limiting to the first 100 IDs ensures consistent results.
The system PostgreSQL config file path didn't exist. Instead, create a dedicated PostgreSQL instance for shell tests with TimescaleDB configured in shared_preload_libraries.
tjgreen42 added a commit that referenced this pull request Jan 25, 2026
## Summary

Backport of #168 to the 0.4.2 release branch. Fixes zero scores when querying hypertables with BM25 indexes:

1. **Planner hook fix for CustomScan** - Add `T_CustomScan` handling in `plan_has_bm25_indexscan()` to detect BM25 index scans nested inside custom scans (e.g., TimescaleDB's ConstraintAwareAppend)
2. **Standalone scoring fix for hypertable parent indexes** - When using standalone BM25 scoring with a hypertable parent index name, the code was falling back to child index stats but NOT switching to the child's index relation and segment metadata

## Testing

Cherry-picked from main where it passed all CI checks.
Summary
This PR fixes zero scores when querying hypertables with BM25 indexes.
Commit 1: Planner hook fix for CustomScan
- Add `T_CustomScan` handling in `plan_has_bm25_indexscan()` to detect BM25 index scans nested inside custom scans (e.g., TimescaleDB's ConstraintAwareAppend)
- Add `T_CustomScan` handling in `replace_scores_in_plan()` to replace score expressions in custom scan children
Commit 2: Standalone scoring fix for hypertable parent indexes
Testing