feat: reduce build chatter for partitioned tables #214

Merged
tjgreen42 merged 1 commit into main from reduce-build-chatter
Feb 19, 2026

@tjgreen42 commented Feb 9, 2026

Summary

  • Collapse per-partition NOTICE output into a single summary when building BM25 indexes on partitioned tables
  • Non-partitioned tables are unaffected (same output as before)
  • Simplify "completed" messages to not repeat config already shown in "Using ..." lines
  • Aggregate Postgres core's "word is too long" warnings into a single count

Before (200-partition table, ~800 lines of output):

NOTICE:  BM25 index build started for relation part_0_content_idx
NOTICE:  Using text search configuration: english
NOTICE:  Using index options: k1=1.20, b=0.75
NOTICE:  BM25 index build completed: 1 documents, avg_length=3.00, text_config='english' (k1=1.20, b=0.75)
NOTICE:  BM25 index build started for relation part_1_content_idx
NOTICE:  Using text search configuration: english
NOTICE:  Using index options: k1=1.20, b=0.75
NOTICE:  BM25 index build completed: 1 documents, avg_length=3.00, text_config='english' (k1=1.20, b=0.75)
... (796 more lines)

After (4 lines):

NOTICE:  BM25 index build started for relation part_0_content_idx
NOTICE:  Using text search configuration: english
NOTICE:  Using index options: k1=1.20, b=0.75
NOTICE:  BM25 index build completed: 200 documents across 200 partitions, avg_length=3.00

Approach

  • ProcessUtility_hook in mod.c detects CREATE INDEX ... USING bm25 and wraps the build with progress tracking (tp_build_progress_begin/end)
  • emit_log_hook in mod.c intercepts "word is too long" NOTICEs during active builds, counting them for a single aggregated warning
  • tp_build() shows "started" and config NOTICEs only for the first partition, accumulates stats across all partitions, and emits one summary at the end
  • REINDEX, VACUUM FULL, and other non-CREATE-INDEX paths are unaffected
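
The "word is too long" aggregation described above can be pictured with a small standalone model. This is only a sketch: `BuildLogFilter` and `filter_should_emit` are hypothetical names, not the code in mod.c, and the real hook works on Postgres error data rather than plain strings. It shows the idea of swallowing matching messages while a build is active and keeping a count for one aggregated warning at the end.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the build-progress state; not the extension's real struct. */
typedef struct BuildLogFilter
{
    bool active;           /* true while a CREATE INDEX ... USING bm25 is running */
    int  long_word_count;  /* number of swallowed "word is too long" messages */
} BuildLogFilter;

/*
 * Decide whether a log message should still be emitted.
 * Returns false when the message was swallowed and counted instead.
 */
static bool
filter_should_emit(BuildLogFilter *f, const char *message)
{
    assert(f != NULL && message != NULL);

    /* Only intercept during an active build; everything else passes through. */
    if (f->active && strstr(message, "word is too long") != NULL)
    {
        f->long_word_count++;
        return false;
    }
    return true;
}
```

Once the build ends, the caller would report `long_word_count` in a single message and reset the filter. Matching on a substring is an assumption made for brevity here.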

Testing

  • partitioned_many.out reduced from ~848 lines to ~51 lines
  • "completed" messages simplified across all tests (config info no longer repeated)

@tjgreen42 force-pushed the reduce-build-chatter branch from 06bb4c6 to 4577186 on February 9, 2026 23:43
@tjgreen42 force-pushed the reduce-build-chatter branch 2 times, most recently from 60b6d79 to c03e327, on February 19, 2026 01:35
@tjgreen42 marked this pull request as ready for review on February 19, 2026 16:43

When CREATE INDEX targets a partitioned table, Postgres calls tp_build()
once per partition. Previously this emitted "started", config, and
"completed" NOTICEs for every partition -- producing hundreds of lines
for tables with many partitions.

Add a ProcessUtility hook that detects CREATE INDEX ... USING bm25 and
wraps the entire operation in a build-progress tracker. The tracker:
- Shows "started" and config NOTICEs only for the first partition
- Accumulates doc counts and avg_length across all partitions
- Emits a single "completed" summary with "across N partitions"

Non-partitioned tables are unaffected (same output as before).

Also simplify "completed" messages to not repeat text_config/k1/b
parameters that are already shown in the "Using ..." lines above.
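
A minimal sketch of the accumulation step, assuming the tracker keeps a running document count and a running sum of document lengths so that the final `avg_length` is weighted by partition size. The struct and function names (`BuildProgress`, `progress_add_partition`, `progress_format_summary`) are hypothetical, the weighting is an assumption, and the single-table message format is guessed from the "After" output; none of this is taken from the extension's source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical tracker state; the real build-progress struct may differ. */
typedef struct BuildProgress
{
    int      partitions;  /* number of per-partition builds seen so far */
    uint64_t total_docs;  /* documents indexed across all partitions */
    uint64_t total_len;   /* sum of document lengths across all partitions */
} BuildProgress;

/* Fold one partition's statistics into the running totals. */
static void
progress_add_partition(BuildProgress *p, uint64_t docs, uint64_t len_sum)
{
    assert(p != NULL);
    p->partitions++;
    p->total_docs += docs;
    p->total_len += len_sum;
}

/* Average length weighted by document count (assumed, not confirmed). */
static double
progress_avg_length(const BuildProgress *p)
{
    return p->total_docs > 0
        ? (double) p->total_len / (double) p->total_docs
        : 0.0;
}

/* Write the single summary line; mention partitions only when there are several. */
static int
progress_format_summary(const BuildProgress *p, char *buf, size_t buflen)
{
    if (p->partitions > 1)
        return snprintf(buf, buflen,
                        "BM25 index build completed: %llu documents across %d partitions, avg_length=%.2f",
                        (unsigned long long) p->total_docs, p->partitions,
                        progress_avg_length(p));
    return snprintf(buf, buflen,
                    "BM25 index build completed: %llu documents, avg_length=%.2f",
                    (unsigned long long) p->total_docs, progress_avg_length(p));
}
```

Keeping a sum of lengths rather than averaging the per-partition averages avoids skew when partitions hold very different numbers of documents.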

@tjgreen42 force-pushed the reduce-build-chatter branch from 9552405 to 4d08c7f on February 19, 2026 16:51
@tjgreen42 merged commit 77d4736 into main on Feb 19, 2026
26 of 28 checks passed
@tjgreen42 deleted the reduce-build-chatter branch on February 19, 2026 17:23
tjgreen42 added a commit that referenced this pull request Feb 25, 2026
## Summary

- Fix a TOCTOU race condition in `tp_leader_process_buffers()` that can
silently lose documents during parallel index builds. The leader
collected ready buffers *before* checking `workers_done`, allowing a
window where a worker marks its final buffer READY and increments
`workers_done` between the two reads — causing the leader to exit
without processing the final buffer.
- Fix a latent deadlock where `shared->nworkers` was set to the
*requested* worker count but never updated to the *actually launched*
count. If Postgres launched fewer workers, `workers_done` could never
reach `nworkers`.
- Update expected test output for the build-completed NOTICE format
change from #214.
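
The `nworkers` deadlock from the second bullet can be illustrated with a tiny model. `BuildShared` and the helper functions below are hypothetical simplifications, not the extension's shared-memory layout. In Postgres itself the launched count is available as `nworkers_launched` on the `ParallelContext` after `LaunchParallelWorkers()`; whether the fix reads it exactly that way is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of the shared build state. */
typedef struct BuildShared
{
    int nworkers;      /* how many workers the leader waits for */
    int workers_done;  /* how many workers have reported completion */
} BuildShared;

/* Initial setup uses the requested count, as in the pre-fix behavior. */
static void
shared_init(BuildShared *s, int requested)
{
    s->nworkers = requested;
    s->workers_done = 0;
}

/* The fix: overwrite the requested count with the number actually launched. */
static void
shared_set_launched(BuildShared *s, int launched)
{
    assert(launched >= 0 && launched <= s->nworkers);
    s->nworkers = launched;
}

/* Called by each worker when it finishes. */
static void
worker_mark_done(BuildShared *s)
{
    s->workers_done++;
}

/* Leader's termination test. */
static bool
all_workers_done(const BuildShared *s)
{
    return s->workers_done >= s->nworkers;
}
```

With 4 workers requested and 2 launched, `all_workers_done` stays false forever unless `shared_set_launched(s, 2)` runs after launch.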

### The race

```
Leader:                              Worker (finishing):
                                     1. Mark final buffer READY
                                     2. SpinLock → workers_done++
1. Collect buffers → num_ready=0
   (missed the READY buffer)
2. Check all_done → true
3. num_ready(0) > 0 → false → skip
4. while(!all_done) → EXIT
   *** FINAL BUFFER NEVER PROCESSED ***
```

### The fix

Check `workers_done` *first* (under SpinLock, providing the memory
barrier), *then* collect buffers. Workers mark their final buffer READY
before taking the SpinLock around `workers_done++`, so once the leader
has read `workers_done` under the SpinLock and seen every worker
finished, all final buffers are guaranteed visible. The loop now exits
only when `all_done && num_ready == 0`.
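
To make the ordering concrete, here is a deterministic single-threaded model of the two loop shapes. It is a sketch under stated assumptions: `SharedState`, `worker_finish`, `run_leader_buggy`, and `run_leader_fixed` are hypothetical names, the worker's finishing step is injected through a callback between the leader's two reads, and the SpinLock and memory-barrier details of the real code are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the shared state; plain ints instead of locked fields. */
typedef struct SharedState
{
    int ready_buffers;  /* buffers marked READY and not yet collected */
    int workers_done;   /* workers that have finished */
    int nworkers;       /* workers the leader waits for */
    int processed;      /* buffers the leader has processed */
} SharedState;

/* Step injected between the leader's two reads to model the interleaving. */
typedef void (*Interleave)(SharedState *);

/* Worker finishing: mark the final buffer READY, then bump workers_done. */
static void
worker_finish(SharedState *s)
{
    s->ready_buffers++;
    s->workers_done++;
}

/* Old order: collect buffers first, check workers_done second. */
static int
run_leader_buggy(SharedState *s, Interleave between)
{
    bool all_done;

    assert(s != NULL);
    do
    {
        int num_ready = s->ready_buffers;  /* collect */
        s->ready_buffers = 0;

        if (between != NULL)               /* worker finishes right here */
        {
            between(s);
            between = NULL;
        }

        all_done = (s->workers_done >= s->nworkers);
        if (num_ready > 0)
            s->processed += num_ready;
    } while (!all_done);

    return s->processed;
}

/* New order: check workers_done first, then collect; exit only when drained. */
static int
run_leader_fixed(SharedState *s, Interleave between)
{
    assert(s != NULL);
    for (;;)
    {
        bool all_done = (s->workers_done >= s->nworkers);

        if (between != NULL)               /* same interleaving point */
        {
            between(s);
            between = NULL;
        }

        int num_ready = s->ready_buffers;  /* collect after the check */
        s->ready_buffers = 0;
        s->processed += num_ready;

        if (all_done && num_ready == 0)
            break;
    }
    return s->processed;
}
```

Starting from one worker with no buffers ready and passing `worker_finish` as the interleaving step, the buggy loop returns 0 processed buffers and the fixed loop returns 1. The model only captures the ordering argument; it says nothing about visibility across CPUs, which is what the SpinLock provides in the real code.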

### Impact

On MS-MARCO (8.8M passages), this race non-deterministically lost ~4M
documents (~45% of the corpus). One CI run indexed only 4,834,444
documents while the same data loaded correctly in another run (8,841,770
documents). The root cause was discovered while investigating benchmark
validation failures in #239.

## Testing

Regression tests pass (47/47). The race is non-deterministic so it
cannot be reliably reproduced in unit tests — validation should be done
via the MS-MARCO benchmark pipeline which exercises parallel builds at
scale.