Issue #5023: Window Radix Partitions#5909
Conversation
hawkfish
commented
Jan 13, 2023
- Switch to using Laurens' radix partitioned column data collections.
- Avoid recreating the RowDataCollectionScanner by adding a Reset method.
Switch to using Laurens' radix partitioned column data collections.
Avoid recreating the RowDataCollectionScanner by adding a Reset method. Tidy up some bits of code.
Adapt to the incoming data sizes instead of guessing badly and falling back to 1024 buckets. This was causing some performance regressions in TPC-DS.
Make test deterministic.
lnkuiper
left a comment
There was a problem hiding this comment.
Very nice performance improvements! I am happy to see that PartitionedColumnData is being put to use. The code looks great.
In general, I really like the new flow of the Window operator, although I don't understand how the sorting is parallelized, and how large the sorted runs are.
What I got from this so far is that during Sink the data is collected in PartitionedColumnData, and during Combine these are combined.
This is great because you can now start Finalize/GetData knowing exactly how much data you have, and how it is distributed across hash groups.
I wonder if it's better to always go for 1024 hash groups, and combine them if they're too small. This prevents re-materializing the already materialized data. However, this will slow down the appends slightly, so it's a trade-off.
However, this is very flexible. You can merge partitions until their combined size is greater than some threshold, regardless of radix bits. For example, if you merge based on radix bits, e.g., going from 4 to 3, then you will merge 0101 and 1101. However, if you always over-partition, and just merge partitions together how you see fit, you can merge 0101 with 0001, if that results in more balanced hash groups. Let's discuss this later! :)
* Use partition size computer * Use ColumnDataConsumer to release data faster * Acquiesce to clangd whinging.
lnkuiper
left a comment
There was a problem hiding this comment.
Thanks for the changes! I think this is ready to go
