Closed
Labels
enhancement: New feature or request
Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
The new sort-preserving merge operator, introduced in #379, likely has room for performance improvement.
Describe the solution you'd like
- Create a benchmark for the merging operator
- Optimize the operator as the benchmark results warrant
Here is a suggestion from @jhorstmann (https://github.com/apache/arrow-datafusion/pull/379/files#r637948151), filed as a separate ticket so it doesn't get lost:
For a larger number of partitions, storing the cursors in a BinaryHeap, sorted by their current item, would be beneficial.
A Rust implementation of that approach can be seen in this blog post and the first comment under it. I have implemented the same approach in Java before. I agree with @alamb, though, that we should make it work first and optimize later.
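To make the suggestion concrete, here is a minimal sketch of a heap-based k-way merge. It is not the operator from #379 and not the code from the blog post: `merge_sorted_partitions` is a hypothetical name, and plain `Vec<i64>` partitions stand in for the operator's real streams and cursors. With `k` partitions and `n` total rows, picking the next row costs `O(log k)` per row instead of the `O(k)` a linear scan over all cursors would need.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Merge already-sorted partitions into one sorted vector.
/// Hypothetical helper for illustration only; the real operator works on
/// streams of record batches, not on in-memory `Vec<i64>` partitions.
fn merge_sorted_partitions(partitions: &[Vec<i64>]) -> Vec<i64> {
    let total: usize = partitions.iter().map(Vec::len).sum();
    let mut output = Vec::with_capacity(total);

    // Each heap entry is a cursor: (current value, partition index, offset).
    // `BinaryHeap` is a max-heap, so `Reverse` turns it into a min-heap.
    // Ties on the value fall back to the partition index, so equal values
    // come out in partition order.
    let mut heap = BinaryHeap::with_capacity(partitions.len());
    for (idx, partition) in partitions.iter().enumerate() {
        if let Some(&first) = partition.first() {
            heap.push(Reverse((first, idx, 0usize)));
        }
    }

    // Pop the smallest current item, emit it, then advance that cursor
    // and push it back if its partition is not exhausted.
    while let Some(Reverse((value, idx, offset))) = heap.pop() {
        output.push(value);
        let next = offset + 1;
        if let Some(&next_value) = partitions[idx].get(next) {
            heap.push(Reverse((next_value, idx, next)));
        }
    }

    output
}
```

Calling `merge_sorted_partitions(&[vec![1, 4, 7], vec![2, 5, 8], vec![3, 6, 9]])` yields the values 1 through 9 in order. Whether this beats a linear scan for a small number of partitions is exactly what the proposed benchmark would need to show.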
tustvold