-
Notifications
You must be signed in to change notification settings - Fork 1.4k
feat: Add NonMaterializedSortBuffer #15157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
✅ Deploy Preview for meta-velox canceled.
|
27dc9b0 to
ef59320
Compare
16f988c to
606779e
Compare
xiaoxmeng
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@duanmeng LGTM % comments. Thanks!
|
|
||
| int32_t outputRow = 0; | ||
| int32_t outputSize = 0; | ||
| bool isEndOfBatch = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/isEndOfBatch/endOfBatch/
| int32_t outputRow = 0; | ||
| int32_t outputSize = 0; | ||
| bool isEndOfBatch = false; | ||
| while (outputRow + outputSize < output_->size()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/outputSize/bufferedOutputRows/
|
|
||
| void HybridSortBuffer::prepareOutputWithSpill() { | ||
| VELOX_CHECK(hasSpilled()); | ||
| if (inputSpiller_ != nullptr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if () {
...
return;
}
...
| outputSpillPartitionSet_.begin()->second->createUnorderedReader( | ||
| spillConfig_->readBufferSize, pool(), spillStats_); | ||
| } | ||
| outputSpillPartitionSet_.clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this belong to the else condition?
468ca20 to
027a584
Compare
xiaoxmeng
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@duanmeng LGTM % minors. Thanks!
a2704bc to
0b7b165
Compare
|
@duanmeng : Thanks for this PR. Please can you explain the motivation for this change. |
@aditi-pandit Hi Aditi, the |
Thanks for your explanation. I got the infrastructure pieces. Which operators are we planning to enhance with this ? Do you have some early results to share ? |
Add a NonMaterializedSortBuffer, which materializes only the sort key columns
and additional index columns into a RowContainer for sorting. After the sort, it
uses these indices to gather and copy the corresponding rows from the original
input vectors into the output vector. This reduces the overhead of materializing
payload columns into the RowContainer, which is especially beneficial for wide
table scenarios. Support for spilling will be added in a follow-up PR.