ARROW-16852: [C++] Migrate remaining kernels to use ExecSpan, remove ExecBatchIterator #13630
The only CI failures are the linkage issues related to protobuf/otel
pitrou
left a comment
Very nice, just a couple minor comments.
Fix some more stuff
Remove unused scalar path
More refactoring
Get everything compiling again
Fix some more bugs
Fix another bug
Remove dead code
Fix ArraySpan add/set offset logic
Cleaner ArraySpan slicing logic
Revert some files
b182b8c to df1a92e
Thanks for the review. Will merge when CI green
Merging. The CI issues look like assorted flakiness to me
Benchmark runs are scheduled for baseline = c445243 and contender = 4d931ff. 4d931ff is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
This completes the porting to use ExecSpan everywhere. I also changed the ExecBatchIterator benchmarks to use ExecSpan to show the performance improvement in input splitting that we've talked about in the past:
Splitting inputs into small ExecSpan:
Splitting inputs into small ExecBatch:
Because the input in this benchmark has 1M elements, this shows that splitting into 1024 chunks of size 1024 adds only 0.2ms of overhead with ExecSpanIterator versus 17.16ms of overhead with ExecBatchIterator (> 80x improvement).
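To illustrate where a gap like this can come from, here is a minimal sketch that does not use Arrow at all. `Int64Span`, `Int64Slice`, `SplitIntoSpans` and `SplitIntoSlices` are hypothetical names invented for this example, and the sketch assumes the per-chunk cost is dominated by heap allocation and reference counting; it is not the actual ExecSpanIterator or ExecBatchIterator code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical non-owning view, in the spirit of a span type.
// It only points at existing memory and never owns or ref-counts it.
struct Int64Span {
  const int64_t* data = nullptr;
  std::size_t offset = 0;
  std::size_t length = 0;
};

// Hypothetical owning slice, in the spirit of a sliced batch.
// Each slice holds a shared_ptr to the underlying buffer.
struct Int64Slice {
  std::shared_ptr<const std::vector<int64_t>> buffer;
  std::size_t offset = 0;
  std::size_t length = 0;
};

// Span-style splitting: a single stack-allocated view is reused, and only
// its offset and length change per chunk. Returns the number of chunks.
template <typename Visitor>
std::size_t SplitIntoSpans(const std::vector<int64_t>& values,
                           std::size_t chunk_size, Visitor&& visit) {
  assert(chunk_size > 0);
  Int64Span span{values.data(), 0, 0};
  std::size_t num_chunks = 0;
  for (std::size_t pos = 0; pos < values.size(); pos += chunk_size) {
    span.offset = pos;
    span.length = std::min(chunk_size, values.size() - pos);
    visit(span);
    ++num_chunks;
  }
  return num_chunks;
}

// Batch-style splitting: every chunk is a new heap-allocated slice, and
// copying the shared_ptr bumps an atomic reference count each time.
std::vector<std::shared_ptr<Int64Slice>> SplitIntoSlices(
    const std::shared_ptr<const std::vector<int64_t>>& values,
    std::size_t chunk_size) {
  assert(chunk_size > 0);
  std::vector<std::shared_ptr<Int64Slice>> slices;
  for (std::size_t pos = 0; pos < values->size(); pos += chunk_size) {
    auto slice = std::make_shared<Int64Slice>();  // one allocation per chunk
    slice->buffer = values;                       // reference count increment
    slice->offset = pos;
    slice->length = std::min(chunk_size, values->size() - pos);
    slices.push_back(std::move(slice));
  }
  return slices;
}
```

Both functions produce the same chunk boundaries; the difference is that the span version does no allocation inside the loop, which is the kind of fixed per-chunk cost that becomes visible when chunks are as small as 1024 elements.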
This won't do much by itself to improve performance in Acero, but the following are things for the community to explore in the future (the work I've been doing has been a precondition for considering them):