-
Notifications
You must be signed in to change notification settings - Fork 4k
GH-40207: [C++] TakeCC: Concatenate only once and delegate to TakeAA instead of TakeCA #40206
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
|
|
+1. Thanks! |
|
@js8544 thank you for the review. I pushed a revert for the last commit because it isn't necessary for this PR and might conflict with a PR someone else is working on. If not one objects, I will probably merge this right before I'm ready to send more Take-related PRs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not add at least "ChunkedChunked" variations for each of these types as well? (FSB, FSL, String). I'm assuming the performance characteristics might be different?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
("ChunkedFlat" is less interesting arguably)
This comment was marked as outdated.
This comment was marked as outdated.
|
@felipecrv Would you like to rebase if not already done? |
Soon. I'm currently working on top of this branch locally 😅 |
…eCAC ...which would concatenate values on every loop iteration.
- Use args.size correctly (no division by sizeof(int64_t) - More chunks: 100 instead of just 10 - Keep the number of indices in the chunked version equal to the number of items (just like the non-chunked benchmarks) - Variations without chunking of the indices
|
After merging your PR, Conbench analyzed the 7 benchmarking runs that have been run so far on merge-commit 2c13a19. There were 2 benchmark results with an error:
There were 13 benchmark results indicating a performance regression:
The full Conbench report has more details. It also includes information about 5 possible false positives for unstable benchmarks that are known to sometimes produce them. |


Rationale for this change
takeconcatenates chunks when it's applied to a chunkedvaluesarray, but when theindicesarrays is alsochunkedit concatenatesvaluesmore than once -- oneConcatenatecall withvalues.chunks()for every chunk inindices. This PR doesn't remove the concatenation, but ensures it's done only once instead ofindices.size()times.What changes are included in this PR?
TakeXXnames (->TakeXXY) to makes code easier to understandTakeCCC— copied from ARROW-9773: [C++] Implement Take kernel for ChunkedArray #13857Are these changes tested?
By existing tests.
Are there any user-facing changes?
A faster compute kernel.