GH-40207: [C++] TakeCC: Concatenate only once and delegate to TakeAA instead of TakeCA #40206

felipecrv · 2024-02-23T03:14:49Z

Rationale for this change

take concatenates the chunks when it is applied to a chunked values array, but when the indices array is also chunked it concatenates the values more than once: one Concatenate call over values.chunks() for every chunk in indices. This PR doesn't remove the concatenation, but ensures it's done only once instead of indices.num_chunks() times.
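To make the cost difference concrete, here is a minimal sketch of the two strategies. It does not use the Arrow API: `Array`, `ChunkedArray`, `Concatenate`, `TakeCCC_Before`, `TakeCCC_After` and the call counter `g_concat_calls` are hypothetical stand-ins built on `std::vector`, and the function bodies are a simplified model of the behavior described above, not the actual kernel code.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-ins for arrow::Array and arrow::ChunkedArray.
using Array = std::vector<int64_t>;
using ChunkedArray = std::vector<Array>;

// Counts Concatenate calls so the two strategies can be compared.
static int g_concat_calls = 0;

// Flattens all chunks into one contiguous array (the expensive step).
Array Concatenate(const ChunkedArray& chunks) {
  ++g_concat_calls;
  Array out;
  for (const Array& chunk : chunks) {
    out.insert(out.end(), chunk.begin(), chunk.end());
  }
  return out;
}

// Array values, Array indices -> Array result.
Array TakeAAA(const Array& values, const Array& indices) {
  Array out;
  out.reserve(indices.size());
  for (int64_t i : indices) {
    out.push_back(values[static_cast<std::size_t>(i)]);
  }
  return out;
}

// Chunked values, Array indices -> Chunked result.
// Concatenates the values on every call.
ChunkedArray TakeCAC(const ChunkedArray& values, const Array& indices) {
  return {TakeAAA(Concatenate(values), indices)};
}

// Modeled "before" behavior: delegating to TakeCAC inside the loop
// triggers one Concatenate per chunk of indices.
ChunkedArray TakeCCC_Before(const ChunkedArray& values,
                            const ChunkedArray& indices) {
  ChunkedArray out;
  for (const Array& idx : indices) {
    ChunkedArray partial = TakeCAC(values, idx);
    out.push_back(std::move(partial[0]));
  }
  return out;
}

// Modeled "after" behavior: concatenate once, then delegate to TakeAAA.
ChunkedArray TakeCCC_After(const ChunkedArray& values,
                           const ChunkedArray& indices) {
  const Array flat = Concatenate(values);
  ChunkedArray out;
  out.reserve(indices.size());
  for (const Array& idx : indices) {
    out.push_back(TakeAAA(flat, idx));
  }
  return out;
}
```

In this model both functions return the same chunks; only the number of `Concatenate` calls differs (one per indices chunk versus one in total).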

What changes are included in this PR?

Adding the return type to the TakeXX names (-> TakeXXY) to make the code easier to understand
Adding benchmarks for TakeCCC, copied from ARROW-9773: [C++] Implement Take kernel for ChunkedArray #13857
Removing the concatenation from the loop body (!)

Are these changes tested?

By existing tests.

Are there any user-facing changes?

A faster compute kernel.

GitHub Issue: [C++] TakeCC is doing indices.num_chunks() Concatenate() calls when it could be doing only one #40207

github-actions · 2024-02-23T03:17:06Z

⚠️ GitHub issue #40207 has been automatically assigned in GitHub to PR creator.

felipecrv · 2024-02-23T03:17:58Z

Items per second improved a lot. Higher is better.

EDIT: these numbers come from the initial version of the benchmarks (10 chunks per array and a 10x bigger indices array relative to values).

felipecrv · 2024-02-24T16:27:07Z

Making chunking the only difference between the benchmarks (the array is split into 100 chunks) makes the differences more evident.

js8544 · 2024-02-27T03:32:44Z

+1. Thanks!

felipecrv · 2024-02-27T13:25:44Z

@js8544 thank you for the review. I pushed a revert for the last commit because it isn't necessary for this PR and might conflict with a PR someone else is working on.

If no one objects, I will probably merge this right before I'm ready to send more Take-related PRs.

pitrou · 2024-02-27T13:32:08Z

cpp/src/arrow/compute/kernels/vector_selection_benchmark.cc

Why not add at least "ChunkedChunked" variations for each of these types as well? (FSB, FSL, String). I'm assuming the performance characteristics might be different?

("ChunkedFlat" is less interesting arguably)

pitrou · 2024-02-27T13:46:08Z

@felipecrv Would you like to rebase if not already done?

felipecrv · 2024-02-27T18:51:13Z

> @felipecrv Would you like to rebase if not already done?

Soon. I'm currently working on top of this branch locally 😅

- Use args.size correctly (no division by sizeof(int64_t))
- More chunks: 100 instead of just 10
- Keep the number of indices in the chunked version equal to the number of items (just like the non-chunked benchmarks)
- Variations without chunking of the indices
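The chunked benchmark inputs described above (the same number of items as the flat case, split into 100 chunks) could be produced by a helper along these lines. This is only a sketch: `SplitIntoChunks` and the `std::vector`-based `Array` and `ChunkedArray` aliases are hypothetical and are not taken from vector_selection_benchmark.cc.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for arrow::Array and arrow::ChunkedArray.
using Array = std::vector<int64_t>;
using ChunkedArray = std::vector<Array>;

// Hypothetical helper: split `flat` into `num_chunks` nearly equal chunks
// without changing the total number of items, so that chunking is the
// only difference from the non-chunked input.
ChunkedArray SplitIntoChunks(const Array& flat, std::size_t num_chunks) {
  ChunkedArray out;
  if (num_chunks == 0) {
    return out;  // Nothing sensible to produce; avoid dividing by zero.
  }
  out.reserve(num_chunks);
  const std::size_t base = flat.size() / num_chunks;
  const std::size_t extra = flat.size() % num_chunks;
  std::size_t offset = 0;
  for (std::size_t i = 0; i < num_chunks; ++i) {
    // The first `extra` chunks take one additional item each.
    const std::size_t len = base + (i < extra ? 1 : 0);
    const auto first = flat.begin() + static_cast<std::ptrdiff_t>(offset);
    out.emplace_back(first, first + static_cast<std::ptrdiff_t>(len));
    offset += len;
  }
  return out;
}
```

Applying the same helper to both the values and the indices gives a ChunkedChunked input, while applying it to the values alone gives the variation without chunked indices.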

conbench-apache-arrow · 2024-02-28T19:06:35Z

After merging your PR, Conbench analyzed the 7 benchmarking runs that have been run so far on merge-commit 2c13a19.

There were 2 benchmark results with an error:

Commit Run on ursa-i9-9960x at 2024-02-28 12:08:00Z
- tpch (R) with engine=arrow, format=native, language=R, memory_map=False, query_id=TPCH-12, scale_factor=10
- tpch (R) with engine=arrow, format=native, language=R, memory_map=False, query_id=TPCH-05, scale_factor=10

There were 13 benchmark results indicating a performance regression:

Commit Run on ursa-i9-9960x at 2024-02-28 12:08:00Z
- tpch (R) with engine=arrow, format=native, language=R, memory_map=False, query_id=TPCH-20, scale_factor=1
- tpch (R) with engine=arrow, format=parquet, language=R, memory_map=False, query_id=TPCH-01, scale_factor=1
and 11 more (see the report linked below)

The full Conbench report has more details. It also includes information about 5 possible false positives for unstable benchmarks that are known to sometimes produce them.

github-actions bot added the Component: C++ and awaiting review labels Feb 23, 2024

felipecrv changed the title ~~Take ccc~~ GH-40207: [C++] TakeCC: Concatenate only once and delegate to TakeAA instead of TakeCA Feb 23, 2024

apache deleted a comment from github-actions bot Feb 23, 2024

felipecrv requested review from js8544 and pitrou February 23, 2024 15:39

js8544 approved these changes Feb 27, 2024

github-actions bot added the awaiting committer review label and removed the awaiting review label Feb 27, 2024

pitrou reviewed Feb 27, 2024

pitrou reviewed Feb 27, 2024

pitrou reviewed Feb 27, 2024

pitrou reviewed Feb 27, 2024

felipecrv added 5 commits February 27, 2024 18:45

Also encode the return type of Take*() specializations

c7ca222

TakeChunked benchmarks from apache#13857

367ff9c

TakeCCC: Concatenate only once and delegate to TakeAAA instead of Tak…

a9a8304

…eCAC ...which would concatenate values on every loop iteration.

benchmarks: Rename TakeChunked to TakeChunkedChunked

140e255

felipecrv force-pushed the take_ccc branch from dda9d78 to 40d585b Compare February 27, 2024 21:46

felipecrv added 3 commits February 27, 2024 19:06

Fix benchmarks after code review

954919c

Consistently order the take benchmarks definitions and registrations

2486e99

Add more benchmarks for Take()

8643da4

pitrou approved these changes Feb 28, 2024

pitrou merged commit 2c13a19 into apache:main Feb 28, 2024

pitrou removed the awaiting committer review label Feb 28, 2024

pitrou mentioned this pull request Feb 28, 2024

[C++] TakeCC is doing indices.num_chunks() Concatenate() calls when it could be doing only one #40207

Closed

github-actions bot added the awaiting committer review label Feb 28, 2024

felipecrv deleted the take_ccc branch February 28, 2024 12:50

pitrou mentioned this pull request Feb 28, 2024

GH-39565: [C++] Do not concatenate ChunkedArray when running take function #39566

Closed
