GH-39398: [C++][Parquet] DNM: benchmark for readLevels#39486
GH-39398: [C++][Parquet] DNM: benchmark for readLevels#39486mapleFU wants to merge 1 commit intoapache:mainfrom
Conversation
|
Please not merge this patch, this is just for benchmark |
| for (auto _ : state) { | ||
| state.PauseTiming(); | ||
| Int32Reader* reader = helper.ResetColumnReader(); | ||
| [[maybe_unused]] bool v = reader->HasNext(); |
There was a problem hiding this comment.
Using hasNext to trigger initialization
| const auto repetition = static_cast<Repetition::type>(state.range(0)); | ||
| const auto batch_size = static_cast<int64_t>(state.range(1)); | ||
|
|
||
| BenchmarkHelper helper(repetition, /*num_pages=*/1, /*levels_per_page=*/16 * 80000); |
There was a problem hiding this comment.
Using one page to make it simple
| int64_t* indices_read, const T** dict, | ||
| int32_t* dict_len) = 0; | ||
|
|
||
| virtual void ReadLevels(int64_t batch_size, int16_t* def_levels, int16_t* rep_levels, |
There was a problem hiding this comment.
What do you think about making this private with a Test Peer that can be used in the benchmark, so we can check this PR in?
|
Thank you @mapleFU it would be great to check this in (without the count changes) so we have a good benchmark for ReadLevels. See comment on how to maybe do it without breaking abstractions. |
|
I guess looking at CI it would take a little more work than this proof of concept to check it in. |
|
It's just a quick poc for ReadLevels optimization... I think exporting it is so hacking, because |
|
@ursabot please benchmark |
|
Benchmark runs are scheduled for commit 0529cce. Watch https://buildkite.com/apache-arrow and https://conbench.ursa.dev for updates. A comment will be posted here when the runs are complete. |
|
Thanks for your patience. Conbench analyzed the 6 benchmarking runs that have been run so far on PR commit 0529cce. There were 3 benchmark results indicating a performance regression:
The full Conbench report has more details. |
|
Emm would regression benchmark un-related..? |
None of them are related to Parquet. |
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?