GH-17211: [C++] Add hash_64 scalar compute function (old) #13487

drin · 2022-06-30T21:49:01Z

We would like to expose hashing functions via new compute functions. This PR would add a scalar compute function, FastHash32, which uses Hashing32::HashMultiColumn to compute a 32-bit hash for each row of the input array (or scalar).

This PR focuses on 32-bit hashing, but I would like to use this PR to help decide if a different design is preferable before adding 64-bit hashing. I also am not sure what should be in the unit test except for code coverage.

Potential design changes:

Using an Options argument.
Using an init function for the kernel (but I don't know the input length until run time).
Changing how hashing is used.

Confusion about unit tests:

I'm not sure how to validate the hash outputs are "as expected". Further, unit tests for the hashing functions don't seem to validate hash outputs.
I'm not sure validating the output data types is necessary.
Code coverage of the various input types seems to be the most important, but I need help to figure out the best way to approach this.

Edit (8/9/22):

This PR has been open awhile but seems to have hit a good level of functionality. This PR does not:

address whether the hash functions are "acceptable" statistically
comprehensively benchmark, or test, the various data types

I think these can be addressed in future improvements:

github-actions · 2022-06-30T21:49:28Z

https://issues.apache.org/jira/browse/ARROW-8991

westonpace · 2022-06-30T21:53:27Z

I'm not sure how to validate the hash outputs are "as expected". Further, unit tests for the hashing functions don't seem to validate hash outputs.

For a hashing function I would expect:

If two values are equal then their hashes are equal
Given a random selection of non-equal values there should be some kind of expected false positive rate (e.g. equal hashes on unequal values). Ideally we would include, as part of this, a benchmark that measures the FPR on random values. You could then take then, pick a safe threshold (e.g. if the benchmark tends to show a 5% FPR then pick 10%) and put that into the unit test (e.g. assert the FPR is less than the safe threshold).

github-actions · 2022-06-30T21:53:52Z

https://issues.apache.org/jira/browse/ARROW-16945

github-actions · 2022-06-30T21:53:55Z

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

westonpace · 2022-06-30T21:54:05Z

@save-buffer would you mind reviewing this when you get a chance?

drin · 2022-06-30T21:54:27Z

I renamed this PR for ARROW-16945, so that I can track just a 32-bit version for now. Not sure if this will confuse some of the automated issue tracking/associations

pitrou · 2022-07-19T09:35:35Z

Given a random selection of non-equal values

I'd add that non-random but likely selections should also show a nice hash distribution. Including:

ranges of consecutive [0, N] numbers
clustered strings (for example a subset of /usr/share/dict/words, or something generated that looks like that)

Also, ideally, the bottom K bits of the hash should also keep those nice statistical properties (so that a hash table implementation can simply use a power-of-two modulo).

The implementations from arrow/util/hashing.h try to satisfy those requirements, btw, so you could reuse them.

pitrou · 2022-07-19T09:35:45Z

Why 32-bit rather than 64-bit?

drin · 2022-07-19T17:49:37Z

Why 32-bit rather than 64-bit?

I think Weston mentioned 32-bit in passing and I just made a semi-arbitrary decision to start there. I plan on following this up with a 64-bit version, though not sure if other things should be addressed before I do so (memory management, etc.).

westonpace · 2022-07-19T22:15:04Z

I'd add that non-random but likely selections should also show a nice hash distribution. Including:
The implementations from arrow/util/hashing.h try to satisfy those requirements, btw, so you could reuse them.

It would be great, by the way, to have benchmarks of the two hashing utilities we have. I believe the utilities we have in key_hash.h trade off some distribution performance in favor of runtime performance when compared with the utilities in hashing.h. It would be nice to have some objective measures of this tradeoff.

Also, can we expand in the function doc that these hashes are not suitable for cryptographic purposes?

pitrou · 2022-07-19T22:49:20Z

There's https://github.com/apache/arrow/blob/master/cpp/src/arrow/util/hashing_benchmark.cc at least

drin · 2022-07-19T23:49:36Z

Thanks @pitrou , I'll start with that as a base and add benchmarks that use the utilities from key_hash.h (I'll see if these also exist elsewhere).

I recently was exposed to hashing.h when looking at the count_distinct function, so I should be able to compare them. Maybe I'll skeleton another compute function, StandardHash (arbitrary, temporary name to distinguish from FastHash), and we can use it to better consider how many hash functions we should have vs what should be knobs available via an Options class.

drin · 2022-07-20T03:27:43Z

rebased

drin · 2022-07-20T18:15:55Z

I wanted to update with initial changes to hashing_benchmark.cc. To make sure comparisons covered at least a few cases, I decided to add FastHash64, although it is currently very copy pasta.

Barring code style/organization improvements, I was hoping to get some feedback on whether these benchmarks are fundamentally what we would like to see. Some questions I have:

how do I ensure I am not including the time to construct an arrow::Array from a vector in the benchmark time?
do we want to time calling the function, or just the mechanisms that the compute function wraps?
- do we want to do both to also capture any overhead in compute kernel invocation?

drin · 2022-07-20T18:18:00Z

Also, I see now that util/hashing.h wraps xxHash, vendored from Cyan4973's github repo. Hashing32 has the following comment above it:

// Implementations are based on xxh3 32-bit algorithm description from:
// https://github.com/Cyan4973/xxHash/blob/dev/doc/xxhash_spec.md

I need to experiment a bit, but I suspect that for int primitives, Hashing32 and Hashing64 just wrap xxHash and add logic for iterating over rows and columns. Specifically, key_hash.cc#L745 looks a lot like hashing.h#L88. hashing_benchmark.cc#L70 benchmarks using 2 multipliers, and I think if it uses just the 1 multiplier then the implementation looks a lot like Hashing64.

All this to say that I think it should be accommodated in the benchmark, but I also wonder if some of the hashing code can be simplified/better organized in case they aren't as independent as they seem at first glance.

cc @michalursa

westonpace · 2022-07-20T19:16:30Z

Here is my understanding, which may not be complete:

util/hashing.h is a more-or-less direct adapter between Arrow and xxhash.

key_hash.cc is based on xxhash but most of the algorithms have been reimplemented to some degree. The key_hash.cc utilities are, in theory, better suited to take advantage of columnar formats and/or vectorized CPUs. However, as part of this rewrite, certain sacrifices were made in hash quality, in favor of performance. This is primarily (only?) used today in Acero for the hash-join and hash-aggregate.

In theory, util/hashing.cc should have a better distribution but worse performance than key_hash.cc.

save-buffer · 2022-07-20T19:46:21Z

Weston is correct, the key_hash interface is only meant for internal use of hash tables. It subtly doesn't match xxh in some cases (iirc it for example can pad with 0's instead of having special handling for arbitrary lengths). The interface is also optimized for hashing in mini batches and reusing as much memory as possible. In general I would suggest using util/hashing instead of key_hash, as key_hash is for internal use; we may want to change it at any time (e.g. we had an idea to see if we can make a really crappy hash that's very fast to compute but is "good enough" for a hash table).

I can leave a review of the current code, but I'd suggest moving away from the key_hash interface overall.

save-buffer

Biggest feedback is to not allocate a large temporary std::vector and instead write directly into the buffer you'll be returning. Also you made your TempVectorStack way too big.

cpp/src/arrow/compute/kernels/scalar_hash.cc

drin · 2022-07-20T21:01:59Z

Biggest feedback is to not allocate a large temporary std::vector and instead write directly into the buffer you'll be returning. Also you made your TempVectorStack way too big.

thanks! I had troubles with smaller sizes and tried to not size it too large, but I'll revisit.

For writing directly into the buffer to return, I assume that means I need to create an ArrayData? is there a better interface via Builders?

drin · 2022-07-20T21:03:11Z

never mind, you actually mentioned how to do it already. the mobile interface hid it

pitrou · 2022-07-21T07:32:06Z

util/hashing.h is a more-or-less direct adapter between Arrow and xxhash.

That's how it ended up working in practice for non-tiny strings, but the basic objective is to have some fast though statistically good hash functions.

ianmcook · 2022-07-21T22:04:41Z

@drin could you please add entries for these functions in the Compute Functions docs (cpp/compute.rst)? Thanks!

drin · 2022-07-22T00:17:17Z

@ursabot please benchmark lang=C++

ursabot · 2022-07-22T00:17:21Z

Benchmark runs are scheduled for baseline = 4d931ff and contender = 3c1fd3b03fa2b143d582244ca1bb93fbc0c84bcf. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Only ['Python'] langs are supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.69% ⬆️0.17%] test-mac-arm
[Skipped ⚠️ Only ['JavaScript', 'Python', 'R'] langs are supported on ursa-i9-9960x] ursa-i9-9960x
[Finished ⬇️0.46% ⬆️0.0%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 3c1fd3b0 test-mac-arm
[Finished] 3c1fd3b0 ursa-thinkcentre-m75q
[Failed] 4d931ff1 ec2-t3-xlarge-us-east-2
[Failed] 4d931ff1 test-mac-arm
[Failed] 4d931ff1 ursa-i9-9960x
[Finished] 4d931ff1 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

drin · 2022-07-22T00:22:58Z

before figuring out the conbench stuff, I put some of the local benchmark results in a google slide:

https://docs.google.com/presentation/d/1cUU_F3jB6LsOLbClhl34YdQiodtbz7l76l3juHTsC5k/edit#slide=id.g13e9d117f47_0_63

Just wanted to put this in as a sneak peek even though the test size is small (10,000 items) and this is only for a couple primitives.

drin · 2022-07-22T16:59:29Z

@ursabot please benchmark lang=C++

ursabot · 2022-07-22T16:59:30Z

Commit 3c1fd3b03fa2b143d582244ca1bb93fbc0c84bcf already has scheduled benchmark runs.

…ark (apache#39794) ### Rationale for this change `ScanOnlyBench` use 2 kind of options to scan, but ignore the function `MakeScanNode` will return 4 extra columns. ### What changes are included in this PR? Change the expected result of bench `ScanOnlyBench` ### Are these changes tested? They covered by existing tests. * Closes: apache#39765 Authored-by: xiansen.chen <xiansen.chen@openpie.com> Signed-off-by: Antoine Pitrou <antoine@python.org>

### Rationale for this change Fix a conversion warning that fails compiling arrow-dataset-file-benchmark on MSVC. ### Are these changes tested? Yes, by CI. ### Are there any user-facing changes? No. Authored-by: Antoine Pitrou <antoine@python.org> Signed-off-by: Sutou Kouhei <kou@clear-code.com>

…ey hash avx2 (apache#39800) ### Rationale for this change Issue apache#39778 seems caused by a careless (but hard to spot) bug in key hash avx2. ### What changes are included in this PR? Fix the careless bug. ### Are these changes tested? UT included. ### Are there any user-facing changes? No. * Closes: apache#39778 Authored-by: Ruoxi Sun <zanmato1984@gmail.com> Signed-off-by: Antoine Pitrou <antoine@python.org>

…to int32 (apache#39528) Be defensive instead of writing invalid data. ### Rationale for this change Users can provide this API pages that are large to validly write and we silently truncate lengths before writing. ### What changes are included in this PR? Add validations and throw an exception if sizes are too large (this was previously checked only if page indexes are being build). ### Are these changes tested? Unit tested ### Are there any user-facing changes? This might start raising exceptions instead of writing out invalid parquet files. Closes apache#39527 **This PR contains a "Critical Fix".** * Closes: apache#39527 Lead-authored-by: emkornfield <emkornfield@gmail.com> Co-authored-by: Micah Kornfield <micahk@google.com> Co-authored-by: mwish <maplewish117@gmail.com> Co-authored-by: Antoine Pitrou <pitrou@free.fr> Co-authored-by: Gang Wu <ustcwg@gmail.com> Signed-off-by: mwish <maplewish117@gmail.com>

…roto for building arrays (apache#39721) This PR updates the ArrowReaderHelper to use an ArrowField object for building an Array instead of a protobuf field obj. This removes leveraging protobuf from building out the Arrays and makes the code easier to reuse (like for the C Data Interface) * Closes: apache#39720 Authored-by: Alva Bandy <abandy@live.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>

…lve() (apache#39817) ### Rationale for this change There has been interest in improving operations on chunked-arrays and even though `ChunkResolver::Resolve()` is not a big contributor in most kernels, the fact that it can be used from tight loops warrants careful attention to branch prediction and memory effects of its implementation. ### What changes are included in this PR? - Documentation of invariants and behavior of functions - Multiple optimizations justified by microbenchmarks - Addition of a variation of `Resolve` that takes a hint as parameter - Fix of an out-of-bounds memory access that doesn't affect correctness (it can only reduce effectiveness of cache in very rare situations, but is nevertheless an issue) ### Are these changes tested? Yes, by existing tests. ### Are there any user-facing changes? - The `arrow::internal::ChunkResolver::Bisect()` function was `protected` and is now `private` with a different signature * Closes: apache#39815 Authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com> Signed-off-by: Felipe Oliveira Carvalho <felipekde@gmail.com>

…ntervals (apache#39795) ### Rationale for this change The filter and take functions were not correctly supported on month_day_nano intervals. ### What changes are included in this PR? * Expand the primitive filter implementation to handle all possible fixed-width primitive types (including fixed-size binary) * Expand the take filter implementation to handle all well-known fixed-width primitive types (including month_day_nano, decimal128 and decimal256) * Add benchmarks for taking and filtering fixed-size binary These changes allow for very significant performance improvements filtering and taking fixed-size binary data: ``` ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Non-regressions: (90) ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- benchmark baseline contender change % counters FilterFixedSizeBinaryFilterNoNulls/524288/0/8 1.716 GiB/sec 33.814 GiB/sec 1870.862 {'family_index': 0, 'per_family_instance_index': 0, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/0/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2462, 'byte_width': 8.0, 'data null%': 0.0, 'mask null%': 0.0, 'select%': 99.9} TakeFixedSizeBinaryRandomIndicesWithNulls/524288/1/8 380.056M items/sec 7.098G items/sec 1767.491 {'family_index': 3, 'per_family_instance_index': 6, 'run_name': 'TakeFixedSizeBinaryRandomIndicesWithNulls/524288/1/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 505, 'byte_width': 8.0, 'null_percent': 100.0} FilterFixedSizeBinaryFilterNoNulls/524288/0/9 1.916 GiB/sec 33.721 GiB/sec 1659.766 {'family_index': 0, 'per_family_instance_index': 1, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/0/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2750, 'byte_width': 9.0, 'data null%': 0.0, 'mask null%': 0.0, 'select%': 99.9} FilterFixedSizeBinaryFilterNoNulls/524288/9/8 917.713 MiB/sec 9.193 GiB/sec 925.719 {'family_index': 0, 'per_family_instance_index': 18, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/9/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1271, 'byte_width': 8.0, 'data null%': 10.0, 'mask null%': 0.0, 'select%': 99.9} FilterFixedSizeBinaryFilterNoNulls/524288/12/8 1.004 GiB/sec 9.374 GiB/sec 833.673 {'family_index': 0, 'per_family_instance_index': 24, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/12/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1440, 'byte_width': 8.0, 'data null%': 90.0, 'mask null%': 0.0, 'select%': 99.9} FilterFixedSizeBinaryFilterNoNulls/524288/3/8 1.625 GiB/sec 15.009 GiB/sec 823.442 {'family_index': 0, 'per_family_instance_index': 6, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/3/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2328, 'byte_width': 8.0, 'data null%': 0.1, 'mask null%': 0.0, 'select%': 99.9} FilterFixedSizeBinaryFilterNoNulls/524288/9/9 1021.638 MiB/sec 9.126 GiB/sec 814.670 {'family_index': 0, 'per_family_instance_index': 19, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/9/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1428, 'byte_width': 9.0, 'data null%': 10.0, 'mask null%': 0.0, 'select%': 99.9} FilterFixedSizeBinaryFilterNoNulls/524288/6/8 1.235 GiB/sec 10.814 GiB/sec 775.869 {'family_index': 0, 'per_family_instance_index': 12, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/6/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1762, 'byte_width': 8.0, 'data null%': 1.0, 'mask null%': 0.0, 'select%': 99.9} FilterFixedSizeBinaryFilterNoNulls/524288/12/9 1.123 GiB/sec 9.120 GiB/sec 712.196 {'family_index': 0, 'per_family_instance_index': 25, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/12/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1598, 'byte_width': 9.0, 'data null%': 90.0, 'mask null%': 0.0, 'select%': 99.9} FilterFixedSizeBinaryFilterNoNulls/524288/6/9 1.370 GiB/sec 10.499 GiB/sec 666.348 {'family_index': 0, 'per_family_instance_index': 13, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/6/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1958, 'byte_width': 9.0, 'data null%': 1.0, 'mask null%': 0.0, 'select%': 99.9} FilterFixedSizeBinaryFilterNoNulls/524288/3/9 1.814 GiB/sec 13.394 GiB/sec 638.343 {'family_index': 0, 'per_family_instance_index': 7, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/3/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2600, 'byte_width': 9.0, 'data null%': 0.1, 'mask null%': 0.0, 'select%': 99.9} FilterFixedSizeBinaryFilterNoNulls/524288/2/8 12.155 GiB/sec 77.799 GiB/sec 540.051 {'family_index': 0, 'per_family_instance_index': 4, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/2/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 17222, 'byte_width': 8.0, 'data null%': 0.0, 'mask null%': 0.0, 'select%': 1.0} FilterFixedSizeBinaryFilterNoNulls/524288/2/9 13.507 GiB/sec 84.361 GiB/sec 524.592 {'family_index': 0, 'per_family_instance_index': 5, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/2/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 19469, 'byte_width': 9.0, 'data null%': 0.0, 'mask null%': 0.0, 'select%': 1.0} TakeFixedSizeBinaryMonotonicIndices/524288/1/8 194.493M items/sec 732.378M items/sec 276.557 {'family_index': 4, 'per_family_instance_index': 6, 'run_name': 'TakeFixedSizeBinaryMonotonicIndices/524288/1/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 259, 'byte_width': 8.0, 'null_percent': 100.0} TakeFixedSizeBinaryRandomIndicesNoNulls/524288/1/8 200.981M items/sec 747.628M items/sec 271.989 {'family_index': 2, 'per_family_instance_index': 6, 'run_name': 'TakeFixedSizeBinaryRandomIndicesNoNulls/524288/1/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 268, 'byte_width': 8.0, 'null_percent': 100.0} FilterFixedSizeBinaryFilterWithNulls/524288/0/8 947.631 MiB/sec 3.318 GiB/sec 258.565 {'family_index': 1, 'per_family_instance_index': 0, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/0/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1329, 'byte_width': 8.0, 'data null%': 0.0, 'mask null%': 5.0, 'select%': 99.9} FilterFixedSizeBinaryFilterWithNulls/524288/3/8 911.406 MiB/sec 3.121 GiB/sec 250.677 {'family_index': 1, 'per_family_instance_index': 6, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/3/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1275, 'byte_width': 8.0, 'data null%': 0.1, 'mask null%': 5.0, 'select%': 99.9} FilterFixedSizeBinaryFilterNoNulls/524288/1/8 1.045 GiB/sec 3.535 GiB/sec 238.406 {'family_index': 0, 'per_family_instance_index': 2, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/1/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1496, 'byte_width': 8.0, 'data null%': 0.0, 'mask null%': 0.0, 'select%': 50.0} FilterFixedSizeBinaryFilterWithNulls/524288/6/8 899.161 MiB/sec 2.915 GiB/sec 232.029 {'family_index': 1, 'per_family_instance_index': 12, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/6/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1260, 'byte_width': 8.0, 'data null%': 1.0, 'mask null%': 5.0, 'select%': 99.9} FilterFixedSizeBinaryFilterWithNulls/524288/9/8 829.852 MiB/sec 2.617 GiB/sec 222.914 {'family_index': 1, 'per_family_instance_index': 18, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/9/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1157, 'byte_width': 8.0, 'data null%': 10.0, 'mask null%': 5.0, 'select%': 99.9} TakeFixedSizeBinaryRandomIndicesNoNulls/524288/0/8 234.268M items/sec 752.809M items/sec 221.345 {'family_index': 2, 'per_family_instance_index': 8, 'run_name': 'TakeFixedSizeBinaryRandomIndicesNoNulls/524288/0/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 312, 'byte_width': 8.0, 'null_percent': 0.0} FilterFixedSizeBinaryFilterNoNulls/524288/1/9 1.171 GiB/sec 3.711 GiB/sec 216.957 {'family_index': 0, 'per_family_instance_index': 3, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/1/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1674, 'byte_width': 9.0, 'data null%': 0.0, 'mask null%': 0.0, 'select%': 50.0} TakeFixedSizeBinaryMonotonicIndices/524288/0/8 249.393M items/sec 787.274M items/sec 215.676 {'family_index': 4, 'per_family_instance_index': 8, 'run_name': 'TakeFixedSizeBinaryMonotonicIndices/524288/0/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 333, 'byte_width': 8.0, 'null_percent': 0.0} TakeFixedSizeBinaryRandomIndicesWithNulls/524288/0/8 234.268M items/sec 736.727M items/sec 214.481 {'family_index': 3, 'per_family_instance_index': 8, 'run_name': 'TakeFixedSizeBinaryRandomIndicesWithNulls/524288/0/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 313, 'byte_width': 8.0, 'null_percent': 0.0} TakeFixedSizeBinaryMonotonicIndices/524288/1000/8 134.852M items/sec 423.748M items/sec 214.231 {'family_index': 4, 'per_family_instance_index': 0, 'run_name': 'TakeFixedSizeBinaryMonotonicIndices/524288/1000/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 202, 'byte_width': 8.0, 'null_percent': 0.1} FilterFixedSizeBinaryFilterWithNulls/524288/12/8 913.734 MiB/sec 2.599 GiB/sec 191.245 {'family_index': 1, 'per_family_instance_index': 24, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/12/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1292, 'byte_width': 8.0, 'data null%': 90.0, 'mask null%': 5.0, 'select%': 99.9} TakeFixedSizeBinaryRandomIndicesNoNulls/524288/1000/8 138.218M items/sec 309.307M items/sec 123.783 {'family_index': 2, 'per_family_instance_index': 0, 'run_name': 'TakeFixedSizeBinaryRandomIndicesNoNulls/524288/1000/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 184, 'byte_width': 8.0, 'null_percent': 0.1} TakeFixedSizeBinaryRandomIndicesWithNulls/524288/1000/8 132.755M items/sec 293.027M items/sec 120.727 {'family_index': 3, 'per_family_instance_index': 0, 'run_name': 'TakeFixedSizeBinaryRandomIndicesWithNulls/524288/1000/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 179, 'byte_width': 8.0, 'null_percent': 0.1} TakeFixedSizeBinaryRandomIndicesNoNulls/524288/10/8 125.492M items/sec 272.996M items/sec 117.540 {'family_index': 2, 'per_family_instance_index': 2, 'run_name': 'TakeFixedSizeBinaryRandomIndicesNoNulls/524288/10/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 174, 'byte_width': 8.0, 'null_percent': 10.0} FilterFixedSizeBinaryFilterWithNulls/524288/9/9 926.938 MiB/sec 1.904 GiB/sec 110.379 {'family_index': 1, 'per_family_instance_index': 19, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/9/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1295, 'byte_width': 9.0, 'data null%': 10.0, 'mask null%': 5.0, 'select%': 99.9} TakeFixedSizeBinaryMonotonicIndices/524288/10/8 158.754M items/sec 331.106M items/sec 108.565 {'family_index': 4, 'per_family_instance_index': 2, 'run_name': 'TakeFixedSizeBinaryMonotonicIndices/524288/10/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 167, 'byte_width': 8.0, 'null_percent': 10.0} FilterFixedSizeBinaryFilterWithNulls/524288/0/9 1.031 GiB/sec 2.129 GiB/sec 106.621 {'family_index': 1, 'per_family_instance_index': 1, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/0/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1477, 'byte_width': 9.0, 'data null%': 0.0, 'mask null%': 5.0, 'select%': 99.9} FilterFixedSizeBinaryFilterWithNulls/524288/3/9 1020.776 MiB/sec 2.056 GiB/sec 106.293 {'family_index': 1, 'per_family_instance_index': 7, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/3/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1430, 'byte_width': 9.0, 'data null%': 0.1, 'mask null%': 5.0, 'select%': 99.9} FilterFixedSizeBinaryFilterWithNulls/524288/4/8 890.785 MiB/sec 1.768 GiB/sec 103.293 {'family_index': 1, 'per_family_instance_index': 8, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/4/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1242, 'byte_width': 8.0, 'data null%': 0.1, 'mask null%': 5.0, 'select%': 50.0} FilterFixedSizeBinaryFilterWithNulls/524288/6/9 1005.839 MiB/sec 1.984 GiB/sec 102.023 {'family_index': 1, 'per_family_instance_index': 13, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/6/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1407, 'byte_width': 9.0, 'data null%': 1.0, 'mask null%': 5.0, 'select%': 99.9} FilterFixedSizeBinaryFilterWithNulls/524288/1/8 916.810 MiB/sec 1.762 GiB/sec 96.757 {'family_index': 1, 'per_family_instance_index': 2, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/1/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1270, 'byte_width': 8.0, 'data null%': 0.0, 'mask null%': 5.0, 'select%': 50.0} FilterFixedSizeBinaryFilterWithNulls/524288/7/8 890.211 MiB/sec 1.694 GiB/sec 94.853 {'family_index': 1, 'per_family_instance_index': 14, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/7/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1235, 'byte_width': 8.0, 'data null%': 1.0, 'mask null%': 5.0, 'select%': 50.0} TakeFixedSizeBinaryRandomIndicesNoNulls/524288/2/8 95.788M items/sec 184.004M items/sec 92.095 {'family_index': 2, 'per_family_instance_index': 4, 'run_name': 'TakeFixedSizeBinaryRandomIndicesNoNulls/524288/2/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 124, 'byte_width': 8.0, 'null_percent': 50.0} FilterFixedSizeBinaryFilterWithNulls/524288/10/8 862.497 MiB/sec 1.616 GiB/sec 91.823 {'family_index': 1, 'per_family_instance_index': 20, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/10/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1200, 'byte_width': 8.0, 'data null%': 10.0, 'mask null%': 5.0, 'select%': 50.0} FilterFixedSizeBinaryFilterWithNulls/524288/12/9 1.005 GiB/sec 1.904 GiB/sec 89.431 {'family_index': 1, 'per_family_instance_index': 25, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/12/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1442, 'byte_width': 9.0, 'data null%': 90.0, 'mask null%': 5.0, 'select%': 99.9} TakeFixedSizeBinaryMonotonicIndices/524288/2/8 123.065M items/sec 228.755M items/sec 85.881 {'family_index': 4, 'per_family_instance_index': 4, 'run_name': 'TakeFixedSizeBinaryMonotonicIndices/524288/2/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 164, 'byte_width': 8.0, 'null_percent': 50.0} FilterFixedSizeBinaryFilterNoNulls/524288/10/8 930.637 MiB/sec 1.669 GiB/sec 83.659 {'family_index': 0, 'per_family_instance_index': 20, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/10/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1293, 'byte_width': 8.0, 'data null%': 10.0, 'mask null%': 0.0, 'select%': 50.0} FilterFixedSizeBinaryFilterNoNulls/524288/4/8 1.034 GiB/sec 1.871 GiB/sec 81.019 {'family_index': 0, 'per_family_instance_index': 8, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/4/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1482, 'byte_width': 8.0, 'data null%': 0.1, 'mask null%': 0.0, 'select%': 50.0} FilterFixedSizeBinaryFilterNoNulls/524288/7/8 1004.789 MiB/sec 1.772 GiB/sec 80.538 {'family_index': 0, 'per_family_instance_index': 14, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/7/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1404, 'byte_width': 8.0, 'data null%': 1.0, 'mask null%': 0.0, 'select%': 50.0} FilterFixedSizeBinaryFilterWithNulls/524288/13/8 920.819 MiB/sec 1.616 GiB/sec 79.686 {'family_index': 1, 'per_family_instance_index': 26, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/13/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1285, 'byte_width': 8.0, 'data null%': 90.0, 'mask null%': 5.0, 'select%': 50.0} FilterFixedSizeBinaryFilterNoNulls/524288/13/8 974.713 MiB/sec 1.669 GiB/sec 75.388 {'family_index': 0, 'per_family_instance_index': 26, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/13/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1363, 'byte_width': 8.0, 'data null%': 90.0, 'mask null%': 0.0, 'select%': 50.0} TakeFixedSizeBinaryRandomIndicesWithNulls/524288/10/8 107.165M items/sec 187.372M items/sec 74.845 {'family_index': 3, 'per_family_instance_index': 2, 'run_name': 'TakeFixedSizeBinaryRandomIndicesWithNulls/524288/10/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 143, 'byte_width': 8.0, 'null_percent': 10.0} TakeFixedSizeBinaryRandomIndicesWithNulls/524288/2/8 72.662M items/sec 114.781M items/sec 57.965 {'family_index': 3, 'per_family_instance_index': 4, 'run_name': 'TakeFixedSizeBinaryRandomIndicesWithNulls/524288/2/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 96, 'byte_width': 8.0, 'null_percent': 50.0} FilterFixedSizeBinaryFilterWithNulls/524288/10/9 976.180 MiB/sec 1.480 GiB/sec 55.260 {'family_index': 1, 'per_family_instance_index': 21, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/10/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1358, 'byte_width': 9.0, 'data null%': 10.0, 'mask null%': 5.0, 'select%': 50.0} FilterFixedSizeBinaryFilterNoNulls/524288/10/9 1.023 GiB/sec 1.581 GiB/sec 54.502 {'family_index': 0, 'per_family_instance_index': 21, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/10/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1466, 'byte_width': 9.0, 'data null%': 10.0, 'mask null%': 0.0, 'select%': 50.0} FilterFixedSizeBinaryFilterWithNulls/524288/4/9 992.477 MiB/sec 1.453 GiB/sec 49.957 {'family_index': 1, 'per_family_instance_index': 9, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/4/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1400, 'byte_width': 9.0, 'data null%': 0.1, 'mask null%': 5.0, 'select%': 50.0} FilterFixedSizeBinaryFilterWithNulls/524288/7/9 997.679 MiB/sec 1.450 GiB/sec 48.846 {'family_index': 1, 'per_family_instance_index': 15, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/7/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1389, 'byte_width': 9.0, 'data null%': 1.0, 'mask null%': 5.0, 'select%': 50.0} FilterFixedSizeBinaryFilterNoNulls/524288/13/9 1.071 GiB/sec 1.581 GiB/sec 47.526 {'family_index': 0, 'per_family_instance_index': 27, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/13/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1538, 'byte_width': 9.0, 'data null%': 90.0, 'mask null%': 0.0, 'select%': 50.0} FilterFixedSizeBinaryFilterWithNulls/524288/13/9 1.008 GiB/sec 1.485 GiB/sec 47.328 {'family_index': 1, 'per_family_instance_index': 27, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/13/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1446, 'byte_width': 9.0, 'data null%': 90.0, 'mask null%': 5.0, 'select%': 50.0} FilterFixedSizeBinaryFilterWithNulls/524288/1/9 1.003 GiB/sec 1.452 GiB/sec 44.708 {'family_index': 1, 'per_family_instance_index': 3, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/1/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1437, 'byte_width': 9.0, 'data null%': 0.0, 'mask null%': 5.0, 'select%': 50.0} FilterFixedSizeBinaryFilterNoNulls/524288/7/9 1.105 GiB/sec 1.568 GiB/sec 41.954 {'family_index': 0, 'per_family_instance_index': 15, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/7/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1587, 'byte_width': 9.0, 'data null%': 1.0, 'mask null%': 0.0, 'select%': 50.0} FilterFixedSizeBinaryFilterNoNulls/524288/4/9 1.163 GiB/sec 1.613 GiB/sec 38.639 {'family_index': 0, 'per_family_instance_index': 9, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/4/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1662, 'byte_width': 9.0, 'data null%': 0.1, 'mask null%': 0.0, 'select%': 50.0} FilterFixedSizeBinaryFilterWithNulls/524288/14/9 8.884 GiB/sec 12.117 GiB/sec 36.381 {'family_index': 1, 'per_family_instance_index': 29, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/14/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 12508, 'byte_width': 9.0, 'data null%': 90.0, 'mask null%': 5.0, 'select%': 1.0} FilterFixedSizeBinaryFilterWithNulls/524288/11/9 8.886 GiB/sec 12.075 GiB/sec 35.892 {'family_index': 1, 'per_family_instance_index': 23, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/11/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 12716, 'byte_width': 9.0, 'data null%': 10.0, 'mask null%': 5.0, 'select%': 1.0} TakeFixedSizeBinaryMonotonicIndices/524288/1000/9 134.765M items/sec 182.868M items/sec 35.694 {'family_index': 4, 'per_family_instance_index': 1, 'run_name': 'TakeFixedSizeBinaryMonotonicIndices/524288/1000/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 206, 'byte_width': 9.0, 'null_percent': 0.1} FilterFixedSizeBinaryFilterNoNulls/524288/5/8 11.393 GiB/sec 15.091 GiB/sec 32.453 {'family_index': 0, 'per_family_instance_index': 10, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/5/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 16510, 'byte_width': 8.0, 'data null%': 0.1, 'mask null%': 0.0, 'select%': 1.0} FilterFixedSizeBinaryFilterNoNulls/524288/8/8 11.573 GiB/sec 15.102 GiB/sec 30.496 {'family_index': 0, 'per_family_instance_index': 16, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/8/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 16684, 'byte_width': 8.0, 'data null%': 1.0, 'mask null%': 0.0, 'select%': 1.0} FilterFixedSizeBinaryFilterWithNulls/524288/11/8 7.740 GiB/sec 10.059 GiB/sec 29.956 {'family_index': 1, 'per_family_instance_index': 22, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/11/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 10972, 'byte_width': 8.0, 'data null%': 10.0, 'mask null%': 5.0, 'select%': 1.0} FilterFixedSizeBinaryFilterWithNulls/524288/14/8 7.733 GiB/sec 9.915 GiB/sec 28.213 {'family_index': 1, 'per_family_instance_index': 28, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/14/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 10991, 'byte_width': 8.0, 'data null%': 90.0, 'mask null%': 5.0, 'select%': 1.0} FilterFixedSizeBinaryFilterWithNulls/524288/5/8 7.682 GiB/sec 9.765 GiB/sec 27.109 {'family_index': 1, 'per_family_instance_index': 10, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/5/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 10991, 'byte_width': 8.0, 'data null%': 0.1, 'mask null%': 5.0, 'select%': 1.0} FilterFixedSizeBinaryFilterWithNulls/524288/8/9 8.856 GiB/sec 11.180 GiB/sec 26.241 {'family_index': 1, 'per_family_instance_index': 17, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/8/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 12571, 'byte_width': 9.0, 'data null%': 1.0, 'mask null%': 5.0, 'select%': 1.0} FilterFixedSizeBinaryFilterWithNulls/524288/8/8 7.735 GiB/sec 9.710 GiB/sec 25.530 {'family_index': 1, 'per_family_instance_index': 16, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/8/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 11069, 'byte_width': 8.0, 'data null%': 1.0, 'mask null%': 5.0, 'select%': 1.0} TakeFixedSizeBinaryMonotonicIndices/524288/10/9 128.606M items/sec 160.249M items/sec 24.604 {'family_index': 4, 'per_family_instance_index': 3, 'run_name': 'TakeFixedSizeBinaryMonotonicIndices/524288/10/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 209, 'byte_width': 9.0, 'null_percent': 10.0} FilterFixedSizeBinaryFilterNoNulls/524288/11/8 12.033 GiB/sec 14.737 GiB/sec 22.478 {'family_index': 0, 'per_family_instance_index': 22, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/11/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 17220, 'byte_width': 8.0, 'data null%': 10.0, 'mask null%': 0.0, 'select%': 1.0} FilterFixedSizeBinaryFilterNoNulls/524288/14/8 12.141 GiB/sec 14.761 GiB/sec 21.579 {'family_index': 0, 'per_family_instance_index': 28, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/14/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 17343, 'byte_width': 8.0, 'data null%': 90.0, 'mask null%': 0.0, 'select%': 1.0} FilterFixedSizeBinaryFilterWithNulls/524288/5/9 8.825 GiB/sec 10.633 GiB/sec 20.489 {'family_index': 1, 'per_family_instance_index': 11, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/5/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 12543, 'byte_width': 9.0, 'data null%': 0.1, 'mask null%': 5.0, 'select%': 1.0} FilterFixedSizeBinaryFilterWithNulls/524288/2/8 8.300 GiB/sec 9.969 GiB/sec 20.117 {'family_index': 1, 'per_family_instance_index': 4, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/2/8', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 11819, 'byte_width': 8.0, 'data null%': 0.0, 'mask null%': 5.0, 'select%': 1.0} FilterFixedSizeBinaryFilterNoNulls/524288/5/9 12.954 GiB/sec 15.192 GiB/sec 17.273 {'family_index': 0, 'per_family_instance_index': 11, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/5/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 18572, 'byte_width': 9.0, 'data null%': 0.1, 'mask null%': 0.0, 'select%': 1.0} FilterFixedSizeBinaryFilterNoNulls/524288/8/9 13.181 GiB/sec 15.222 GiB/sec 15.490 {'family_index': 0, 'per_family_instance_index': 17, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/8/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 18904, 'byte_width': 9.0, 'data null%': 1.0, 'mask null%': 0.0, 'select%': 1.0} FilterFixedSizeBinaryFilterWithNulls/524288/2/9 9.344 GiB/sec 10.632 GiB/sec 13.784 {'family_index': 1, 'per_family_instance_index': 5, 'run_name': 'FilterFixedSizeBinaryFilterWithNulls/524288/2/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 13291, 'byte_width': 9.0, 'data null%': 0.0, 'mask null%': 5.0, 'select%': 1.0} FilterFixedSizeBinaryFilterNoNulls/524288/11/9 13.566 GiB/sec 14.894 GiB/sec 9.789 {'family_index': 0, 'per_family_instance_index': 23, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/11/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 19349, 'byte_width': 9.0, 'data null%': 10.0, 'mask null%': 0.0, 'select%': 1.0} FilterFixedSizeBinaryFilterNoNulls/524288/14/9 13.603 GiB/sec 14.863 GiB/sec 9.265 {'family_index': 0, 'per_family_instance_index': 29, 'run_name': 'FilterFixedSizeBinaryFilterNoNulls/524288/14/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 19490, 'byte_width': 9.0, 'data null%': 90.0, 'mask null%': 0.0, 'select%': 1.0} TakeFixedSizeBinaryRandomIndicesNoNulls/524288/10/9 124.390M items/sec 133.566M items/sec 7.377 {'family_index': 2, 'per_family_instance_index': 3, 'run_name': 'TakeFixedSizeBinaryRandomIndicesNoNulls/524288/10/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 164, 'byte_width': 9.0, 'null_percent': 10.0} TakeFixedSizeBinaryMonotonicIndices/524288/2/9 116.792M items/sec 124.182M items/sec 6.328 {'family_index': 4, 'per_family_instance_index': 5, 'run_name': 'TakeFixedSizeBinaryMonotonicIndices/524288/2/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 161, 'byte_width': 9.0, 'null_percent': 50.0} TakeFixedSizeBinaryRandomIndicesNoNulls/524288/1000/9 135.860M items/sec 142.524M items/sec 4.905 {'family_index': 2, 'per_family_instance_index': 1, 'run_name': 'TakeFixedSizeBinaryRandomIndicesNoNulls/524288/1000/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 180, 'byte_width': 9.0, 'null_percent': 0.1} TakeFixedSizeBinaryRandomIndicesWithNulls/524288/1000/9 131.123M items/sec 137.400M items/sec 4.788 {'family_index': 3, 'per_family_instance_index': 1, 'run_name': 'TakeFixedSizeBinaryRandomIndicesWithNulls/524288/1000/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 176, 'byte_width': 9.0, 'null_percent': 0.1} TakeFixedSizeBinaryRandomIndicesNoNulls/524288/0/9 220.634M items/sec 230.872M items/sec 4.640 {'family_index': 2, 'per_family_instance_index': 9, 'run_name': 'TakeFixedSizeBinaryRandomIndicesNoNulls/524288/0/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 295, 'byte_width': 9.0, 'null_percent': 0.0} TakeFixedSizeBinaryRandomIndicesNoNulls/524288/2/9 97.425M items/sec 101.477M items/sec 4.159 {'family_index': 2, 'per_family_instance_index': 5, 'run_name': 'TakeFixedSizeBinaryRandomIndicesNoNulls/524288/2/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 130, 'byte_width': 9.0, 'null_percent': 50.0} TakeFixedSizeBinaryRandomIndicesWithNulls/524288/10/9 104.830M items/sec 108.346M items/sec 3.354 {'family_index': 3, 'per_family_instance_index': 3, 'run_name': 'TakeFixedSizeBinaryRandomIndicesWithNulls/524288/10/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 100, 'byte_width': 9.0, 'null_percent': 10.0} TakeFixedSizeBinaryRandomIndicesWithNulls/524288/1/9 378.858M items/sec 387.322M items/sec 2.234 {'family_index': 3, 'per_family_instance_index': 7, 'run_name': 'TakeFixedSizeBinaryRandomIndicesWithNulls/524288/1/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 506, 'byte_width': 9.0, 'null_percent': 100.0} TakeFixedSizeBinaryRandomIndicesWithNulls/524288/0/9 221.900M items/sec 226.450M items/sec 2.050 {'family_index': 3, 'per_family_instance_index': 9, 'run_name': 'TakeFixedSizeBinaryRandomIndicesWithNulls/524288/0/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 295, 'byte_width': 9.0, 'null_percent': 0.0} TakeFixedSizeBinaryMonotonicIndices/524288/0/9 248.664M items/sec 253.037M items/sec 1.758 {'family_index': 4, 'per_family_instance_index': 9, 'run_name': 'TakeFixedSizeBinaryMonotonicIndices/524288/0/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 332, 'byte_width': 9.0, 'null_percent': 0.0} TakeFixedSizeBinaryRandomIndicesNoNulls/524288/1/9 197.730M items/sec 201.173M items/sec 1.741 {'family_index': 2, 'per_family_instance_index': 7, 'run_name': 'TakeFixedSizeBinaryRandomIndicesNoNulls/524288/1/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 264, 'byte_width': 9.0, 'null_percent': 100.0} TakeFixedSizeBinaryRandomIndicesWithNulls/524288/2/9 73.196M items/sec 74.167M items/sec 1.327 {'family_index': 3, 'per_family_instance_index': 5, 'run_name': 'TakeFixedSizeBinaryRandomIndicesWithNulls/524288/2/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 96, 'byte_width': 9.0, 'null_percent': 50.0} TakeFixedSizeBinaryMonotonicIndices/524288/1/9 192.545M items/sec 188.138M items/sec -2.289 {'family_index': 4, 'per_family_instance_index': 7, 'run_name': 'TakeFixedSizeBinaryMonotonicIndices/524288/1/9', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 257, 'byte_width': 9.0, 'null_percent': 100.0} ``` ### Are these changes tested? Yes. ### Are there any user-facing changes? No. * Closes: apache#39740 Authored-by: Antoine Pitrou <antoine@python.org> Signed-off-by: Antoine Pitrou <antoine@python.org>

This commit ports the latest state of scalar_hash kernels without pulling a long development history. This kernel is an element-wise function that uses an xxHash-like algorithm, prioritizing speed and not suitable for cryptographic purposes. The function is implemented by the `FastHashScalar` struct which is templated by the output type (which is assumed to be either UInt32 or UInt64, but there is no validation of that at the moment). The benchmarks in scalar_hash_benchmark.cc uses the hashing_benchmark.cc file as a reference (in cpp/src/arrow/util/), but only covers various input types and the key hashing functions (from key_hash.h). The tests in scalar_hash_test.cc use a simplified version of hashing based on what is implemented in the key_hash.cc. The idea being that the high-level entry points for high-level types should eventually reach an expected application of the low-level hash functions on simple data types; the tests do this exact comparison. At the moment, the tests pass for simple cases, but they do not work for nested types with non-trivial row layouts (e.g. ListTypes). Issue: ARROW-8991 Issue: apacheGH-17211

This commit pulls the latest changes to key_hash.h and implementations in light_array without the burden of a long development history. The only change in key_hash.h is the addition of a friend function which is used in scalar_hash_test.cc. Changes in light_array.[h,cc] are to accommodate two scenarios: (1) the use of ArraySpan, which was introduced after light_array was written; and (2) the need for a KeyColumnArray to allocate data for the purposes of interpreting (or decoding) the structure of a nested type. The main reason for the 2nd scenario is that a ListArray may have many values represented in a single row which should be hashed together; however, if the ListArray has a nested ListArray or other type, the row may have further structure. In the simplest interpretation, only the highest-level structure (the "outer" ListArray) needs to be preserved, and any further nested structures must be explicitly handled by custom kernels (or any future options, etc. that are upstreamed into Arrow). In trying to efficiently interpret complex nested types, ArraySpan can be useful because it is non-owning, thus the main reason for the 1st aforementioned scenario. Although unfinished, any tests added to light_array_test.cc should accommodate the 2 scenarios above.

This commit includes changes to register a new compute function without the burden of a long development history. The change to cpp/src/arrow/CMakeLists.txt includes scalar_hash.cc in compilation as it is used by the new Hash64 function defined in api_scalar.[h,cc]. The change to cpp/src/arrow/compute/kernels/CMakeLists.txt includes scalar_hash_test.cc in compilation for tests and it also adds a new benchmark binary that is implemented by scalar_hash_benchmark.cc. The registry files are updated to register the kernel implementations in scalar_hash.cc with the function definitions in api_scalar.[h,cc]. Finally, docs/source/cpp/compute.rst adds documentation for the Hash64 function. Issue: apacheGH-17211 Issue: ARROW-8991

This commit includes additions to the general hashing benchmarks that cover the use of hashing functions in key_hash.h without carrying the burden of a long dev history. Some existing benchmark names were changed to distinguish between the use of Int32 and Int64 types, new benchmarks were added that use the functions declared in key_hash.h. The reason the new benchmarks are added is because it is claimed they prioritize speed over cryptography as they're primarily used for join algorithms and other processing tasks, which the hashing benchmark can now provide observability for. Issue: apacheGH-17211 Issue: ARROW-8991

… ARROW-8991-newfn-scalar-hash-fresh using a merge commit to overwrite commit history while maintaining current state of code

drin · 2024-01-29T16:52:09Z

alright, I attempted to force change of commit history by using
git merge -s ort -Xtheirs ARROW-8991-newfn-scalar-hash-fresh

While this seems to be the "correct way", clearly it makes the PR look gratuitous. I'm going to close this one, and refer to it from a new PR. I'll try to update everything as appropriate.

github-actions · 2024-01-29T16:59:08Z

⚠️ GitHub issue #17211 has been automatically assigned in GitHub to PR creator.

github-actions bot added the Component: C++ label Jun 30, 2022

drin changed the title ~~ARROW-8991: [C++] Add new scalar compute function for hashing~~ ARROW-16945: [C++] Add new scalar compute function for 32-bit hashing Jun 30, 2022

drin force-pushed the ARROW-8991-newfn-scalar-hash branch from 6af47cc to 7787bca Compare July 20, 2022 03:27

save-buffer suggested changes Jul 20, 2022

View reviewed changes

Cerdore and others added 12 commits January 25, 2024 17:11

Merge remote-tracking branch 'drin/ARROW-8991-newfn-scalar-hash' into…

d7924a7

… ARROW-8991-newfn-scalar-hash-fresh using a merge commit to overwrite commit history while maintaining current state of code

github-actions bot added Component: R Component: Java Component: Parquet Component: Go Component: Python Component: JavaScript Component: C# Component: Ruby Component: FlightRPC Component: Gandiva Component: GLib Component: MATLAB Component: Swift labels Jan 29, 2024

drin closed this Jan 29, 2024

drin changed the title ~~GH-17211: [C++] Add hash_64 scalar compute function~~ GH-17211: [C++] Add hash_64 scalar compute function (old) Jan 29, 2024

drin mentioned this pull request Jan 29, 2024

GH-17211: [C++] Add hash_64 scalar compute function #39836

Closed

GH-17211: [C++] Add hash_64 scalar compute function (old) #13487

GH-17211: [C++] Add hash_64 scalar compute function (old) #13487

Uh oh!

Conversation

drin commented Jun 30, 2022 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Jun 30, 2022

Uh oh!

westonpace commented Jun 30, 2022

Uh oh!

github-actions bot commented Jun 30, 2022

Uh oh!

github-actions bot commented Jun 30, 2022

Uh oh!

westonpace commented Jun 30, 2022

Uh oh!

drin commented Jun 30, 2022

Uh oh!

pitrou commented Jul 19, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pitrou commented Jul 19, 2022

Uh oh!

drin commented Jul 19, 2022

Uh oh!

westonpace commented Jul 19, 2022

Uh oh!

pitrou commented Jul 19, 2022

Uh oh!

drin commented Jul 19, 2022

Uh oh!

drin commented Jul 20, 2022

Uh oh!

drin commented Jul 20, 2022

Uh oh!

drin commented Jul 20, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

westonpace commented Jul 20, 2022

Uh oh!

save-buffer commented Jul 20, 2022

Uh oh!

save-buffer left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

drin commented Jul 20, 2022

Uh oh!

drin commented Jul 20, 2022

Uh oh!

pitrou commented Jul 21, 2022

Uh oh!

ianmcook commented Jul 21, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

drin commented Jul 22, 2022

Uh oh!

ursabot commented Jul 22, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

drin commented Jul 22, 2022

Uh oh!

drin commented Jul 22, 2022

Uh oh!

ursabot commented Jul 22, 2022

Uh oh!

drin commented Jan 29, 2024

Uh oh!

github-actions bot commented Jan 29, 2024

Uh oh!

Reviewers

Assignees

drin commented Jun 30, 2022 •

edited by github-actions bot

Loading

pitrou commented Jul 19, 2022 •

edited

Loading

drin commented Jul 20, 2022 •

edited

Loading

ianmcook commented Jul 21, 2022 •

edited

Loading

ursabot commented Jul 22, 2022 •

edited

Loading