Skip to content

Conversation

@alamb
Copy link
Contributor

@alamb alamb commented Dec 10, 2025

Which issue does this PR close?

Rationale for this change

I messed up @tobixdev 's PR / branch by accident in #8948

This PR contains the previous commits that were in that PR

What changes are included in this PR?

Are these changes tested?

Yes, by CI

Are there any user-facing changes?

Less bugs, faster code

@github-actions github-actions bot added the arrow Changes to the arrow crate label Dec 10, 2025
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe one note for this change. As I understand it, some take kernels are using to_usize while others are using as_usize. I've now used as_usize as it is also used in the take_native kernel.

Is there any guidelines on which operation we should use?

Copy link
Contributor

@tobixdev tobixdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to "revive" the old PR but without much success. So before wrangling against GitHub let's just use this one. Thanks for taking the initiative and creating this PR!

@alamb alamb marked this pull request as ready for review December 11, 2025 13:12
@alamb
Copy link
Contributor Author

alamb commented Dec 11, 2025

run benchmark take_kernels

@alamb-ghbot
Copy link

🤖 ./gh_compare_arrow.sh gh_compare_arrow.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing take-fsb-null-indices (885f31e) to 1c90efe diff
BENCH_NAME=take_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench take_kernels
BENCH_FILTER=
BENCH_BRANCH_NAME=take-fsb-null-indices
Results will be posted here when complete

@alamb-ghbot
Copy link

🤖: Benchmark completed

Details

group                                                                     main                                   take-fsb-null-indices
-----                                                                     ----                                   ---------------------
take bool 1024                                                            1.00  1337.3±30.18ns        ? ?/sec    1.00  1338.1±12.91ns        ? ?/sec
take bool 512                                                             1.00   732.2±11.16ns        ? ?/sec    1.00   732.9±10.10ns        ? ?/sec
take bool null indices 1024                                               1.02  1654.5±24.98ns        ? ?/sec    1.00  1614.9±37.41ns        ? ?/sec
take bool null values 1024                                                1.00      2.6±0.09µs        ? ?/sec    1.00      2.6±0.02µs        ? ?/sec
take bool null values null indices 1024                                   1.00      3.2±0.06µs        ? ?/sec    1.02      3.3±0.02µs        ? ?/sec
take check bounds i32 1024                                                1.00  1623.5±17.69ns        ? ?/sec    1.00  1624.0±25.86ns        ? ?/sec
take check bounds i32 512                                                 1.00    808.4±3.34ns        ? ?/sec    1.01    813.4±2.59ns        ? ?/sec
take i32 1024                                                             1.00   710.9±10.84ns        ? ?/sec    1.01    715.3±6.54ns        ? ?/sec
take i32 512                                                              1.00    388.0±3.42ns        ? ?/sec    1.09   424.3±14.23ns        ? ?/sec
take i32 null indices 1024                                                1.00    993.0±6.86ns        ? ?/sec    1.00   995.6±10.40ns        ? ?/sec
take i32 null values 1024                                                 1.01      2.1±0.02µs        ? ?/sec    1.00      2.0±0.02µs        ? ?/sec
take i32 null values null indices 1024                                    1.01      2.6±0.03µs        ? ?/sec    1.00      2.6±0.07µs        ? ?/sec
take primitive fsb value len: 12, indices: 1024                           2.25      8.4±0.07µs        ? ?/sec    1.00      3.7±0.06µs        ? ?/sec
take primitive fsb value len: 12, null values, indices: 1024              1.68      8.4±0.08µs        ? ?/sec    1.00      5.0±0.01µs        ? ?/sec
take primitive run logical len: 1024, physical len: 512, indices: 1024    1.00     20.6±0.13µs        ? ?/sec    1.00     20.5±0.16µs        ? ?/sec
take str 1024                                                             1.00     10.9±0.04µs        ? ?/sec    1.00     10.9±0.12µs        ? ?/sec
take str 512                                                              1.01      5.4±0.04µs        ? ?/sec    1.00      5.3±0.04µs        ? ?/sec
take str null indices 1024                                                1.01      6.9±0.05µs        ? ?/sec    1.00      6.8±0.11µs        ? ?/sec
take str null indices 512                                                 1.04      3.4±0.02µs        ? ?/sec    1.00      3.3±0.04µs        ? ?/sec
take str null values 1024                                                 1.00      8.6±0.13µs        ? ?/sec    1.00      8.6±0.31µs        ? ?/sec
take str null values null indices 1024                                    1.00      6.3±0.13µs        ? ?/sec    1.00      6.3±0.03µs        ? ?/sec
take stringview 1024                                                      1.00   860.4±23.03ns        ? ?/sec    1.00    856.1±3.33ns        ? ?/sec
take stringview 512                                                       1.01    481.8±3.92ns        ? ?/sec    1.00    475.2±4.21ns        ? ?/sec
take stringview null indices 1024                                         1.00  1371.2±19.60ns        ? ?/sec    1.01  1386.5±21.26ns        ? ?/sec
take stringview null indices 512                                          1.00    695.1±5.88ns        ? ?/sec    1.00    693.5±4.96ns        ? ?/sec
take stringview null values 1024                                          1.01      2.1±0.04µs        ? ?/sec    1.00      2.1±0.04µs        ? ?/sec
take stringview null values null indices 1024                             1.01      2.9±0.04µs        ? ?/sec    1.00      2.8±0.03µs        ? ?/sec

@alamb alamb merged commit 9293c89 into apache:main Dec 11, 2025
53 checks passed
@alamb
Copy link
Contributor Author

alamb commented Dec 11, 2025

Thanks again @tobixdev

@alamb alamb deleted the take-fsb-null-indices branch December 11, 2025 19:09
github-merge-queue bot pushed a commit to apache/datafusion that referenced this pull request Jan 14, 2026
#19800)

## Which issue does this PR close?

Adds a reproducer for #19067
and closes #19067.

## Rationale for this change

The bug has been fixed in arrow-rs
(apache/arrow-rs#8981). To ensure this case is
covered in the tests, we add a reproducer.

## What changes are included in this PR?

- SLT test case exhibiting the issue.

DISCLAIMER: First version was generated using AI from the original
reproducer and improved by me. Happy to incorporate further suggestions
for improvements.

## Are these changes tested?

Yes.

 I've also ensure that the test case exhibits the issue on `branch-51`.
Diff when running the test on `branch-51`:
```
[Diff] (-expected|+actual)
    1 aaaaaaaa aaaaaaaa 1000
    2 bbbbbbbb bbbbbbbb 2000
-   3 cccccccc NULL NULL
+   3 cccccccc aaaaaaaa NULL
```

## Are there any user-facing changes?

No
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

arrow Changes to the arrow crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

take_fixed_size_binary Does Not Consider NULL Indices

3 participants