[C++][Acero] ASAN reports heap buffer overflow in arrow::compute::Hashing32::ProcessStripes_avx2 #39778

@zanmato1984

Describe the bug, including details regarding any error messages, version, and platform.

Similar to #39577, except that this one can only be observed on Intel chips, since I believe the bug lies in the AVX2 code path.

Hardware

Intel i9

OS

macOS Sonoma 14.2.1 (23C71)

Version

3fe598a

Reproduce

Change the HashJoin.Random test to run more iterations, e.g. raise num_tests in the line below from 25 to 1000:

const int num_tests = 25;

Build with ASAN enabled and the jemalloc and mimalloc allocators disabled, so that the system allocator is used:

cmake --preset ninja-debug -DARROW_USE_ASAN=ON -DARROW_JEMALLOC=OFF -DARROW_MIMALLOC=OFF ..
ninja -j8

Run the specific test:

./debug/arrow-acero-hash-join-node-test --gtest_filter=HashJoin.Random

Result:

Note: Google Test filter = HashJoin.Random
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from HashJoin
[ RUN      ] HashJoin.Random
=================================================================
==85786==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x621000067a44 at pc 0x000116b70347 bp 0x7ff7bbf3b990 sp 0x7ff7bbf3b988
READ of size 16 at 0x621000067a44 thread T0
    #0 0x116b70346 in long long vector[4] arrow::compute::Hashing32::ProcessStripes_avx2<true>(long long, long long, long long vector[4], unsigned char const*, long long, long long) key_hash_avx2.cc:158
    #1 0x116b6d9ba in unsigned int arrow::compute::Hashing32::HashFixedLenImp_avx2<false>(unsigned int, unsigned long long, unsigned char const*, unsigned int*, unsigned int*) key_hash_avx2.cc:205
    #2 0x116b6d604 in arrow::compute::Hashing32::HashFixedLen_avx2(bool, unsigned int, unsigned long long, unsigned char const*, unsigned int*, unsigned int*) key_hash_avx2.cc:232
    #3 0x1163d28cd in arrow::compute::Hashing32::HashFixed(long long, bool, unsigned int, unsigned long long, unsigned char const*, unsigned int*, unsigned int*) key_hash.cc:366
    #4 0x1163d52e7 in arrow::compute::Hashing32::HashMultiColumn(std::__1::vector<arrow::compute::KeyColumnArray, std::__1::allocator<arrow::compute::KeyColumnArray>> const&, arrow::compute::LightContext*, unsigned int*) key_hash.cc:431
    #5 0x1163d75d9 in arrow::compute::Hashing32::HashBatch(arrow::compute::ExecBatch const&, unsigned int*, std::__1::vector<arrow::compute::KeyColumnArray, std::__1::allocator<arrow::compute::KeyColumnArray>>&, long long, arrow::util::TempVectorStack*, long long, long long) key_hash.cc:472
    #6 0x10774440d in arrow::acero::BloomFilterPushdownContext::BuildBloomFilter_exec_task(unsigned long, long long) hash_join_node.cc:1141
    #7 0x1077e81e7 in arrow::acero::BloomFilterPushdownContext::Init(arrow::acero::HashJoinNode*, unsigned long, std::__1::function<int (std::__1::function<arrow::Status (unsigned long, long long)>, std::__1::function<arrow::Status (unsigned long)>)>, std::__1::function<arrow::Status (int, long long)>, std::__1::function<arrow::Status (unsigned long)>, bool, bool)::$_1::operator()(unsigned long, long long) const hash_join_node.cc:1064
    #8 0x1077e8167 in decltype(std::declval<arrow::acero::BloomFilterPushdownContext::Init(arrow::acero::HashJoinNode*, unsigned long, std::__1::function<int (std::__1::function<arrow::Status (unsigned long, long long)>, std::__1::function<arrow::Status (unsigned long)>)>, std::__1::function<arrow::Status (int, long long)>, std::__1::function<arrow::Status (unsigned long)>, bool, bool)::$_1&>()(std::declval<unsigned long>(), std::declval<long long>())) std::__1::__invoke[abi:v160006]<arrow::acero::BloomFilterPushdownContext::Init(arrow::acero::HashJoinNode*, unsigned long, std::__1::function<int (std::__1::function<arrow::Status (unsigned long, long long)>, std::__1::function<arrow::Status (unsigned long)>)>, std::__1::function<arrow::Status (int, long long)>, std::__1::function<arrow::Status (unsigned long)>, bool, bool)::$_1&, unsigned long, long long>(arrow::acero::BloomFilterPushdownContext::Init(arrow::acero::HashJoinNode*, unsigned long, std::__1::function<int (std::__1::function<arrow::Status (unsigned long, long long)>, std::__1::function<arrow::Status (unsigned long)>)>, std::__1::function<arrow::Status (int, long long)>, std::__1::function<arrow::Status (unsigned long)>, bool, bool)::$_1&, unsigned long&&, long long&&) invoke.h:394
    #9 0x1077e808f in arrow::Status std::__1::__invoke_void_return_wrapper<arrow::Status, false>::__call<arrow::acero::BloomFilterPushdownContext::Init(arrow::acero::HashJoinNode*, unsigned long, std::__1::function<int (std::__1::function<arrow::Status (unsigned long, long long)>, std::__1::function<arrow::Status (unsigned long)>)>, std::__1::function<arrow::Status (int, long long)>, std::__1::function<arrow::Status (unsigned long)>, bool, bool)::$_1&, unsigned long, long long>(arrow::acero::BloomFilterPushdownContext::Init(arrow::acero::HashJoinNode*, unsigned long, std::__1::function<int (std::__1::function<arrow::Status (unsigned long, long long)>, std::__1::function<arrow::Status (unsigned long)>)>, std::__1::function<arrow::Status (int, long long)>, std::__1::function<arrow::Status (unsigned long)>, bool, bool)::$_1&, unsigned long&&, long long&&) invoke.h:478
    #10 0x1077e804f in std::__1::__function::__alloc_func<arrow::acero::BloomFilterPushdownContext::Init(arrow::acero::HashJoinNode*, unsigned long, std::__1::function<int (std::__1::function<arrow::Status (unsigned long, long long)>, std::__1::function<arrow::Status (unsigned long)>)>, std::__1::function<arrow::Status (int, long long)>, std::__1::function<arrow::Status (unsigned long)>, bool, bool)::$_1, std::__1::allocator<arrow::acero::BloomFilterPushdownContext::Init(arrow::acero::HashJoinNode*, unsigned long, std::__1::function<int (std::__1::function<arrow::Status (unsigned long, long long)>, std::__1::function<arrow::Status (unsigned long)>)>, std::__1::function<arrow::Status (int, long long)>, std::__1::function<arrow::Status (unsigned long)>, bool, bool)::$_1>, arrow::Status (unsigned long, long long)>::operator()[abi:v160006](unsigned long&&, long long&&) function.h:185
    #11 0x1077e4b63 in std::__1::__function::__func<arrow::acero::BloomFilterPushdownContext::Init(arrow::acero::HashJoinNode*, unsigned long, std::__1::function<int (std::__1::function<arrow::Status (unsigned long, long long)>, std::__1::function<arrow::Status (unsigned long)>)>, std::__1::function<arrow::Status (int, long long)>, std::__1::function<arrow::Status (unsigned long)>, bool, bool)::$_1, std::__1::allocator<arrow::acero::BloomFilterPushdownContext::Init(arrow::acero::HashJoinNode*, unsigned long, std::__1::function<int (std::__1::function<arrow::Status (unsigned long, long long)>, std::__1::function<arrow::Status (unsigned long)>)>, std::__1::function<arrow::Status (int, long long)>, std::__1::function<arrow::Status (unsigned long)>, bool, bool)::$_1>, arrow::Status (unsigned long, long long)>::operator()(unsigned long&&, long long&&) function.h:356
    #12 0x1079f757a in std::__1::__function::__value_func<arrow::Status (unsigned long, long long)>::operator()[abi:v160006](unsigned long&&, long long&&) const function.h:510
    #13 0x1079e8f8a in std::__1::function<arrow::Status (unsigned long, long long)>::operator()(unsigned long, long long) const function.h:1156
    #14 0x1079e8b37 in arrow::acero::TaskSchedulerImpl::ExecuteTask(unsigned long, int, long long, bool*) task_util.cc:216
    #15 0x1079fd6c7 in arrow::acero::TaskSchedulerImpl::ScheduleMore(unsigned long, int)::$_0::operator()(unsigned long) const task_util.cc:371
    #16 0x1079fd297 in decltype(std::declval<arrow::acero::TaskSchedulerImpl::ScheduleMore(unsigned long, int)::$_0&>()(std::declval<unsigned long>())) std::__1::__invoke[abi:v160006]<arrow::acero::TaskSchedulerImpl::ScheduleMore(unsigned long, int)::$_0&, unsigned long>(arrow::acero::TaskSchedulerImpl::ScheduleMore(unsigned long, int)::$_0&, unsigned long&&) invoke.h:394
    #17 0x1079fd1f7 in arrow::Status std::__1::__invoke_void_return_wrapper<arrow::Status, false>::__call<arrow::acero::TaskSchedulerImpl::ScheduleMore(unsigned long, int)::$_0&, unsigned long>(arrow::acero::TaskSchedulerImpl::ScheduleMore(unsigned long, int)::$_0&, unsigned long&&) invoke.h:478
    #18 0x1079fd1b7 in std::__1::__function::__alloc_func<arrow::acero::TaskSchedulerImpl::ScheduleMore(unsigned long, int)::$_0, std::__1::allocator<arrow::acero::TaskSchedulerImpl::ScheduleMore(unsigned long, int)::$_0>, arrow::Status (unsigned long)>::operator()[abi:v160006](unsigned long&&) function.h:185
    #19 0x1079f9ccb in std::__1::__function::__func<arrow::acero::TaskSchedulerImpl::ScheduleMore(unsigned long, int)::$_0, std::__1::allocator<arrow::acero::TaskSchedulerImpl::ScheduleMore(unsigned long, int)::$_0>, arrow::Status (unsigned long)>::operator()(unsigned long&&) function.h:356
    #20 0x1042bed2a in std::__1::__function::__value_func<arrow::Status (unsigned long)>::operator()[abi:v160006](unsigned long&&) const function.h:510
    #21 0x1042be425 in std::__1::function<arrow::Status (unsigned long)>::operator()(unsigned long) const function.h:1156
    #22 0x10785f17c in arrow::acero::QueryContext::ScheduleTask(std::__1::function<arrow::Status (unsigned long)>, std::__1::basic_string_view<char, std::__1::char_traits<char>>)::$_2::operator()() const query_context.cc:82
    #23 0x10785f0ff in decltype(std::declval<arrow::acero::QueryContext::ScheduleTask(std::__1::function<arrow::Status (unsigned long)>, std::__1::basic_string_view<char, std::__1::char_traits<char>>)::$_2&>()()) std::__1::__invoke[abi:v160006]<arrow::acero::QueryContext::ScheduleTask(std::__1::function<arrow::Status (unsigned long)>, std::__1::basic_string_view<char, std::__1::char_traits<char>>)::$_2&>(arrow::acero::QueryContext::ScheduleTask(std::__1::function<arrow::Status (unsigned long)>, std::__1::basic_string_view<char, std::__1::char_traits<char>>)::$_2&) invoke.h:394
    #24 0x10785f0af in arrow::Status std::__1::__invoke_void_return_wrapper<arrow::Status, false>::__call<arrow::acero::QueryContext::ScheduleTask(std::__1::function<arrow::Status (unsigned long)>, std::__1::basic_string_view<char, std::__1::char_traits<char>>)::$_2&>(arrow::acero::QueryContext::ScheduleTask(std::__1::function<arrow::Status (unsigned long)>, std::__1::basic_string_view<char, std::__1::char_traits<char>>)::$_2&) invoke.h:478
    #25 0x10785f07f in std::__1::__function::__alloc_func<arrow::acero::QueryContext::ScheduleTask(std::__1::function<arrow::Status (unsigned long)>, std::__1::basic_string_view<char, std::__1::char_traits<char>>)::$_2, std::__1::allocator<arrow::acero::QueryContext::ScheduleTask(std::__1::function<arrow::Status (unsigned long)>, std::__1::basic_string_view<char, std::__1::char_traits<char>>)::$_2>, arrow::Status ()>::operator()[abi:v160006]() function.h:185
    #26 0x10785ba23 in std::__1::__function::__func<arrow::acero::QueryContext::ScheduleTask(std::__1::function<arrow::Status (unsigned long)>, std::__1::basic_string_view<char, std::__1::char_traits<char>>)::$_2, std::__1::allocator<arrow::acero::QueryContext::ScheduleTask(std::__1::function<arrow::Status (unsigned long)>, std::__1::basic_string_view<char, std::__1::char_traits<char>>)::$_2>, arrow::Status ()>::operator()() function.h:356
    #27 0x107327daa in std::__1::__function::__value_func<arrow::Status ()>::operator()[abi:v160006]() const function.h:510
    #28 0x107319adf in std::__1::function<arrow::Status ()>::operator()() const function.h:1156
    #29 0x107858bb1 in std::__1::enable_if<!std::is_void<arrow::Status>::value && !is_future<arrow::Status>::value && (!arrow::Future<arrow::internal::Empty>::is_empty || std::is_same<arrow::Status, arrow::Status>::value), void>::type arrow::detail::ContinueFuture::operator()<std::__1::function<arrow::Status ()>&, arrow::Status, arrow::Future<arrow::internal::Empty>>(arrow::Future<arrow::internal::Empty>, std::__1::function<arrow::Status ()>&) const future.h:150
    #30 0x107858966 in decltype(std::declval<arrow::detail::ContinueFuture&>()(std::declval<arrow::Future<arrow::internal::Empty>&>(), std::declval<std::__1::function<arrow::Status ()>&>())) std::__1::__invoke[abi:v160006]<arrow::detail::ContinueFuture&, arrow::Future<arrow::internal::Empty>&, std::__1::function<arrow::Status ()>&>(arrow::detail::ContinueFuture&, arrow::Future<arrow::internal::Empty>&, std::__1::function<arrow::Status ()>&) invoke.h:394
    #31 0x107858869 in std::__1::__bind_return<arrow::detail::ContinueFuture, std::__1::tuple<arrow::Future<arrow::internal::Empty>, std::__1::function<arrow::Status ()>>, std::__1::tuple<>, __is_valid_bind_return<arrow::detail::ContinueFuture, std::__1::tuple<arrow::Future<arrow::internal::Empty>, std::__1::function<arrow::Status ()>>, std::__1::tuple<>>::value>::type std::__1::__apply_functor[abi:v160006]<arrow::detail::ContinueFuture, std::__1::tuple<arrow::Future<arrow::internal::Empty>, std::__1::function<arrow::Status ()>>, 0ul, 1ul, std::__1::tuple<>>(arrow::detail::ContinueFuture&, std::__1::tuple<arrow::Future<arrow::internal::Empty>, std::__1::function<arrow::Status ()>>&, std::__1::__tuple_indices<0ul, 1ul>, std::__1::tuple<>&&) bind.h:263
    #32 0x107858792 in std::__1::__bind_return<arrow::detail::ContinueFuture, std::__1::tuple<arrow::Future<arrow::internal::Empty>, std::__1::function<arrow::Status ()>>, std::__1::tuple<>, __is_valid_bind_return<arrow::detail::ContinueFuture, std::__1::tuple<arrow::Future<arrow::internal::Empty>, std::__1::function<arrow::Status ()>>, std::__1::tuple<>>::value>::type std::__1::__bind<arrow::detail::ContinueFuture, arrow::Future<arrow::internal::Empty>&, std::__1::function<arrow::Status ()>>::operator()[abi:v160006]<>() bind.h:295
    #33 0x107858648 in arrow::internal::FnOnce<void ()>::FnImpl<std::__1::__bind<arrow::detail::ContinueFuture, arrow::Future<arrow::internal::Empty>&, std::__1::function<arrow::Status ()>>>::invoke() functional.h:152
    #34 0x115982869 in arrow::internal::FnOnce<void ()>::operator()() && functional.h:140
    #35 0x115982059 in arrow::internal::SerialExecutor::RunLoop() thread_pool.cc:252
    #36 0x10761ddcb in arrow::Future<arrow::acero::BatchesWithCommonSchema> arrow::internal::SerialExecutor::Run<arrow::acero::BatchesWithCommonSchema, arrow::Result<arrow::acero::BatchesWithCommonSchema>>(arrow::internal::FnOnce<arrow::Future<arrow::acero::BatchesWithCommonSchema> (arrow::internal::Executor*)>) thread_pool.h:420
    #37 0x10761d09c in arrow::Result<arrow::acero::BatchesWithCommonSchema> arrow::internal::SerialExecutor::RunInSerialExecutor<arrow::acero::BatchesWithCommonSchema, arrow::Future<arrow::acero::BatchesWithCommonSchema>, arrow::Result<arrow::acero::BatchesWithCommonSchema>>(arrow::internal::FnOnce<arrow::Future<arrow::acero::BatchesWithCommonSchema> (arrow::internal::Executor*)>) thread_pool.h:300
    #38 0x10757d4a8 in arrow::Future<arrow::acero::BatchesWithCommonSchema>::SyncType arrow::internal::RunSynchronously<arrow::Future<arrow::acero::BatchesWithCommonSchema>, arrow::acero::BatchesWithCommonSchema>(arrow::internal::FnOnce<arrow::Future<arrow::acero::BatchesWithCommonSchema> (arrow::internal::Executor*)>, bool) thread_pool.h:590
    #39 0x10757d16b in arrow::acero::DeclarationToExecBatches(arrow::acero::Declaration, bool, arrow::MemoryPool*, arrow::compute::FunctionRegistry*) exec_plan.cc:878
    #40 0x1040d9e0e in arrow::acero::HashJoinWithExecPlan(arrow::acero::Random64Bit&, bool, arrow::acero::HashJoinNodeOptions const&, std::__1::shared_ptr<arrow::Schema> const&, std::__1::vector<std::__1::shared_ptr<arrow::Array>, std::__1::allocator<std::__1::shared_ptr<arrow::Array>>> const&, std::__1::vector<std::__1::shared_ptr<arrow::Array>, std::__1::allocator<std::__1::shared_ptr<arrow::Array>>> const&, int, int) hash_join_node_test.cc:920
    #41 0x1040e5b0a in arrow::acero::HashJoin_Random_Test::TestBody() hash_join_node_test.cc:1154
    #42 0x10596481f in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)+0x47 (libarrow_testing.1600.0.0.dylib:x86_64+0x3e881f)
    #43 0x10596478b in testing::Test::Run()+0xc5 (libarrow_testing.1600.0.0.dylib:x86_64+0x3e878b)
    #44 0x105965336 in testing::TestInfo::Run()+0x11e (libarrow_testing.1600.0.0.dylib:x86_64+0x3e9336)
    #45 0x105965cda in testing::TestSuite::Run()+0x1d4 (libarrow_testing.1600.0.0.dylib:x86_64+0x3e9cda)
    #46 0x105970744 in testing::internal::UnitTestImpl::RunAllTests()+0x34c (libarrow_testing.1600.0.0.dylib:x86_64+0x3f4744)
    #47 0x1059702df in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*)+0x47 (libarrow_testing.1600.0.0.dylib:x86_64+0x3f42df)
    #48 0x10597026c in testing::UnitTest::Run()+0x68 (libarrow_testing.1600.0.0.dylib:x86_64+0x3f426c)
    #49 0x1042fea62 in main+0x41 (arrow-acero-hash-join-node-test:x86_64+0x100340a62)
    #50 0x7ff809a61385 in start+0x795 (dyld:x86_64+0xfffffffffff5c385)

0x621000067a44 is located 4 bytes after 4416-byte region [0x621000066900,0x621000067a40)
allocated by thread T0 here:
    #0 0x105e42573 in wrap_posix_memalign+0xb3 (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0xde573)
    #1 0x115086d42 in arrow::(anonymous namespace)::SystemAllocator::AllocateAligned(long long, long long, unsigned char**) memory_pool.cc:323
    #2 0x11508b99b in arrow::BaseMemoryPoolImpl<arrow::(anonymous namespace)::SystemAllocator>::Allocate(long long, long long, unsigned char**) memory_pool.cc:465
    #3 0x115093d53 in arrow::PoolBuffer::Reserve(long long) memory_pool.cc:867
    #4 0x115093215 in arrow::PoolBuffer::Resize(long long, bool) memory_pool.cc:891
    #5 0x11507c129 in arrow::Result<std::__1::unique_ptr<arrow::ResizableBuffer, std::__1::default_delete<arrow::ResizableBuffer>>> arrow::(anonymous namespace)::ResizePoolBuffer<std::__1::unique_ptr<arrow::ResizableBuffer, std::__1::default_delete<arrow::ResizableBuffer>>, std::__1::unique_ptr<arrow::PoolBuffer, std::__1::default_delete<arrow::PoolBuffer>>>(std::__1::unique_ptr<arrow::PoolBuffer, std::__1::default_delete<arrow::PoolBuffer>>&&, long long) memory_pool.cc:931
    #6 0x11507bef5 in arrow::AllocateResizableBuffer(long long, long long, arrow::MemoryPool*) memory_pool.cc:957
    #7 0x1040029b1 in arrow::BufferBuilder::Resize(long long, bool) buffer_builder.h:78
    #8 0x10560a7c4 in arrow::BufferBuilder::Reserve(long long) buffer_builder.h:98
    #9 0x105609fdb in arrow::TypedBufferBuilder<unsigned char, void>::Reserve(long long) buffer_builder.h:291
    #10 0x116a4207a in arrow::Status arrow::compute::internal::(anonymous namespace)::FSBSelectionImpl::GenerateOutput<arrow::compute::internal::(anonymous namespace)::Selection<arrow::compute::internal::(anonymous namespace)::FSBSelectionImpl, arrow::FixedSizeBinaryType>::TakeAdapter<unsigned int>>() vector_selection_internal.cc:581
    #11 0x116a40460 in arrow::compute::internal::(anonymous namespace)::Selection<arrow::compute::internal::(anonymous namespace)::FSBSelectionImpl, arrow::FixedSizeBinaryType>::ExecTake() vector_selection_internal.cc:449
    #12 0x1169dbcf9 in arrow::Status arrow::compute::internal::(anonymous namespace)::TakeExec<arrow::compute::internal::(anonymous namespace)::FSBSelectionImpl>(arrow::compute::KernelContext*, arrow::compute::ExecSpan const&, arrow::compute::ExecResult*) vector_selection_internal.cc:934
    #13 0x1169db94f in arrow::compute::internal::FSBTakeExec(arrow::compute::KernelContext*, arrow::compute::ExecSpan const&, arrow::compute::ExecResult*) vector_selection_internal.cc:949
    #14 0x1162bb1bf in arrow::compute::detail::(anonymous namespace)::VectorExecutor::Exec(arrow::compute::ExecSpan const&, arrow::compute::detail::ExecListener*) exec.cc:1109
    #15 0x1162b95b9 in arrow::compute::detail::(anonymous namespace)::VectorExecutor::Execute(arrow::compute::ExecBatch const&, arrow::compute::detail::ExecListener*) exec.cc:1064
    #16 0x11638b747 in arrow::compute::detail::FunctionExecutorImpl::Execute(std::__1::vector<arrow::Datum, std::__1::allocator<arrow::Datum>> const&, long long) function.cc:277
    #17 0x116366577 in arrow::compute::(anonymous namespace)::ExecuteInternal(arrow::compute::Function const&, std::__1::vector<arrow::Datum, std::__1::allocator<arrow::Datum>>, long long, arrow::compute::FunctionOptions const*, arrow::compute::ExecContext*) function.cc:342
    #18 0x116365b94 in arrow::compute::Function::Execute(std::__1::vector<arrow::Datum, std::__1::allocator<arrow::Datum>> const&, arrow::compute::FunctionOptions const*, arrow::compute::ExecContext*) const function.cc:349
    #19 0x116286992 in arrow::compute::CallFunction(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::vector<arrow::Datum, std::__1::allocator<arrow::Datum>> const&, arrow::compute::FunctionOptions const*, arrow::compute::ExecContext*) exec.cc:1369
    #20 0x116ac88dd in arrow::compute::internal::(anonymous namespace)::TakeAA(std::__1::shared_ptr<arrow::ArrayData> const&, std::__1::shared_ptr<arrow::ArrayData> const&, arrow::compute::TakeOptions const&, arrow::compute::ExecContext*) vector_selection_take_internal.cc:651
    #21 0x116ac7bc8 in arrow::compute::internal::(anonymous namespace)::TakeMetaFunction::ExecuteImpl(std::__1::vector<arrow::Datum, std::__1::allocator<arrow::Datum>> const&, arrow::compute::FunctionOptions const*, arrow::compute::ExecContext*) const vector_selection_take_internal.cc:782
    #22 0x11636a1ce in arrow::compute::MetaFunction::Execute(std::__1::vector<arrow::Datum, std::__1::allocator<arrow::Datum>> const&, arrow::compute::FunctionOptions const*, arrow::compute::ExecContext*) const function.cc:482
    #23 0x116286992 in arrow::compute::CallFunction(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::vector<arrow::Datum, std::__1::allocator<arrow::Datum>> const&, arrow::compute::FunctionOptions const*, arrow::compute::ExecContext*) exec.cc:1369
    #24 0x1161da571 in arrow::compute::Take(arrow::Datum const&, arrow::Datum const&, arrow::compute::TakeOptions const&, arrow::compute::ExecContext*) api_vector.cc:372
    #25 0x1040c826e in arrow::acero::TakeUsingVector(arrow::compute::ExecContext*, std::__1::vector<std::__1::shared_ptr<arrow::Array>, std::__1::allocator<std::__1::shared_ptr<arrow::Array>>> const&, std::__1::vector<int, std::__1::allocator<int>>, std::__1::vector<std::__1::shared_ptr<arrow::Array>, std::__1::allocator<std::__1::shared_ptr<arrow::Array>>>*) hash_join_node_test.cc:458
    #26 0x1040cf90e in arrow::acero::GenRandomJoinTables(arrow::compute::ExecContext*, arrow::acero::Random64Bit&, int, int, int, int, int, arrow::acero::RandomDataTypeVector const&, arrow::acero::RandomDataTypeVector const&, arrow::acero::RandomDataTypeVector const&, std::__1::vector<int, std::__1::allocator<int>>*, std::__1::vector<int, std::__1::allocator<int>>*, std::__1::vector<std::__1::shared_ptr<arrow::Array>, std::__1::allocator<std::__1::shared_ptr<arrow::Array>>>*, std::__1::vector<std::__1::shared_ptr<arrow::Array>, std::__1::allocator<std::__1::shared_ptr<arrow::Array>>>*) hash_join_node_test.cc:617
    #27 0x1040e38b2 in arrow::acero::HashJoin_Random_Test::TestBody() hash_join_node_test.cc:1071
    #28 0x10596481f in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)+0x47 (libarrow_testing.1600.0.0.dylib:x86_64+0x3e881f)
    #29 0x10596478b in testing::Test::Run()+0xc5 (libarrow_testing.1600.0.0.dylib:x86_64+0x3e878b)

SUMMARY: AddressSanitizer: heap-buffer-overflow key_hash_avx2.cc:158 in long long vector[4] arrow::compute::Hashing32::ProcessStripes_avx2<true>(long long, long long, long long vector[4], unsigned char const*, long long, long long)
Shadow bytes around the buggy address:
  0x621000067780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x621000067800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x621000067880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x621000067900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x621000067980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x621000067a00: 00 00 00 00 00 00 00 00[fa]fa fa fa fa fa fa fa
  0x621000067a80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x621000067b00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x621000067b80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x621000067c00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x621000067c80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==85786==ABORTING
[1]    85786 abort      ./debug/arrow-acero-hash-join-node-test --gtest_filter=HashJoin.Random
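
The report above shows a 16-byte read at an address 4 bytes past the end of a 4416-byte allocation. As a rough illustration only (this is not Arrow's code and not necessarily how the issue should be fixed), the sketch below shows one common way to guard a wide load near the end of a buffer: check whether the load fits, and otherwise copy the remaining bytes into a zero-padded scratch block. The names `LoadFits` and `LoadTail16` are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true if a `width`-byte load starting at `offset`
// stays entirely inside a buffer of `size` bytes.
inline bool LoadFits(std::size_t offset, std::size_t width, std::size_t size) {
  return offset <= size && width <= size - offset;
}

// Hypothetical helper: fills `out` with the bytes available at `offset`
// (at most 16) and pads the rest with zeros, so no byte outside
// [data, data + size) is ever read.
inline void LoadTail16(const std::uint8_t* data, std::size_t size,
                       std::size_t offset, std::uint8_t out[16]) {
  assert(out != nullptr);
  std::memset(out, 0, 16);
  if (offset >= size) return;  // nothing left to read
  const std::size_t avail = size - offset;
  std::memcpy(out, data + offset, avail < 16 ? avail : 16);
}
```

With the numbers from the report, `LoadFits(4420, 16, 4416)` is false (assuming the reported address is where the 16-byte access starts), whereas a load at offset 4400 would be the last one that fits.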

Component(s)

C++
