Description
For scalar functions, ExecBatchIterator is used to iterate over batches in smaller units. It is implemented by calling Array::slice(). For batches that already fit within the slice size, this is unnecessary, since only one slice is created. The slice operation still causes some overhead by copying the shared_ptrs of the ArrayData object, including the type pointer, which can lead to contention (ARROW-16161).
This patch first checks whether the batch size is smaller than the slice size and, if so, uses std::move instead of slicing.
I have attached a comparison of the ExecuteScalarExpressionOverhead benchmark here: avoid-slicing-performance.txt
(Created with --benchmark_min_time=20. The standard, shorter runtime tends to be noisy for this benchmark, but it also shows a positive trend.)
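The idea of moving instead of slicing can be illustrated with a minimal, self-contained sketch. This is not the actual Arrow code: ToyArray and FirstChunk are hypothetical stand-ins for ArrayData and the iterator logic, assumed here only to show why the move avoids the shared_ptr reference count updates that a slice incurs.

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for ArrayData: a shared type pointer and a shared
// value buffer, plus the offset/length window a slice refers to.
struct ToyArray {
  std::shared_ptr<const std::string> type;
  std::shared_ptr<const std::vector<int>> values;
  int64_t offset;
  int64_t length;

  // Zero-copy slice: the data is not copied, but both shared_ptrs are,
  // which costs one atomic reference count increment each.
  ToyArray Slice(int64_t off, int64_t len) const {
    return ToyArray{type, values, offset + off, len};
  }
};

// Hypothetical helper: returns the first chunk of at most max_chunksize
// elements. If the whole batch already fits, it is moved out, so no
// reference counts are touched; otherwise a slice is created.
ToyArray FirstChunk(ToyArray&& batch, int64_t max_chunksize) {
  if (batch.length <= max_chunksize) {
    return std::move(batch);
  }
  return batch.Slice(0, max_chunksize);
}
```

In this sketch, the move path leaves the reference counts of type and values unchanged, while the slice path increments both. With many threads sharing the same type pointer, those increments are the kind of contention the description refers to.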
Reporter: Tobias Zagorni / @zagto
Assignee: Tobias Zagorni / @zagto
Related issues:
Original Issue Attachments:
PRs and other links:
Note: This issue was originally created as ARROW-16562. Please see the migration documentation for further details.