Open
Labels
bug (Something isn't working), performance (Make DataFusion faster)
Description
Describe the bug
The ScalarValue::to_array_of_size() API repeats a scalar value k times and converts the result into an array of length k. For a ScalarValue::List whose inner type is Utf8View, it currently deep-copies the Utf8View data buffers.
Note that copying the inner Utf8View array is unavoidable because of the ListArray encoding, but deep-copying the payload buffers can be avoided: only the views need to be copied.
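To illustrate the difference, here is a minimal std-only sketch. It is a toy model, not the arrow-rs or DataFusion implementation: `View`, `ToyViewArray`, `repeat_deep`, and `repeat_shallow` are hypothetical names made up for this example, and the model ignores the inlining of short strings that real Utf8View views perform. It only shows that repeating the views while sharing the payload buffers through `Arc` yields the same values without copying the payload bytes.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for one view: which buffer, where, how long.
#[derive(Clone, Copy, Debug, PartialEq)]
struct View {
    buffer_index: u32,
    offset: u32,
    len: u32,
}

/// Hypothetical stand-in for a Utf8View array:
/// small fixed-size views plus reference-counted payload buffers.
#[derive(Clone, Debug)]
struct ToyViewArray {
    views: Vec<View>,
    buffers: Vec<Arc<Vec<u8>>>,
}

impl ToyViewArray {
    /// Packs all strings into a single payload buffer, one view per string.
    fn from_strs(values: &[&str]) -> Self {
        let mut payload = Vec::new();
        let mut views = Vec::with_capacity(values.len());
        for v in values {
            views.push(View {
                buffer_index: 0,
                offset: payload.len() as u32,
                len: v.len() as u32,
            });
            payload.extend_from_slice(v.as_bytes());
        }
        Self { views, buffers: vec![Arc::new(payload)] }
    }

    /// Resolves view `i` to the string it points at.
    fn value(&self, i: usize) -> &str {
        let v = self.views[i];
        let buf = &self.buffers[v.buffer_index as usize];
        let start = v.offset as usize;
        std::str::from_utf8(&buf[start..start + v.len as usize]).unwrap()
    }

    /// Deep repeat: every repetition gets its own copy of the payload bytes.
    fn repeat_deep(&self, k: usize) -> Self {
        let mut views = Vec::with_capacity(self.views.len() * k);
        let mut buffers = Vec::with_capacity(self.buffers.len() * k);
        for rep in 0..k {
            let base = (rep * self.buffers.len()) as u32;
            for b in &self.buffers {
                // Copies the bytes themselves, not just the pointer.
                buffers.push(Arc::new(b.as_ref().clone()));
            }
            for v in &self.views {
                views.push(View { buffer_index: v.buffer_index + base, ..*v });
            }
        }
        Self { views, buffers }
    }

    /// Shallow repeat: only the views are copied k times;
    /// the payload buffers are shared by bumping their reference counts.
    fn repeat_shallow(&self, k: usize) -> Self {
        let mut views = Vec::with_capacity(self.views.len() * k);
        for _ in 0..k {
            views.extend_from_slice(&self.views);
        }
        Self { views, buffers: self.buffers.clone() }
    }

    /// Total payload bytes owned or referenced by this array.
    fn payload_bytes(&self) -> usize {
        self.buffers.iter().map(|b| b.len()).sum()
    }
}
```

In this model, both methods produce arrays with identical logical contents, but `repeat_deep(k)` holds k copies of the payload while `repeat_shallow(k)` keeps pointing at the original buffers, so only the view vector grows with k.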
See the List implementation in
datafusion/datafusion/common/src/scalar/mod.rs
Line 2877 in 35b2e35
pub fn to_array_of_size(&self, size: usize) -> Result<ArrayRef> {
This caused a performance regression in #18070.
To Reproduce
This is easiest to verify by inspecting the implementation.
Expected behavior
to_array_of_size() should perform a shallow copy of the List elements.
Additional context
No response
alamb