Description
Is your feature request related to a problem or challenge?
I am trying to create a scalar UDF, pack, which produces struct arrays. It packs several arrays into a single struct array, giving each one a distinct field name:
pack(("a", arr1), ("b", arr2), ...) -> struct([("a", arr1.data_type), ("b", arr2.data_type), ...])
The return type depends on the input types and their nullability. In ScalarUDFImpl::invoke I want to return a struct array in which each field has the data type and nullability of the corresponding input. However, invoke only receives the evaluated arrays, which expose their data types but not the nullability declared by the record batch schema or by intermediate child expressions.
I already return this type information from return_type_from_exprs; I just need a way to access it from the stateless scalar UDF implementation.
Describe the solution you'd like
I would like to add a new method, ScalarUDFImpl::invoke_with_data_type (or invoke_with_return_type), which receives the evaluated child arrays (as invoke does today) plus either the type previously returned from return_type_from_exprs, or the arguments that were passed to return_type_from_exprs so that invoke can recompute the type. I am open to either, though I expect the former to be more performant.
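To make the first option concrete, here is a rough sketch of the shape I have in mind. It uses simplified stand-in types (DataType, Field, Array) and a stand-in trait, ScalarUdfSketch, rather than the real Arrow and DataFusion definitions, so every signature below is an assumption for illustration only. In particular, the real return_type_from_exprs takes expressions and a schema, which I have reduced to a slice of fields here. The new method has a default body that delegates to invoke, which is why I think the change can be non-breaking.

```rust
// Simplified stand-ins for the Arrow/DataFusion types. These are NOT the real APIs.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    Utf8,
    Struct(Vec<Field>),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

/// Stand-in for an evaluated argument: it knows its data type and length,
/// but not the nullability declared by the schema.
#[derive(Debug, Clone, PartialEq)]
struct Array {
    data_type: DataType,
    len: usize,
}

/// Hypothetical, cut-down version of the UDF trait.
trait ScalarUdfSketch {
    /// Planning time: field names and nullability are still available here.
    fn return_type_from_exprs(&self, args: &[Field]) -> DataType;

    /// Existing entry point: only the evaluated arrays are passed in.
    fn invoke(&self, args: &[Array]) -> Result<Array, String>;

    /// Proposed entry point. The default delegates to `invoke`, so existing
    /// implementations keep compiling and behaving as before.
    fn invoke_with_return_type(
        &self,
        args: &[Array],
        _return_type: &DataType,
    ) -> Result<Array, String> {
        self.invoke(args)
    }
}

struct Pack;

impl ScalarUdfSketch for Pack {
    fn return_type_from_exprs(&self, args: &[Field]) -> DataType {
        // One struct field per argument, preserving name and nullability.
        DataType::Struct(args.to_vec())
    }

    fn invoke(&self, _args: &[Array]) -> Result<Array, String> {
        // Without the planned type, field names and nullability are unknown.
        Err("pack needs the planned return type".to_string())
    }

    fn invoke_with_return_type(
        &self,
        args: &[Array],
        return_type: &DataType,
    ) -> Result<Array, String> {
        let fields = match return_type {
            DataType::Struct(fields) => fields,
            other => return Err(format!("expected a struct type, got {:?}", other)),
        };
        if fields.len() != args.len() {
            return Err("argument count does not match the planned fields".to_string());
        }
        let len = args.first().map(|a| a.len).unwrap_or(0);
        for (field, arg) in fields.iter().zip(args) {
            if field.data_type != arg.data_type || arg.len != len {
                return Err(format!("argument for field {} does not match", field.name));
            }
        }
        // The result reuses the planned type, so nullability is preserved.
        Ok(Array { data_type: return_type.clone(), len })
    }
}
```

With this shape, the executor would call invoke_with_return_type with the type it already computed during planning, and pack could build its output without re-deriving anything. UDFs that do not care about the return type would not need to change.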
Describe alternatives you've considered
No response
Additional context
I believe this would be a small, non-breaking change, and I am happy to contribute it.
Any ideas?