Access children DataType or return-type in ScalarUDFImpl::invoke #12819

@joseph-isaacs

Description

Is your feature request related to a problem or challenge?

I am trying to create a scalar UDF, pack, which operates on struct arrays. It packs several arrays into a single struct array, giving each field a distinct name:

pack(("a", arr1), ("b", arr2), ...) -> struct([("a", arr1.data_type), ("b", arr2.data_type), ...])

The return data type depends on the input types and their nullability. In ScalarUDFImpl::invoke I want to return a struct array in which each field has the data type and nullability of the corresponding input. However, invoke only gives access to the data types of the evaluated arrays, not to the nullability from the record batch or from intermediate child expressions.
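To illustrate the gap described above, here is a minimal sketch using simplified stand-in types. `DataType`, `Field`, `pack_return_type`, and `pack_type_from_arrays_only` are hypothetical names invented for this example; they are not the arrow-rs or DataFusion APIs. The sketch only shows that a type rebuilt from data types alone can differ from the type computed at planning time.

```rust
// Simplified stand-ins for Arrow-style types. These are assumptions made
// for illustration, NOT the real arrow-rs `DataType` / `Field`.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Utf8,
    Struct(Vec<Field>),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

/// Hypothetical planning-time helper: builds the struct type `pack` should
/// return when both data type and nullability of each input are known.
fn pack_return_type(args: &[(&str, DataType, bool)]) -> DataType {
    DataType::Struct(
        args.iter()
            .map(|(name, data_type, nullable)| Field {
                name: name.to_string(),
                data_type: data_type.clone(),
                nullable: *nullable,
            })
            .collect(),
    )
}

/// Hypothetical execution-time helper: only data types are available, so
/// nullability has to be guessed (here: conservatively `true`).
fn pack_type_from_arrays_only(args: &[(&str, DataType)]) -> DataType {
    DataType::Struct(
        args.iter()
            .map(|(name, data_type)| Field {
                name: name.to_string(),
                data_type: data_type.clone(),
                nullable: true, // guess; the planned type may say otherwise
            })
            .collect(),
    )
}
```

With a non-nullable input "a", the planned type and the guessed type disagree on that field's nullability, which is the mismatch this issue is about.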

I already return this type information from return_type_from_exprs; I just need to access it in the stateless scalar UDF implementation.

Describe the solution you'd like

I would like to add a new ScalarUDFImpl::invoke_with_data_type (or invoke_with_return_type) method that receives both the evaluated child arrays (as before) and either the type previously returned from return_type_from_exprs or the arguments already passed to return_type_from_exprs, which invoke could then re-evaluate. I am open to either, though I guess the former is more performant.
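A rough sketch of the first option (passing the previously computed return type) might look like the following. Everything here is an assumption made for illustration: `ScalarUdfSketch`, `ArrayData`, `Pack`, and `First` are hypothetical stand-ins, and the signatures are not DataFusion's actual trait. The point is only that a default method delegating to `invoke` would leave existing implementations untouched.

```rust
// Simplified stand-ins; NOT the real arrow-rs / DataFusion types.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Struct(Vec<Field>),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

/// Stand-in for an evaluated argument or result array.
#[derive(Debug, Clone, PartialEq)]
struct ArrayData {
    data_type: DataType,
    children: Vec<ArrayData>,
    len: usize,
}

/// Hypothetical trait modelling the proposal.
trait ScalarUdfSketch {
    fn invoke(&self, args: &[ArrayData]) -> Result<ArrayData, String>;

    /// Proposed addition: the default ignores the return type and delegates
    /// to `invoke`, so implementations written before it keep working.
    fn invoke_with_return_type(
        &self,
        args: &[ArrayData],
        _return_type: &DataType,
    ) -> Result<ArrayData, String> {
        self.invoke(args)
    }
}

/// An existing-style UDF that only implements `invoke`.
struct First;

impl ScalarUdfSketch for First {
    fn invoke(&self, args: &[ArrayData]) -> Result<ArrayData, String> {
        args.first().cloned().ok_or_else(|| "no arguments".to_string())
    }
}

/// `pack` overrides the new method and reuses the planned struct type,
/// including per-field nullability, instead of guessing it.
struct Pack;

impl ScalarUdfSketch for Pack {
    fn invoke(&self, _args: &[ArrayData]) -> Result<ArrayData, String> {
        Err("pack needs the planned return type to set field nullability".to_string())
    }

    fn invoke_with_return_type(
        &self,
        args: &[ArrayData],
        return_type: &DataType,
    ) -> Result<ArrayData, String> {
        let fields = match return_type {
            DataType::Struct(fields) => fields,
            other => return Err(format!("expected a struct type, got {:?}", other)),
        };
        if fields.len() != args.len() {
            return Err("field count does not match argument count".to_string());
        }
        // Each child must have the data type the planner promised.
        for (field, arg) in fields.iter().zip(args) {
            if field.data_type != arg.data_type {
                return Err(format!("type mismatch for field {}", field.name));
            }
        }
        let len = args.first().map_or(0, |a| a.len);
        Ok(ArrayData {
            data_type: return_type.clone(),
            children: args.to_vec(),
            len,
        })
    }
}
```

In this sketch, `First` never mentions the new method and still works through the default, while `Pack` returns an array whose type is exactly the one computed at planning time.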

Describe alternatives you've considered

No response

Additional context

I believe this would be a small, non-breaking change that I am happy to contribute.

Any ideas?

Metadata

Assignees

No one assigned

    Labels

    enhancement (New feature or request)
