Optimize set-associative nested function with Eq Kernel #12163

@jayzhan211

Description

Is your feature request related to a problem or challenge?

We found that computing with the Eq kernel is much faster than using RowConverter for array_has.

Other functions could potentially be sped up with the Eq kernel as well:

  • array_has_all
  • array_has_any
  • array_intersect
  • array_distinct (Maybe 🤔 ?)
  • array_except (Maybe 🤔 ?)

Describe the solution you'd like

The overall idea is to flatten the left-hand-side list, iterate over the right-hand-side elements, and apply the Eq kernel once per element (see #12062). We can tell whether an element is in the left-hand-side list by checking the true_count of the resulting boolean array (and its null_count for null handling). The Eq kernel is vectorized, which is the key to the speedup.
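
As a rough illustration of the membership check described above, here is a std-only Rust sketch. It does not use arrow-rs: `eq_scalar`, `true_count`, `null_count`, and `contains` are hypothetical stand-ins for the real kernel and array methods, and the null policy (a miss is reported as unknown when any comparison was null) is an assumption, not something this issue specifies.

```rust
/// Stand-in for a vectorized equality kernel: compares every slot of
/// `haystack` with one `needle`. A null (None) on either side gives a
/// null result for that slot.
fn eq_scalar(haystack: &[Option<i64>], needle: Option<i64>) -> Vec<Option<bool>> {
    haystack
        .iter()
        .map(|v| match (v, needle) {
            (Some(a), Some(b)) => Some(*a == b),
            _ => None,
        })
        .collect()
}

/// Number of slots that compared equal.
fn true_count(mask: &[Option<bool>]) -> usize {
    mask.iter().filter(|m| **m == Some(true)).count()
}

/// Number of slots whose comparison result is null.
fn null_count(mask: &[Option<bool>]) -> usize {
    mask.iter().filter(|m| m.is_none()).count()
}

/// Membership test built from the two counts.
/// Assumed policy: Some(true) on a hit, None if there was no hit but some
/// comparison was null, Some(false) otherwise.
fn contains(haystack: &[Option<i64>], needle: Option<i64>) -> Option<bool> {
    let mask = eq_scalar(haystack, needle);
    if true_count(&mask) > 0 {
        Some(true)
    } else if null_count(&mask) > 0 {
        None
    } else {
        Some(false)
    }
}
```

The point of the sketch is that membership is decided from two counts over one mask, so the per-element work stays inside a single bulk comparison.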

Example

array_has_all

array_has_all([1,2,3], [1,2]) -> true
Iterate over [1,2]: compare [1,2,3] with 1, then [1,2,3] with 2. We get [true,false,false] and [false,true,false]. Both boolean arrays contain a true, so the result is true.
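
The array_has_all walk-through can be sketched in plain Rust as follows. This is only a model: `eq_mask` is a hypothetical stand-in for the Eq kernel, nulls are ignored, and the function operates on a single pair of lists rather than on whole columns.

```rust
/// Stand-in for the Eq kernel: one boolean per element of `list`.
fn eq_mask(list: &[i64], needle: i64) -> Vec<bool> {
    list.iter().map(|v| *v == needle).collect()
}

/// True when every needle produces a mask with at least one `true`.
fn array_has_all(list: &[i64], needles: &[i64]) -> bool {
    needles.iter().all(|needle| {
        let mask = eq_mask(list, *needle);
        // Equivalent of checking true_count > 0 on the boolean array.
        mask.iter().filter(|hit| **hit).count() > 0
    })
}
```

With these definitions, `array_has_all(&[1, 2, 3], &[1, 2])` evaluates to `true` and `array_has_all(&[1, 2, 3], &[1, 4])` to `false`.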

array_has_any

array_has_any([1,2,3], [1,4]) -> true
Iterate over [1,4]: compare [1,2,3] with 1, then [1,2,3] with 4. We get [true,false,false] and [false,false,false]. Since the first boolean array contains a true, the result is true.
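
For array_has_any only one mask needs to contain a true, so the loop can stop at the first hit. A minimal sketch under the same assumptions (hypothetical `eq_mask` stand-in for the Eq kernel, no nulls, one pair of lists):

```rust
/// Stand-in for the Eq kernel: one boolean per element of `list`.
fn eq_mask(list: &[i64], needle: i64) -> Vec<bool> {
    list.iter().map(|v| *v == needle).collect()
}

/// True as soon as one needle produces a mask containing `true`.
fn array_has_any(list: &[i64], needles: &[i64]) -> bool {
    for needle in needles {
        let mask = eq_mask(list, *needle);
        if mask.iter().any(|hit| *hit) {
            // Early exit: the remaining needles do not need to be compared.
            return true;
        }
    }
    false
}
```

For `array_has_any(&[1, 2, 3], &[1, 4])` the first needle already matches, so the comparison against 4 is never run.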

array_intersect

array_intersect([1,2,3], [1,2]) -> [1,2]
The idea is the same as above: we know that both elements are contained in the list. The difference is that we need to return an array instead of a boolean. We could build the expected array with MutableArrayData.
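
The intersect case could look roughly like the sketch below. It is std-only Rust: a `Vec` plays the role of the output builder, and copying a one-element index range from the source mimics how a range-copying builder such as MutableArrayData would be fed. `eq_mask` is a hypothetical stand-in for the Eq kernel, and dropping repeated values from the output is an assumption about the desired semantics.

```rust
/// Stand-in for the Eq kernel: one boolean per element of `list`.
fn eq_mask(list: &[i64], needle: i64) -> Vec<bool> {
    list.iter().map(|v| *v == needle).collect()
}

/// Keeps each right-hand element that also occurs on the left.
fn array_intersect(left: &[i64], right: &[i64]) -> Vec<i64> {
    let mut out: Vec<i64> = Vec::new();
    for (i, needle) in right.iter().enumerate() {
        let found_in_left = eq_mask(left, *needle).iter().any(|hit| *hit);
        // Assumption: each value appears at most once in the result.
        let already_emitted = eq_mask(&out, *needle).iter().any(|hit| *hit);
        if found_in_left && !already_emitted {
            // Copy the index range [i, i + 1) from the source list.
            out.extend_from_slice(&right[i..i + 1]);
        }
    }
    out
}
```

Collecting index ranges first and copying them afterwards would allow adjacent hits to be merged into one larger copy; the sketch copies element by element to stay short.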

I'm not sure about array_distinct and array_except, but they are worth investigating as well.

For implementation details, the array_has implementation can be used as a reference.

Describe alternatives you've considered

No response

Additional context

No response

Metadata

Labels: enhancement (New feature or request)
