array_remove with structs causes error at runtime #1307

@andygrove

Describe the bug

I added the following test to the existing "array_remove" test in CometExpressionSuite:

        sql("SELECT array(struct(_1, _2)) as a, struct(_1, _2) as b FROM t1")
          .createOrReplaceTempView("t2")
        checkSparkAnswerAndOperator(sql("SELECT array_remove(a, b) FROM t2"))

The query fails at runtime:

org.apache.comet.CometNativeException: Invalid argument error: Nested comparison: Struct([Field { name: "_1", data_type: Boolean, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "_2", data_type: Int8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]) IS DISTINCT FROM Struct([Field { name: "_1", data_type: Boolean, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "_2", data_type: Int8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]) (hint: use make_comparator instead)

Steps to reproduce

No response

Expected behavior

We should only attempt to run expressions natively when we know that the input types are supported. This is a general problem that we have and is not specific to array_remove.

I am going to create a PR to fix this specific issue and to suggest a general approach for avoiding this kind of issue.
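To illustrate the kind of check described above, here is a minimal sketch of gating native execution on the input types. It is not Comet's actual code: the `DataType` hierarchy is a simplified stand-in for Spark's types, and `ArrayRemoveSupport`, `isElementTypeSupported`, and `canRunNatively` are hypothetical names. The set of element types treated as supported is also an assumption made for the example.

```scala
// Simplified stand-in for Spark's type hierarchy (hypothetical, for illustration only).
sealed trait DataType
case object BooleanType extends DataType
case object ByteType extends DataType
case object IntType extends DataType
case object StringType extends DataType
final case class ArrayType(elementType: DataType) extends DataType
final case class StructType(fields: Seq[(String, DataType)]) extends DataType

object ArrayRemoveSupport {

  // Assumption: only flat, primitive element types are treated as supported.
  // Nested types (arrays, structs) are rejected so they never reach native code.
  def isElementTypeSupported(dt: DataType): Boolean = dt match {
    case BooleanType | ByteType | IntType | StringType => true
    case _: ArrayType | _: StructType => false
  }

  // Decide at planning time whether array_remove(array, element) may run natively.
  // Returning false means the caller should fall back to Spark for this expression.
  def canRunNatively(arrayType: DataType, elementType: DataType): Boolean =
    arrayType match {
      case ArrayType(e) => e == elementType && isElementTypeSupported(e)
      case _ => false
    }
}
```

With a check like this, the array-of-structs case from the test above would be rejected before execution and fall back to Spark, rather than failing at runtime with the nested comparison error.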

Additional context

No response

Metadata

Labels: bug
