Comet named_struct fails on duplicate field names #1015

@viirya

Describe the bug

Spark's named_struct expression doesn't forbid duplicate field names, and Comet's named_struct implementation follows Spark.
However, when Java Arrow imports arrays, it binds them by field name. So a struct array with duplicate field names causes an error like this to be thrown:

  org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 5959.0 failed 1 times, most recent failure: Lost task 0.0 in stage 5959.0 (TID 15003) (192.168.86.44 executor driver): java.lang.IllegalStateException: ArrowArray struct has 2 children (expected 1)
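
To illustrate why binding by field name can produce a child-count mismatch like the one above, here is a minimal toy model in plain Python. It is not Arrow's import code: `Field` and `bind_children_by_name` are hypothetical names made up for this sketch, and the assumption is only that children are keyed by name, so two children with the same name collapse into one entry.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Field:
    # Hypothetical stand-in for one child array of a struct.
    name: str
    values: List[Any]


def bind_children_by_name(children: List[Field]) -> Dict[str, List[Any]]:
    """Toy importer: key each child array by its field name.

    Duplicate names overwrite each other, so fewer entries are bound
    than there are children. This models the idea only; it is not how
    Arrow is actually implemented.
    """
    bound: Dict[str, List[Any]] = {}
    for child in children:
        bound[child.name] = child.values
    if len(bound) != len(children):
        raise ValueError(
            f"struct has {len(children)} children (expected {len(bound)})"
        )
    return bound


# Unique names bind cleanly.
print(bind_children_by_name([Field("a", [1]), Field("b", [2])]))

# Two fields both named "a" collapse into one entry and the counts disagree.
try:
    bind_children_by_name([Field("a", [1]), Field("a", [2])])
except ValueError as err:
    print(err)  # struct has 2 children (expected 1)
```

In this model, a struct with two fields both named `a` has two children, but only one name-keyed slot, which mirrors the "2 children (expected 1)" wording in the reported exception.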

Steps to reproduce

No response

Expected behavior

No response

Additional context

No response

Labels: bug

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions