[C++] Support joining tables with non-key fields as list #32504

@asfimport

Description

I am trying to join two Arrow tables in which some columns are of the list<float> data type. Note that my join keys are all primitive data types; only some of my non-key columns are list<float>. However, PyArrow's join() cannot join such tables, although pandas can. It fails with

ArrowInvalid: Data type list<item: float> is not supported in join non-key field

when I execute this piece of code

joined_table = table_1.join(table_2, ['k1', 'k2', 'k3'])

A Stack Overflow response pointed out that Arrow currently cannot handle non-fixed types for joins. Can this be fixed? Or is this intentional?

Reporter: Jayjeet Chakraborty / @JayjeetAtGithub

Note: This issue was originally created as ARROW-17216. Please see the migration documentation for further details.
