[C++] Enable joins when data contains a list column #31180
Closed as not planned
Labels
Component: C++
Status: stale-warning (issues and PRs flagged as stale which are due to be closed if no indication otherwise)
Type: enhancement
Description
Currently, an Arrow join fails with an error when the data contains a list column, even when the list column is not a join key. Here's an example using the R bindings:
library(arrow)
library(dplyr)

jedi <- data.frame(name = c("C-3PO", "Luke Skywalker"),
                   jedi = c(FALSE, TRUE))

arrow_table(starwars) %>%
  left_join(jedi) %>%
  collect()
#> Error in `handle_csv_read_error()`:
#> ! Invalid: Data type list<item: string> is not supported in join non-key field

The ability to join would be a useful enhancement for workflows with tabular data where list columns can be common, and for geospatial workflows where geometry columns are stored as list or fixed_size_list (thanks @paleolimbot for mentioning that use case).
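Until non-key list columns are supported, one possible workaround is to drop the list columns before the Arrow join and reattach them in R after collect(). The following is only a sketch under stated assumptions: it assumes `name` uniquely identifies rows in `starwars`, and `list_cols`, `joined`, and `result` are illustrative names, not part of any Arrow API. It does not address the underlying C++ limitation.

```r
library(arrow)
library(dplyr)

jedi <- data.frame(name = c("C-3PO", "Luke Skywalker"),
                   jedi = c(FALSE, TRUE))

# Find the list columns that the Arrow join cannot carry as non-key fields.
list_cols <- names(starwars)[vapply(starwars, is.list, logical(1))]

# Run the join in Arrow on the non-list columns only.
joined <- starwars %>%
  select(-all_of(list_cols)) %>%
  arrow_table() %>%
  left_join(jedi, by = "name") %>%
  collect()

# Reattach the list columns in R, keyed on `name`
# (assumption: `name` is unique in `starwars`).
result <- joined %>%
  left_join(select(starwars, name, all_of(list_cols)), by = "name")
```

The second join runs in dplyr on in-memory data, so this only helps when the collected result fits in memory.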
Related discussion here: ARROW-14519
Reporter: Stephanie Hazlitt / @stephhazlitt
Related issues:
- [C++] joins segfault when data contains list column (relates to)
Note: This issue was originally created as ARROW-15731. Please see the migration documentation for further details.