[C++] joins segfault when data contains list column #30074

@asfimport

The R code below segfaults when one of the tables being joined contains a list column.

library(arrow)
library(dplyr)

basic_tbl <- arrow_table(
  tibble::tibble(
    x = 1:3,
    y = c("a", "b", "c")
  )
)

basic_tbl2 <- arrow_table(
  tibble::tibble(
    x = 1:3,
    z = c(TRUE, FALSE, TRUE)
  )
)

list_tbl <- arrow_table(
  tibble::tibble(
    z = list(c("first", "list", "col", "row"), c("second row ", "here")),
    x = 1:2
  )
)

# works
left_join(basic_tbl, basic_tbl2) %>%
  collect()

# segfaults
left_join(basic_tbl, list_tbl) %>%
  collect()
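
A possible workaround, sketched here and not confirmed in this report: collect both tables into R first, so that dplyr performs the join on tibbles instead of the Arrow C++ join. This assumes the data fits in memory and that the crash is limited to the Arrow join path; the variable name `joined` is only illustrative.

```r
library(arrow)
library(dplyr)

basic_tbl <- arrow_table(
  tibble::tibble(
    x = 1:3,
    y = c("a", "b", "c")
  )
)

list_tbl <- arrow_table(
  tibble::tibble(
    z = list(c("first", "list", "col", "row"), c("second row ", "here")),
    x = 1:2
  )
)

# Assumption: collecting first converts each Arrow Table to a tibble,
# so left_join() dispatches to dplyr's data frame method and the
# Arrow C++ join is not involved.
joined <- left_join(collect(basic_tbl), collect(list_tbl), by = "x")
```

This gives up Arrow's lazy evaluation for the join, so it is only a stopgap for tables small enough to hold in R.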

Reporter: Nicola Crane / @thisisnic
Assignee: David Li / @lidavidm

Note: This issue was originally created as ARROW-14519. Please see the migration documentation for further details.
