[R] printing data in Table/RecordBatch print method #32110

@asfimport

Related to ARROW-16776. After a brief discussion, Neal Richardson asked me to split the improvement request into separate issues.

When working with Arrow datasets/tables, I often find myself wanting to interactively print or "see" the results of a query or the first few rows of the data without having to fully collect into memory.

It would be ideal for the Table/RecordBatch print methods to lazily print some data. Currently, they show only the schema, with no data.

For example:

library(dplyr)
library(arrow)

mtcars %>% arrow::write_parquet("mtcars.parquet")
car_ds <- arrow::open_dataset("mtcars.parquet")

car_ds
#> FileSystemDataset with 1 Parquet file
#> mpg: double
#> cyl: double
#> disp: double
#> hp: double
#> drat: double
#> wt: double
#> qsec: double
#> vs: double
#> am: double
#> gear: double
#> carb: double
#> 
#> See $metadata for additional Schema metadata

car_ds %>%
  compute()
#> Table
#> 32 rows x 11 columns
#> $mpg <double>
#> $cyl <double>
#> $disp <double>
#> $hp <double>
#> $drat <double>
#> $wt <double>
#> $qsec <double>
#> $vs <double>
#> $am <double>
#> $gear <double>
#> $carb <double>
#> 
#> See $metadata for additional Schema metadata
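
As a stopgap (not part of the original request), a few rows can be previewed by limiting the query before collecting. This is a minimal sketch, assuming that `head()` on a Dataset followed by `collect()` brings only the requested rows into R; the temporary file path and the name `preview` are illustrative.

```r
library(dplyr)
library(arrow)

# Write the example data to a temporary Parquet file (path is illustrative)
path <- tempfile(fileext = ".parquet")
write_parquet(mtcars, path)
car_ds <- open_dataset(path)

# Limit to the first few rows, then collect only those rows into a data frame
preview <- car_ds %>%
  head(5) %>%
  collect()

print(preview)
```

This still needs an explicit `collect()` call, which is the extra step that a data-aware print method would remove.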

Reporter: Thomas Mock

Note: This issue was originally created as ARROW-16777. Please see the migration documentation for further details.
