Deprecate "pyarrow-legacy" engine in dask.dataframe.read_parquet #8243

@rjzamora

Description

Given the pyarrow version requirements in Dask, we can now assume the dataset API will be supported if pyarrow is installed. We can also assume that deprecation warnings will be raised by the pyarrow backend if dd.read_parquet(..., engine="pyarrow-legacy") is used. Therefore, I propose that we add an explicit deprecation warning for the "pyarrow-legacy" engine itself, and establish a timeline for its removal.

Deprecating and removing "pyarrow-legacy" should simplify read_parquet maintenance, and should have few (if any) downsides. However, I welcome pushback if others expect this to cause pain or problems.
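For illustration only, here is a minimal sketch of what an explicit warning for the engine name could look like. The helper name `resolve_engine`, the choice of `FutureWarning`, and the message wording are all assumptions made for this example; this is not Dask's actual implementation, and no removal version is implied.

```python
import warnings


def resolve_engine(engine: str) -> str:
    """Hypothetical helper: warn when the legacy engine name is requested.

    The engine name is returned unchanged, so existing code keeps working
    for the duration of the deprecation period.
    """
    if engine == "pyarrow-legacy":
        warnings.warn(
            "engine='pyarrow-legacy' is deprecated and will be removed in a "
            "future release. Please use engine='pyarrow' instead.",
            FutureWarning,  # assumed category for a user-facing deprecation
            stacklevel=2,   # attribute the warning to the caller's line
        )
    return engine
```

On the user side, the migration would presumably amount to passing engine="pyarrow" to dd.read_parquet instead of engine="pyarrow-legacy".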

Labels

dataframe, io, needs attention, parquet
