Should we deprecate gather_statistics in read_parquet? #8937

@jcrist

From @rjzamora in #8899 (comment):

but my preference moving forward may be to deprecate gather_statistics as a public option. Right now, the only time the user should ever specify gather_statistics is when they are doing it to insist on gather_statistics=False. For the case you are highlighting here, the suggested argument is split_row_groups=True (which doesn't actually require any row-group statistics to be gathered, but does require the parquet metadata to be parsed, as you said).
