When combining complex column selections on dataframes, join the selection and push down #6388

@martindurant

We already have an example of column selection pushdown for loading columns from parquet.

  • However, this pushdown fails for a complex selection such as df.col[(df.x > 0) & (df.y < 100) & (df.y > 0)], which should read only three of the source columns (col, x and y), not all of them.
  • We don't do this for other data formats where it may be useful (CSV somewhat, ORC definitely, SQL yes but hard; others?).
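
To make the goal concrete, here is a minimal sketch of the idea of "joining the selection": walk an expression, collect every column it touches, and hand that single combined list to the reader. It is an illustration only, not how Dask optimizes its graphs. The helper `required_columns` is hypothetical, and it inspects the expression's source text with the standard-library `ast` module rather than a task graph. It also naively treats every `df.<name>` as a column, so method calls such as `df.head` would be miscounted.

```python
import ast


def required_columns(expr, frame_name="df"):
    """Return the set of column names that `expr` reads from `frame_name`.

    Hypothetical helper for illustration: recognizes `df.<name>` and
    `df["<name>"]` accesses only (requires Python 3.9+ for the subscript form).
    """
    cols = set()
    for node in ast.walk(ast.parse(expr, mode="eval")):
        # Attribute access directly on the frame, e.g. df.x
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id == frame_name
        ):
            cols.add(node.attr)
        # String-keyed item access directly on the frame, e.g. df["x"]
        elif (
            isinstance(node, ast.Subscript)
            and isinstance(node.value, ast.Name)
            and node.value.id == frame_name
            and isinstance(node.slice, ast.Constant)
            and isinstance(node.slice.value, str)
        ):
            cols.add(node.slice.value)
    return cols


# The complex selection from this issue, with each comparison parenthesized.
expr = "df.col[(df.x > 0) & (df.y < 100) & (df.y > 0)]"
needed = sorted(required_columns(expr))
print(needed)
```

The combined list is what one would want pushed down to the loader, for example as the `columns=` argument of `dask.dataframe.read_parquet`, so that only those three columns are read from the source.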

Labels: dataframe, needs attention
