Consider reactivating low-level DataFrame optimization when not all layers are Blockwise #8447

@gjoseph92

Since #7620, we've seen a few instances where users have gotten burned by root-task overproduction (see dask/distributed#5555, dask/distributed#5223 for background) because certain DataFrame operations still use low-level graphs, and therefore their tasks aren't getting fused anymore. Examples:

We do want to get everything to Blockwise eventually, but our bandwidth to track these down and fix them is limited. In the interim, I propose that by default, we still do low-level fusion when any of the layers in the graph are materialized.
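
To make the proposed default concrete, here is a minimal sketch of the decision rule in plain Python. It is not dask code: `LayerInfo`, `should_fuse_low_level`, and the tri-state `fuse_active` argument are hypothetical names made up for illustration, and the layer names are placeholders.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class LayerInfo:
    """Hypothetical stand-in for one layer of a high-level graph."""

    name: str
    materialized: bool  # True if the layer is a plain low-level task dict


def should_fuse_low_level(
    layers: Iterable[LayerInfo], fuse_active: Optional[bool] = None
) -> bool:
    """Decide whether to run low-level task fusion (illustrative only).

    An explicit True/False from the user always wins. With no explicit
    setting (None), fuse only if at least one layer is materialized.
    """
    if fuse_active is not None:
        return fuse_active
    return any(layer.materialized for layer in layers)


# Placeholder graphs: one fully Blockwise, one with a materialized layer.
all_blockwise = [LayerInfo("read", False), LayerInfo("elementwise", False)]
mixed = all_blockwise + [LayerInfo("materialized-op", True)]

print(should_fuse_low_level(all_blockwise))  # no materialized layers
print(should_fuse_low_level(mixed))  # one materialized layer
print(should_fuse_low_level(mixed, fuse_active=False))  # explicit opt-out
```

Under this rule, fully Blockwise graphs keep relying on high-level fusion alone, while graphs with any materialized layer fall back to low-level fusion unless the user explicitly turns it off.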

cc @rjzamora @ian-r-rose @jrbourbeau

Labels: dataframe, needs attention
