Swapping getitem and elementwise operations via graph optimization, redux? #2431

@shoyer

@jcrist did some work back in #755 on swapping getitem and elementwise operations. We didn't end up merging this work, because we were unable to come up with a solution that worked across the board without slowing down every dask array operation (by forcing broadcasting).

However, this is still a vital optimization for climate/weather data analysis use cases (where arrays are currently often loaded with one chunk per file, as described in pydata/xarray#1440), and would be extremely valuable for xarray (allowing us to remove our legacy system for deferred array computation). It would be great to pursue it in some more limited form.

Any of the following would solve xarray's use cases:

  1. Optimization that only works on elementwise operations with a single array argument, e.g., of the form x + 1.
  2. Optimization that only works on elementwise operations with pre-broadcast arguments. These would need some sort of special indication in the dask graph, likely via some custom API, e.g., da.swappable_elementwise().
  3. A new (optional) elementwise operation that always explicitly broadcasts, for cases when optimization is essential and some overhead is acceptable.
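
To make option 1 concrete, here is a minimal sketch of the rewrite on a toy dask-style graph (a dict mapping keys to task tuples). It is not the approach from #755 and not dask's actual optimizer: `add_one`, `ELEMENTWISE`, `swap_getitem_elementwise` and `evaluate` are hypothetical names, plain lists stand in for arrays, and chunking is ignored entirely.

```python
import operator


def add_one(xs):
    """Toy single-argument elementwise operation (stand-in for x + 1)."""
    return [x + 1 for x in xs]


# Hypothetical registry of single-array elementwise functions that are safe to swap.
ELEMENTWISE = {add_one}


def swap_getitem_elementwise(graph):
    """Rewrite (getitem, (f, arg), index) into (f, (getitem, arg, index)).

    Only handles f with exactly one array argument, so no broadcasting is involved.
    """
    out = {}
    for key, task in graph.items():
        if (
            isinstance(task, tuple)
            and len(task) == 3
            and task[0] is operator.getitem
            and isinstance(task[1], tuple)
            and len(task[1]) == 2
            and callable(task[1][0])
            and task[1][0] in ELEMENTWISE
        ):
            _, (func, arg), index = task
            # Index first, then apply the elementwise function to the smaller result.
            task = (func, (operator.getitem, arg, index))
        out[key] = task
    return out


def evaluate(task, graph):
    """Tiny recursive evaluator for the toy graph (assumption: keys are strings)."""
    if isinstance(task, tuple) and task and callable(task[0]):
        func, *args = task
        return func(*(evaluate(a, graph) for a in args))
    if isinstance(task, str) and task in graph:
        return evaluate(graph[task], graph)
    return task


graph = {
    "x": [10, 20, 30, 40, 50],
    # Equivalent of (x + 1)[1:3]: the whole array is processed before slicing.
    "y": (operator.getitem, (add_one, "x"), slice(1, 3)),
}
optimized = swap_getitem_elementwise(graph)
# After the swap, "y" is the equivalent of x[1:3] + 1.
print(evaluate(optimized["y"], optimized))
```

With a single array argument the swap is always valid, because the index applies to the input exactly as it applies to the output. With several arguments of different shapes, the index would first have to be translated through broadcasting for each argument, which is the part that options 2 and 3 try to sidestep.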

The optimization pass itself could also be optional, but we would always want to use it on dask.array objects wrapped with xarray.