Use of da.where before da.einsum on a chunked array produces incorrectly sized result #8131

@timothymillar

What happened:

Using dask.array.where before dask.array.einsum on some chunked arrays produces an incorrectly sized result. This seems to occur only if the array is chunked along the dimensions that einsum retains. Using einsum without first using 'where' produces the correct result. Note that this occurs even when 'where' does not change any values in the array.

What you expected to happen:

Use of dask.array.where shouldn't affect the result shape of dask.array.einsum.

Minimal Complete Verifiable Example:

import numpy as np
import dask.array as da
a = da.asarray(
    [
        [1. , 0. ],
        [0. , 1. ],
        [0.5, 0.5],
    ], 
    chunks=((2,1), (2,))
)
b = da.where(a == -1, a, a)
assert a.chunks == b.chunks
np.testing.assert_array_equal(a, b)
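# Without where: the result has the correct shape (3, 3)
da.einsum('xj,yj->xy', a, a).compute()
# array([[1. , 0. , 0.5],
#        [0. , 1. , 0.5],
#        [0.5, 0.5, 0.5]])

# With where: the result has the incorrect shape (4, 4)
da.einsum('xj,yj->xy', b, b).compute()
# array([[1. , 0. , 1. , 0. ],
#        [0. , 1. , 0. , 1. ],
#        [0.5, 0.5, 0.5, 0.5],
#        [0.5, 0.5, 0.5, 0.5]])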
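
For reference, the expected result can be reproduced with plain NumPy. This is only a sketch that assumes np.einsum is a suitable ground truth for the same subscripts; the names a_np and expected are introduced here for illustration and are not part of the original report.

```python
import numpy as np

# Same values as the dask array `a` in the example, held as a plain NumPy array.
a_np = np.array(
    [
        [1.0, 0.0],
        [0.0, 1.0],
        [0.5, 0.5],
    ]
)

# 'xj,yj->xy' contracts over j and keeps x and y, so a (3, 2) input
# should give a (3, 3) output.
expected = np.einsum('xj,yj->xy', a_np, a_np)
print(expected.shape)
print(expected)
```

The shape and values match the first einsum call (on a), which suggests the result computed from b is the faulty one.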

Environment:

  • Dask version: 2021.9.0 (pypi)
  • Python version: 3.9.6 (conda)
  • Operating System: Ubuntu 21.04
  • Install method: pip within conda env
