Incorrect maths with tensordot for sparse arrays constructed with map_blocks #6907

@stephenworsley

What happened:

Applying da.tensordot to a chunked dask array whose blocks have been converted to sparse arrays with map_blocks gives an incorrect result.

import numpy as np
import dask.array as da
import sparse
x = np.array(
    [
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 1, 0, 1],
    ]
)

x_chunked = da.from_array(x, chunks=(2, 2))
x_sparse_chunked = x_chunked.map_blocks(sparse.COO)

result = da.tensordot(x_sparse_chunked, x_sparse_chunked, axes=1).compute().todense()
print(result)

What you expected to happen:

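This should yield the ordinary matrix product of x with itself (tensordot with axes=1 on 2-D inputs is a matrix product):

[[1 0 0 0]
 [0 1 0 0]
 [0 0 1 0]
 [0 2 0 1]]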

but instead yields

[[1 0 1 0]
 [0 1 0 1]
 [1 0 1 0]
 [0 1 0 1]]
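
For reference, a NumPy-only sketch of what the chunked computation is expected to produce may help when comparing outputs. It assumes the same 2x2 chunking as above and uses neither dask nor sparse; the names c, n, blocks and blockwise are illustrative and not taken from either library. Each output block is the sum over k of the block products A[i][k] @ A[k][j], and the reassembled result should match the dense tensordot.

```python
import numpy as np

x = np.array(
    [
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 1, 0, 1],
    ]
)

# Dense reference: tensordot with axes=1 on 2-D inputs is a matrix product.
expected = np.tensordot(x, x, axes=1)

# Block-wise reference with 2x2 chunks (illustrative, no dask involved):
# output block (i, j) is the sum over k of blocks[i][k] @ blocks[k][j].
c = 2                  # chunk size along each axis
n = x.shape[0] // c    # number of chunks along each axis
blocks = [
    [x[i * c:(i + 1) * c, j * c:(j + 1) * c] for j in range(n)]
    for i in range(n)
]
blockwise = np.block(
    [
        [sum(blocks[i][k] @ blocks[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]
)

print(expected)
print(np.array_equal(blockwise, expected))
```

Any correct chunked implementation should agree with expected, so comparing the dask result against it is a quick way to confirm the discrepancy.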

Note that this bug does not occur with regular (non-dask) sparse arrays, or when the chunked sparse array is constructed by wrapping an existing sparse array as follows:

x = np.array(
    [
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 1, 0, 1],
    ]
)
x_sparse = sparse.COO(x)
x_sparse_chunked = da.from_array(x_sparse, chunks=(2, 2))

I'm not sure if this is a problem with dask or with sparse, though the fact that the behaviour changes when constructing the array with map_blocks vs converting a sparse array to dask makes me think that dask may be involved.

Environment:

Dask 2.30.0
Sparse 0.11.2
