possible bug in cupyx sparse matrix transpose/multiplication

Hello cupy(x) developers,

I would like to report what looks like a bug in the cupyx sparse matrix library. I am running on a Skylake CPU and a Tesla V100 GPU on a SUSE-ish Linux system, with CuPy 6.0.0 and CUDA 10.1.168. Everything runs inside a conda environment, which is described in my requirements.txt file:

[requirements.txt](https://github.com/cupy/cupy/files/3474978/requirements.txt)

Here is a reproducer that performs the same operations on the CPU and the GPU and checks after every step that the results agree. Every check passes until the last step, where the `np.allclose()` check fails. The `max_iCov_diff` value is much larger than machine precision, which suggests there is a bug.

```python
import numpy as np
import scipy.sparse
import cupy as cp
import cupyx as cpx

m = n = 1000

np.random.seed(1)

#create random data to use in W
random_data = np.random.random(m)

#create a random sparse matrix A
A_cpu = scipy.sparse.random(m, n, format='csr', random_state=42)
A_gpu = cpx.scipy.sparse.csr_matrix(A_cpu)
#yank gpu back and compare
A_yank = A_gpu.get()
assert np.allclose(A_cpu.todense(),A_yank.todense())

#create the sparse diagonal matrix W from random_data
W_cpu = scipy.sparse.spdiags(data=random_data, diags=[0,], m=m, n=n)
W_gpu = cpx.scipy.sparse.spdiags(data=random_data, diags=[0,], m=m, n=n)
#yank gpu back and compare
W_yank = W_gpu.get()
assert np.allclose(W_cpu.todense(),W_yank.todense())

#see how the dot products go
W_dot_A_cpu = W_cpu.dot(A_cpu)
W_dot_A_gpu = W_gpu.dot(A_gpu)
#yank gpu back and compare
W_dot_A_yank = W_dot_A_gpu.get()
assert np.allclose(W_dot_A_cpu.todense(), W_dot_A_yank.todense())

#check the transpose
A_trans_cpu = A_cpu.T
A_trans_gpu = A_gpu.T
#yank gpu back and compare
A_trans_yank = A_trans_gpu.get() #use get bc its a sparse object
assert np.allclose(A_trans_cpu.todense(), A_trans_yank.todense())

#okay now inverse covariance (where things go wrong)
iCov_cpu = A_cpu.T.dot(W_dot_A_cpu)
iCov_gpu = A_gpu.T.dot(W_dot_A_gpu)
#yank gpu back and compare
iCov_yank = iCov_gpu.get()

iCov_diff = iCov_cpu.todense() - iCov_yank.todense()
#take the absolute value so large negative differences are not missed
max_iCov_diff = np.max(np.abs(iCov_diff))
print("max iCov diff")
print(max_iCov_diff)

assert np.allclose(iCov_cpu.todense(), iCov_yank.todense()) #fails for large matrix sizes
```
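In case it helps with triage, below is a minimal sketch of how the mismatch could be quantified more precisely than with a single signed maximum. The helper name `compare_dense` and its return keys are hypothetical (they are not part of CuPy, SciPy, or the reproducer above), and it only uses NumPy on dense copies of the two results, so it assumes the matrices are small enough to densify.

```python
import numpy as np

def compare_dense(reference, candidate, rtol=1e-5, atol=1e-8):
    """Hypothetical helper: summarize how far `candidate` is from `reference`.

    Both inputs are dense 2-D arrays (or np.matrix objects) of the same shape.
    """
    ref = np.asarray(reference, dtype=np.float64)
    cand = np.asarray(candidate, dtype=np.float64)

    # Largest element-wise absolute difference.
    abs_diff = np.abs(ref - cand)
    max_abs = float(abs_diff.max())

    # Scale by the largest reference magnitude so the number can be
    # compared against machine epsilon.
    scale = float(np.abs(ref).max())
    max_rel = max_abs / scale if scale > 0 else max_abs

    # Position of the worst element, useful for spotting a pattern.
    worst = np.unravel_index(int(abs_diff.argmax()), abs_diff.shape)

    return {
        "max_abs": max_abs,
        "max_rel": max_rel,
        "worst_index": tuple(int(i) for i in worst),
        "allclose": bool(np.allclose(ref, cand, rtol=rtol, atol=atol)),
    }
```

With the reproducer's variables this would be called as `compare_dense(iCov_cpu.toarray(), iCov_yank.toarray())`. Reporting the index of the worst element alongside the relative error may make it easier to tell whether the discrepancy is confined to particular rows or columns.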

If there is more information I can provide, please let me know. Of course, if I am doing something wrong, I would also be happy to get your feedback on how to do this correctly.

Thank you very much for your help,
Laurie




possible bug in cupyx sparse matrix transpose/multiplication #2365
