🐛 Bug
torch.inverse sometimes returns a result containing NaN. In the run below, the assertion failed on iteration 84:
AssertionError: tensor([[[ nan, -0.8982, -0.0000],
[ nan, -0.4397, 0.0000],
[ nan, 0.0000, 1.0000]],
[[-0.4397, -0.8982, -0.0000],
[ 0.8982, -0.4397, 0.0000],
[ 0.0000, 0.0000, 1.0000]]], device='cuda:0')
84
To Reproduce
Steps to reproduce the behavior:
import torch

device = torch.device('cuda:0')
# Batch of two identical 3x3 matrices
d = torch.tensor([
    [[-0.4397, 0.8981, 0.0000], [-0.8981, -0.4397, 0.0000], [0.0000, 0.0000, 1.0000]],
    [[-0.4397, 0.8981, 0.0000], [-0.8981, -0.4397, 0.0000], [0.0000, 0.0000, 1.0000]],
], device=device)
count = 0
while True:  # loop until the intermittent NaN appears
    temp = torch.inverse(d)
    count = count + 1
    assert not torch.isnan(temp).any(), str(temp) + '\n' + str(count)
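For anyone who wants to try this without an endless loop, here is a bounded variant of the same check. It is only a sketch: the helper name `first_nan_iteration` and the `max_iters` cap are my own additions, not part of PyTorch, and the CPU fallback is there only so the script runs on machines without a GPU (the problem reported here was seen on CUDA).

```python
import torch


def first_nan_iteration(matrices, max_iters=1000):
    """Return the 1-based iteration at which torch.inverse first yields NaN, or None."""
    for i in range(1, max_iters + 1):
        result = torch.inverse(matrices)
        if torch.isnan(result).any():
            return i
    return None


# Fall back to the CPU so the script still runs without a GPU (assumption for portability).
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
block = [[-0.4397, 0.8981, 0.0000],
         [-0.8981, -0.4397, 0.0000],
         [0.0000, 0.0000, 1.0000]]
d = torch.tensor([block, block], device=device)

# Prints the device and either the failing iteration or None if no NaN was seen.
print(device, first_nan_iteration(d))
```

A result of `None` only means no NaN showed up within `max_iters` calls; since the failure is intermittent, a larger cap may be needed to trigger it.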
Expected behavior
torch.inverse should return accurate results, with no NaN values, on every call.
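One way to make "accurate" concrete is to check that the input times its computed inverse is close to the identity. The snippet below is a rough sketch under my own assumptions: the helper name `is_accurate_inverse` and the tolerance `1e-5` are illustrative choices, not anything defined by PyTorch.

```python
import torch


def is_accurate_inverse(matrices, inverses, tol=1e-5):
    """True if every entry is finite and matrices @ inverses is within tol of I."""
    if not torch.isfinite(inverses).all():
        return False  # NaN or inf anywhere counts as a failure
    eye = torch.eye(matrices.shape[-1], dtype=matrices.dtype, device=matrices.device)
    residual = (matrices @ inverses - eye).abs().max().item()
    return residual <= tol


block = [[-0.4397, 0.8981, 0.0000],
         [-0.8981, -0.4397, 0.0000],
         [0.0000, 0.0000, 1.0000]]
d = torch.tensor([block, block])  # CPU tensor; pass device='cuda:0' to exercise the GPU path

print(is_accurate_inverse(d, torch.inverse(d)))
```

The same helper can replace the `isnan` assertion in the reproduction loop, so that a wrong but finite result would also be caught.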
Environment
I get the error in two environments:
PyTorch version: 1.7.0
Is debug build: True
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A
OS: Microsoft Windows 10 Enterprise
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect
Python version: 3.6 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: 10.2.89
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin\cudnn64_7.dll
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] msgpack-numpy==0.4.7.1
[pip3] numpy==1.19.2
[pip3] robust-loss-pytorch==0.0.2
[pip3] torch==1.7.0
[pip3] torch-cluster==1.5.4
[pip3] torch-dct==0.1.5
[pip3] torch-geometric==1.6.1
[pip3] torch-scatter==2.0.4
[pip3] torch-sparse==0.6.3
[pip3] torch-spline-conv==1.2.0
[pip3] torchaudio==0.7.0
[pip3] torchfile==0.1.0
[pip3] torchnet==0.0.4
[pip3] torchstat==0.0.7
[pip3] torchvision==0.8.1
[conda] Could not collect
PyTorch version: 1.7.0+cu101
Is debug build: Yes
CUDA used to build PyTorch: 10.1
OS: Ubuntu 16.04.1 LTS
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
CMake version: version 3.5.1
Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: 10.1.243
GPU models and configuration:
GPU 0: GeForce RTX 2080 Ti
GPU 1: GeForce RTX 2080 Ti
GPU 2: GeForce RTX 2080 Ti
GPU 3: GeForce RTX 2080 Ti
GPU 4: GeForce RTX 2080 Ti
GPU 5: GeForce RTX 2080 Ti
GPU 6: GeForce RTX 2080 Ti
GPU 7: GeForce RTX 2080 Ti
Nvidia driver version: 430.64
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.7.4.2
/usr/local/cuda-10.1/targets/x86_64-linux/lib/libcudnn.so.7.6.5
Versions of relevant libraries:
[pip3] numpy==1.19.1
[pip3] numpydoc==0.9.2
[pip3] robust-loss-pytorch==0.0.2
[pip3] torch==1.7.0+cu101
[pip3] torch-dct==0.1.5
[pip3] torchaudio==0.7.0
[pip3] torchvision==0.8.1+cu101
[conda] blas 1.0 mkl
[conda] cudatoolkit 10.2.89 hfd86e86_1
[conda] mkl 2020.0 166
[conda] mkl-service 2.3.0 py37he904b0f_0
[conda] mkl_fft 1.0.15 py37ha843d7b_0
[conda] mkl_random 1.1.0 py37hd6b4f25_0
[conda] numpy 1.19.1 pypi_0 pypi
[conda] numpydoc 0.9.2 py_0
[conda] torch 1.7.0+cu101 pypi_0 pypi
[conda] torch-dct 0.1.5 pypi_0 pypi
[conda] torchaudio 0.7.0 pypi_0 pypi
[conda] torchvision 0.8.1+cu101 pypi_0 pypi
cc @vishwakftw @jianyuh @nikitaved @pearu @mruberry @heitorschueroff