$ pytest test/test_ops.py -k "test_noncontiguous_samples_linalg_pinv_hermitian_cuda_float32"
========================================================================================================================================================================================================== test session starts ===========================================================================================================================================================================================================
platform linux -- Python 3.8.10, pytest-6.2.4, py-1.10.0, pluggy-0.13.1
rootdir: /home/mkozuki/ghq/github.com/crcrpar/torch-1, configfile: pytest.ini
plugins: hypothesis-6.14.1
collected 29688 items / 29687 deselected / 1 selected
test/test_ops.py F [100%]
================================================================================================================================================================================================================ FAILURES ================================================================================================================================================================================================================
______________________________________________________________________________________________________________________________________________________________________________ TestCommonCUDA.test_noncontiguous_samples_linalg_pinv_hermitian_cuda_float32 ______________________________________________________________________________________________________________________________________________________________________________
Traceback (most recent call last):
File "/home/mkozuki/ghq/github.com/crcrpar/torch-1/test/test_ops.py", line 263, in test_noncontiguous_samples
self.assertEqual(actual_grad, expected_grad)
File "/home/mkozuki/ghq/github.com/crcrpar/torch-1/torch/testing/_internal/common_utils.py", line 1903, in assertEqual
super().assertTrue(result, msg=self._get_assert_msg(msg, debug_msg=debug_msg))
File "/home/mkozuki/anaconda3/envs/torch-1/lib/python3.8/unittest/case.py", line 765, in assertTrue
raise self.failureException(msg)
AssertionError: False is not true : Tensors failed to compare as equal!With rtol=1.3e-06 and atol=1e-05, found 50 element(s) (out of 50) whose difference(s) exceeded the margin of error (including 0 nan comparisons). The greatest difference was 2.54345703125 (-4442.3525390625 vs. -4444.89599609375), which occurred at index (1, 3, 3).
======================================================================================================================================================================================================== short test summary info =========================================================================================================================================================================================================
FAILED test/test_ops.py::TestCommonCUDA::test_noncontiguous_samples_linalg_pinv_hermitian_cuda_float32 - AssertionError: False is not true : Tensors failed to compare as equal!With rtol=1.3e-06 and atol=1e-05, found 50 element(s) (out of 50) whose difference(s) exceeded the margin of error (including 0 nan comparisons). The greatest difference was 2.54345703125 (-4442.3525390625 vs. -4444.89599609375), which occurred at...
================================================================================================================================================================================================== 1 failed, 29687 deselected in 5.60s ===================================================================================================================================================================================================
$ NVIDIA_TF32_OVERRIDE=0 pytest test/test_ops.py -k "test_noncontiguous_samples_linalg_pinv_hermitian_cuda_float32"
========================================================================================================================================================================================================== test session starts ===========================================================================================================================================================================================================
platform linux -- Python 3.8.10, pytest-6.2.4, py-1.10.0, pluggy-0.13.1
rootdir: /home/mkozuki/ghq/github.com/crcrpar/torch-1, configfile: pytest.ini
plugins: hypothesis-6.14.1
collected 29688 items / 29687 deselected / 1 selected
test/test_ops.py . [100%]
================================================================================================================================================================================================== 1 passed, 29687 deselected in 5.80s ===================================================================================================================================================================================================
$ python torch/utils/collect_env.py
Collecting environment information...
PyTorch version: 1.10.0a0+git571a2be
Is debug build: False
CUDA used to build PyTorch: 11.4
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.2 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: Could not collect
CMake version: version 3.19.6
Libc version: glibc-2.31
Python version: 3.8.10 (default, Jun 4 2021, 15:09:15) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.8.0-55-generic-x86_64-with-glibc2.17
Is CUDA available: True
CUDA runtime version: 11.4.100
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3080
Nvidia driver version: 470.63.01
cuDNN version: Probably one of the following:
/usr/local/cuda-11.4/targets/x86_64-linux/lib/libcudnn.so.8.2.2
/usr/local/cuda-11.4/targets/x86_64-linux/lib/libcudnn_adv_infer.so.8.2.2
/usr/local/cuda-11.4/targets/x86_64-linux/lib/libcudnn_adv_train.so.8.2.2
/usr/local/cuda-11.4/targets/x86_64-linux/lib/libcudnn_cnn_infer.so.8.2.2
/usr/local/cuda-11.4/targets/x86_64-linux/lib/libcudnn_cnn_train.so.8.2.2
/usr/local/cuda-11.4/targets/x86_64-linux/lib/libcudnn_ops_infer.so.8.2.2
/usr/local/cuda-11.4/targets/x86_64-linux/lib/libcudnn_ops_train.so.8.2.2
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] numpy==1.20.2
[pip3] torch==1.11.0a0+git9e8016d
[conda] blas 1.0 mkl
[conda] magma-cuda111 2.5.2 1 pytorch
[conda] mkl 2021.2.0 h06a4308_296
[conda] mkl-include 2021.2.0 h06a4308_296
[conda] mkl-service 2.3.0 py38h27cfd23_1
[conda] mkl_fft 1.3.0 py38h42c9631_2
[conda] mkl_random 1.2.1 py38ha9443f7_2
[conda] numpy 1.20.2 py38h2d18471_0
[conda] numpy-base 1.20.2 py38hfae3a4d_0
[conda] torch 1.11.0a0+git9e8016d dev_0 <develop>
With TF32 enabled, the test fails (first run above), while without TF32 (NVIDIA_TF32_OVERRIDE=0) it passes (second run above).
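To put the size of the mismatch in context, the arithmetic on the numbers from the assertion message can be sketched as below. The values are copied from the failing run. The `atol + rtol * |expected|` margin is an assumption about how the comparison is evaluated (the usual isclose-style form), and the two epsilon constants are only the mantissa widths of TF32 (10 bits) and FP32 (23 bits), not a model of the actual kernels.

```python
# Values copied from the assertion message of the failing run.
actual = -4442.3525390625
expected = -4444.89599609375
rtol, atol = 1.3e-06, 1e-05

abs_diff = abs(actual - expected)
# Assumed isclose-style margin: |actual - expected| <= atol + rtol * |expected|.
allowed = atol + rtol * abs(expected)
rel_diff = abs_diff / abs(expected)

# Spacing of representable values relative to 1.0:
# TF32 keeps 10 explicit mantissa bits, FP32 keeps 23.
tf32_eps = 2.0 ** -10
fp32_eps = 2.0 ** -23

print(f"absolute difference : {abs_diff}")
print(f"allowed margin      : {allowed:.6f}")
print(f"relative difference : {rel_diff:.3e}")
print(f"TF32 epsilon (2^-10): {tf32_eps:.3e}")
print(f"FP32 epsilon (2^-23): {fp32_eps:.3e}")
```

The relative difference lands below the TF32 epsilon but several thousand times above the FP32 epsilon, which is consistent with (though not proof of) reduced-precision matmuls somewhere in the backward pass.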
Expected behavior
pinv_backward should do its math in FP32 even when TF32 is available. See pytorch/torch/csrc/autograd/FunctionsManual.cpp, lines 1139 to 1163 in 80178d6.
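As a possible way to reproduce the FP32 behavior from Python instead of through the environment variable, TF32 matmuls can be switched off with the `allow_tf32` flag on `torch.backends.cuda.matmul`. The sketch below is a minimal context manager for that. `tf32_disabled` is a hypothetical helper name, not something in PyTorch, and it only assumes the object it receives has an `allow_tf32` attribute. Whether this flag covers every operation used by pinv_backward has not been checked here, and it is a different mechanism from NVIDIA_TF32_OVERRIDE.

```python
from contextlib import contextmanager


@contextmanager
def tf32_disabled(matmul_backend):
    """Temporarily set `allow_tf32` to False on the given backend object.

    Hypothetical helper: `matmul_backend` is assumed to expose a boolean
    `allow_tf32` attribute, as `torch.backends.cuda.matmul` does.
    """
    previous = matmul_backend.allow_tf32
    matmul_backend.allow_tf32 = False
    try:
        yield matmul_backend
    finally:
        # Restore the earlier setting even if the body raised.
        matmul_backend.allow_tf32 = previous


# Intended usage (requires torch and a CUDA device, so left as a comment):
#
#   import torch
#   with tf32_disabled(torch.backends.cuda.matmul):
#       ...  # run the forward/backward comparison here
```

Passing the backend object in, rather than importing torch inside the helper, keeps the sketch usable without a GPU and makes the restore-on-exit behavior easy to check in isolation.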
Related PRs
Environment (see the torch/utils/collect_env.py output above)
cc @mruberry @zasdfgbnm @ptrblck