## 🐛 Bug

I get non-deterministic results when I run my model with `nn.LSTM` (dropout > 0) on GPU, even when I seed everything and set `torch.backends.cudnn.deterministic = True`. If I instead set `torch.backends.cudnn.enabled = False`, the results are deterministic.
## To Reproduce

Steps to reproduce the behavior:

1. Seed everything and enable deterministic cuDNN:

   ```python
   torch.backends.cudnn.deterministic = True
   random.seed(1)
   torch.manual_seed(1)
   torch.cuda.manual_seed_all(1)
   np.random.seed(1)
   ```

2. Define a module as:

   ```python
   nn.LSTM(input_size=256,
           hidden_size=256,
           num_layers=3,
           dropout=0.1,
           bidirectional=True,
           )
   ```

3. Train with the defined module multiple times and compare the runs.
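The steps above can be collapsed into a single runnable sketch. The input shape `(8, 4, 256)` and the seed-everything helper are assumptions for illustration, not part of the original training setup; a single seeded forward pass is enough to surface the mismatch:

```python
import random

import numpy as np
import torch


def seed_everything(seed: int = 1) -> None:
    """Seed every RNG the repro steps above touch."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)


def run_once(device: str = "cpu") -> torch.Tensor:
    """One seeded forward pass through the LSTM from the report."""
    seed_everything(1)
    torch.backends.cudnn.deterministic = True
    lstm = torch.nn.LSTM(input_size=256, hidden_size=256, num_layers=3,
                         dropout=0.1, bidirectional=True).to(device)
    # Hypothetical input: (seq_len=8, batch=4, input_size=256).
    # The module is in training mode by default, so dropout is active.
    x = torch.randn(8, 4, 256, device=device)
    out, _ = lstm(x)
    return out.sum().detach().cpu()


if __name__ == "__main__":
    device = "cuda" if torch.cuda.is_available() else "cpu"
    first, second = run_once(device), run_once(device)
    # Per the report, this prints False on GPU with cuDNN enabled,
    # even though both runs were seeded identically.
    print("runs identical:", torch.equal(first, second))
```

On CPU (or with cuDNN disabled) the two runs match, which isolates the non-determinism to the cuDNN RNN dropout path.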
## Expected behavior

Training should be deterministic across runs when all seeds and `torch.backends.cudnn.deterministic` are set identically.
## Environment

```
PyTorch version: 1.0.1.post2
Is debug build: No
CUDA used to build PyTorch: 9.0.176

OS: Debian GNU/Linux 9.4 (stretch)
GCC version: (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
CMake version: version 3.7.2

Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration:
GPU 0: Tesla K80
GPU 1: Tesla K80
GPU 2: Tesla K80
GPU 3: Tesla K80

Nvidia driver version: 387.26
cuDNN version: Could not collect

Versions of relevant libraries:
[pip3] numpy==1.12.1
[conda] blas        1.0    mkl
[conda] mkl         2019.1 144
[conda] mkl-service 1.1.2  py37h90e4bf4_5
[conda] mkl_fft     1.0.10 py37ha843d7b_0
[conda] mkl_random  1.0.2  py37hd81dba3_0
[conda] pytorch     1.0.1  py3.7_cuda9.0.176_cudnn7.4.2_2  pytorch
[conda] torchvision 0.2.2  py_3                            pytorch
```
## Additional context
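As noted above, disabling cuDNN works around the issue. A minimal sketch of that workaround, placed before the model is built:

```python
import torch

# Workaround observed in this report: with cuDNN disabled, the LSTM falls
# back to PyTorch's native kernels and the seeded runs become deterministic
# (likely at a speed cost for RNN workloads).
torch.backends.cudnn.enabled = False
```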