🐛 Describe the bug
After having trouble reproducing results, my collaborators and I ran small models across many different environments, GPUs, and package versions, and found a consistent performance regression when using nn.Dropout2d on torch=1.11 compared to torch=1.10. For example, for one of our small models with 100K parameters and dropout=0.1, the version on torch 1.11 is around 3% worse than the version on torch 1.10; this gap appears within the first few epochs and persists throughout training for 100 epochs. No other hyperparameter in this model produces a performance difference between the torch versions. The regression is consistent across several environments (e.g., Google Cloud and our local cluster) and GPUs (T4, P100, A100).
I don't have time to create a minimal reproducible example right now, but can do so later if that is helpful. For now I just want to understand the difference in implementation: what changed for Dropout2d between 1.10 and 1.11?
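For reference while comparing versions, here is a pure-Python sketch of the *documented* semantics of channel-wise dropout (this is an illustration of the expected behavior, not PyTorch's actual implementation in either 1.10 or 1.11): during training, each (H, W) channel of each sample is zeroed independently with probability p, and surviving channels are scaled by 1 / (1 - p). Any divergence between versions should show up as a difference in how these masks are drawn or applied.

```python
import random

def dropout2d(x, p, training=True, rng=None):
    """Channel-wise dropout on a nested list shaped (N, C, H, W).

    Each (H, W) channel is zeroed independently with probability p;
    surviving channels are scaled by 1 / (1 - p). A sketch of the
    documented nn.Dropout2d semantics, not PyTorch's implementation.
    """
    if not training or p == 0.0:
        return x
    rng = rng or random.Random()
    scale = 1.0 / (1.0 - p)
    out = []
    for sample in x:                      # batch dimension N
        channels = []
        for ch in sample:                 # channel dimension C
            factor = scale if rng.random() >= p else 0.0
            channels.append([[v * factor for v in row] for row in ch])
        out.append(channels)
    return out
```

A quick property to check under both torch versions is that every channel comes out either entirely zero or entirely scaled by 1 / (1 - p) -- never partially masked.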
Versions
Collecting environment information...
PyTorch version: 1.11.0
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A
OS: Ubuntu 16.04.6 LTS (x86_64)
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
Clang version: 3.8.0-2ubuntu4 (tags/RELEASE_380/final)
CMake version: version 3.21.0
Libc version: glibc-2.23
Python version: 3.8.12 (default, Oct 12 2021, 13:49:34) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-4.4.0-130-generic-x86_64-with-glibc2.17
Is CUDA available: True
CUDA runtime version: 10.2.89
GPU models and configuration: GPU 0: Tesla P100-PCIE-16GB
Nvidia driver version: 430.26
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Versions of relevant libraries:
[pip3] numpy==1.21.5
[pip3] pytorch-fast-transformers==0.4.0
[pip3] pytorch-lightning==1.5.0
[pip3] torch==1.11.0
[pip3] torchaudio==0.11.0
[pip3] torchmetrics==0.6.0
[pip3] torchtext==0.11.0
[pip3] torchvision==0.12.0
[conda] blas 1.0 mkl
[conda] cudatoolkit 10.2.89 hfd86e86_1
[conda] ffmpeg 4.3 hf484d3e_0 pytorch
[conda] mkl 2021.4.0 h06a4308_640
[conda] mkl-service 2.4.0 py38h7f8727e_0
[conda] mkl_fft 1.3.1 py38hd3c417c_0
[conda] mkl_random 1.2.2 py38h51133e4_0
[conda] numpy 1.20.3 pypi_0 pypi
[conda] numpy-base 1.21.5 py38hf524024_2
[conda] pytorch 1.11.0 py3.8_cuda10.2_cudnn7.6.5_0 pytorch
[conda] pytorch-fast-transformers 0.4.0 pypi_0 pypi
[conda] pytorch-lightning 1.5.0 pypi_0 pypi
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] torch 1.10.0 pypi_0 pypi
[conda] torchaudio 0.11.0 py38_cu102 pytorch
[conda] torchmetrics 0.6.0 pypi_0 pypi
[conda] torchtext 0.11.0 pypi_0 pypi
[conda] torchvision 0.12.0 py38_cu102 pytorch
cc @ezyang @gchanan @zou3519 @ngimel @pbelevich @mruberry @kurtamohler