conv1d, conv2d, etc. causing segmentation fault on torch 1.8.0

## 🐛 Bug

When using a DataLoader with one or more worker subprocesses, calling `F.conv1d()` in the `__getitem__()` function of the Dataset instance can cause a segmentation fault if `F.conv1d()` is also called in the `__init__()` function.

## To Reproduce

The bug is reproducible with the following minimal working example (MWE):
```
import torch
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader


class MyDataset(Dataset):
    def __init__(self):
        self[0]  # The important thing is that conv1d is called here

    def __getitem__(self, index):
        x = torch.Tensor(1, 1, 24000)  # Needs to be long enough
        x = F.conv1d(x, torch.ones(1, 1, 2))  # Causes segfault
        return x

    def __len__(self):
        return 1


# num_workers>0 necessary to reproduce error
loader = DataLoader(MyDataset(), num_workers=1)
for x in loader:
    pass
```

For the segmentation fault to occur, all of the following must hold:
- `F.conv1d()` must be called (directly or indirectly) in `MyDataset.__init__()`. Replacing `conv1d` with `conv2d` also produces the bug.
- The tensor `x` must be long enough.
- The kernel size must be at least 2.
- The DataLoader must use one or more worker subprocesses.

## Expected behavior

The above code should run without a segmentation fault. The segmentation fault did not occur in torch 1.7.

## Environment

```
Collecting environment information...
PyTorch version: 1.8.0
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.4 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
CMake version: version 3.10.2

Python version: 3.8 (64-bit runtime)
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.20.1
[pip3] torch==1.8.0
[pip3] torchaudio==0.8.0
[conda] Could not collect
```

## Additional context

You may be wondering why `F.conv1d()` would even be called in the `__init__()` function. In the original code, I have a `CachedDataset` class that caches the elements of another Dataset instance in system memory on the fly. To allocate the correct amount of system memory, it inspects one of the dataset elements in `__init__()`. If retrieving that element calls `F.conv1d()` (or similar), a segmentation fault later occurs in the worker subprocess.
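To make the pattern concrete, here is a minimal sketch of what such a caching wrapper might look like. It is not the original code: `ConvDataset`, the `cache` and `cached` attributes, and the allocation strategy are assumptions made for illustration only. The point is that `__init__()` fetches `dataset[0]`, which runs `F.conv1d()` in the parent process.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import Dataset


class ConvDataset(Dataset):
    """Stand-in for the wrapped dataset; mirrors MyDataset from the MWE."""

    def __getitem__(self, index):
        x = torch.zeros(1, 1, 24000)
        return F.conv1d(x, torch.ones(1, 1, 2))

    def __len__(self):
        return 1


class CachedDataset(Dataset):
    """Hypothetical sketch: caches elements of `dataset` on first access."""

    def __init__(self, dataset):
        self.dataset = dataset
        # Peek at one element to learn its shape and dtype.
        # This indirectly calls F.conv1d() in the parent process.
        sample = dataset[0]
        self.cache = torch.empty((len(dataset), *sample.shape), dtype=sample.dtype)
        # Tracks which entries of the cache have been filled so far.
        self.cached = torch.zeros(len(dataset), dtype=torch.bool)

    def __getitem__(self, index):
        if not self.cached[index]:
            self.cache[index] = self.dataset[index]
            self.cached[index] = True
        return self.cache[index]

    def __len__(self):
        return len(self.dataset)


cached = CachedDataset(ConvDataset())
item = cached[0]  # Fetched from the wrapped dataset, then stored in the cache
```

Used in a single process as above, this runs fine. The crash reported here only appears once such a dataset is passed to a `DataLoader` with `num_workers` greater than 0.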


cc @ezyang @gchanan @zou3519 @bdhirsh @jbschlosser @anjali411 @gujinghui @PenghuiCheng @XiaobingSuper @jianyuh @VitalyFedyunin
