
Allocating pinned memory uses twice as much memory, causing OOM with larger allocations #95823

@akamali

Description


🐛 Describe the bug

Starting with PyTorch 1.11, allocating pinned memory appears to use twice as much host memory as requested. See the example code below:

import torch

def get_free():
    # Parse the "Mem:" row of `free -m`; columns are
    # total, used, free, shared, buff/cache, available
    import subprocess
    r = subprocess.run(["free", "-m"], capture_output=True)
    d = r.stdout.decode('utf-8')
    s = d.split(':')[1].split()
    return f"[used={s[1]:7}, shared={s[3]:7}] "

pin=False
print(get_free() + f"Starting torch={torch.__version__} with pin_memory={pin}")

gpu = torch.rand(4347592704, dtype=torch.float16, device='cuda')  # ~8.7 GB of float16
cpu = torch.empty(4347592704, dtype=torch.float16, device='cpu', pin_memory=pin)

print(get_free() + "Copying")
cpu.storage().copy_(gpu.storage(), non_blocking=False)
print(get_free() + "Copy finished")

On PyTorch 1.10.1 I get the following output with pin=True and pin=False:

[used=57562  , shared=12     ] Starting torch=1.10.1 with pin_memory=True
[used=59212  , shared=8314   ] Copying
[used=59211  , shared=8314   ] Copy finished

[used=57557  , shared=12     ] Starting torch=1.10.1 with pin_memory=False
[used=59021  , shared=22     ] Copying
[used=67348  , shared=22     ] Copy finished

On 1.11 and above I get:

[used=59913  , shared=12     ] Starting torch=1.13.1+cu117 with pin_memory=True
[used=61702  , shared=16406  ] Copying
[used=61702  , shared=16406  ] Copy finished

[used=59920  , shared=12     ] Starting torch=1.13.1+cu117 with pin_memory=False
[used=61396  , shared=22     ] Copying
[used=69722  , shared=22     ] Copy finished

The problem appears to come from https://github.com/pytorch/pytorch/pull/69299/files#diff-5d3beb56bf9b6f380f91d5f6f063480ce2e14ca15c415d59d153436018089223R179

C10_CUDA_CHECK(cudaHostAlloc(
        &ptr, c10::llvm::PowerOf2Ceil(size), cudaHostAllocDefault));

If the requested size is 4.5 GB, the above code will round it up to 8 GB. In the repro above, the ~8.7 GB request is rounded up to 16 GB, which matches the shared=16406 MB observed with pin_memory=True on 1.13.1.
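For reference, `llvm::PowerOf2Ceil` rounds its argument up to the next power of two. A minimal Python sketch of that behavior (the helper name `power_of_2_ceil` is mine, not PyTorch's) shows how the repro's request size gets doubled:

```python
def power_of_2_ceil(n: int) -> int:
    # Smallest power of two >= n (mirrors llvm::PowerOf2Ceil for n >= 1)
    return 1 if n <= 1 else 1 << (n - 1).bit_length()

# 4347592704 float16 elements = 8695185408 bytes (~8.1 GiB)
size = 4347592704 * 2
rounded = power_of_2_ceil(size)
print(rounded, rounded / 2**30)  # 17179869184 bytes = 16 GiB
```

Because the request is just past the 8 GiB power-of-two boundary, nearly 8 GiB of host memory is wasted on this single allocation.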

Versions

PyTorch version: 1.13.1+cu117
Is debug build: False

cc @ngimel

Labels

module: cuda (Related to torch.cuda, and CUDA support in general), triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
