🐛 Describe the bug
Starting with PyTorch 1.11, allocating pinned memory appears to consume twice as much host memory as requested. See the example code below:
```python
import torch

def get_free():
    import subprocess
    r = subprocess.run(["free", "-m"], capture_output=True)
    d = r.stdout.decode('utf-8')
    s = d.split(':')[1].split()
    return f"[used={s[1]:7}, shared={s[3]:7}] "

pin = False
print(get_free() + f"Starting torch={torch.__version__} with pin_memory={pin}")
gpu = torch.rand(4347592704, dtype=torch.float16, device='cuda')
cpu = torch.empty(4347592704, dtype=torch.float16, device='cpu', pin_memory=pin)
print(get_free() + "Copying")
cpu.storage().copy_(gpu.storage(), non_blocking=False)
print(get_free() + "Copy finished")
```
On PyTorch 1.10.1 I get the following output with `pin=True` and `pin=False`:
```
[used=57562 , shared=12   ] Starting torch=1.10.1 with pin_memory=True
[used=59212 , shared=8314 ] Copying
[used=59211 , shared=8314 ] Copy finished
[used=57557 , shared=12   ] Starting torch=1.10.1 with pin_memory=False
[used=59021 , shared=22   ] Copying
[used=67348 , shared=22   ] Copy finished
```
On PyTorch 1.11 and above I get:
```
[used=59913 , shared=12   ] Starting torch=1.13.1+cu117 with pin_memory=True
[used=61702 , shared=16406] Copying
[used=61702 , shared=16406] Copy finished
[used=59920 , shared=12   ] Starting torch=1.13.1+cu117 with pin_memory=False
[used=61396 , shared=22   ] Copying
[used=69722 , shared=22   ] Copy finished
```
The problem appears to come from https://github.com/pytorch/pytorch/pull/69299/files#diff-5d3beb56bf9b6f380f91d5f6f063480ce2e14ca15c415d59d153436018089223R179
```cpp
C10_CUDA_CHECK(cudaHostAlloc(
    &ptr, c10::llvm::PowerOf2Ceil(size), cudaHostAllocDefault));
```
If the requested size is 4.5 GB, the code above will round it up to 8 GB, nearly doubling the host allocation.
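For reference, the rounding overhead can be reproduced in pure Python. Here `pow2_ceil` is a hypothetical stand-in for `llvm::PowerOf2Ceil`, assuming it rounds up to the next power of two:

```python
def pow2_ceil(n: int) -> int:
    # Round n up to the nearest power of two (stand-in for llvm::PowerOf2Ceil).
    return 1 if n <= 1 else 1 << (n - 1).bit_length()

# A 4.5 GiB request is rounded up to the next power of two: 8 GiB.
size = int(4.5 * 2**30)
rounded = pow2_ceil(size)
print(f"requested {size / 2**30:.2f} GiB -> allocated {rounded / 2**30:.2f} GiB")
```

In the worst case (a request just above a power of two), this rounding wastes almost as much pinned host memory as the tensor itself uses.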
Versions
PyTorch version: 1.13.1+cu117
Is debug build: False
cc @ngimel