Pinned memory doubles memory usage for tensors slightly over 128MB #150517

@scott306lr

🐛 Describe the bug

This issue appears to be related to #95823, but it also occurs with smaller tensors.
Although #95823 is closed, the underlying problem still exists.

When pinning a tensor slightly larger than 128 MB, PyTorch appears to round the allocation up to the next power of two (256 MB), nearly doubling the expected memory usage.
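
As a quick sanity check on that claim, here is a minimal sketch of the suspected round-up-to-power-of-two rule (the next_pow2 helper is hypothetical, written for this report, not a PyTorch API):

def next_pow2(n: int) -> int:
    # Smallest power of two >= n.
    return 1 << (n - 1).bit_length()

nbytes = 18944 * 3584 * 2         # float16 up_proj below: 135,790,592 bytes (129.5 MiB)
print(next_pow2(nbytes) / 2**20)  # 256.0, consistent with the ~264 MB increase observed below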

Minimal Example

import subprocess

import torch

def get_free():
    # Report system-wide used and shared memory (MB) from the "Mem:" row
    # of `free -m`; the pinned allocation shows up in the "shared" column.
    r = subprocess.run(["free", "-m"], capture_output=True)
    s = r.stdout.decode('utf-8').split(':')[1].split()
    return f"[used={s[1]:7}, shared={s[3]:7}] "

model_weight = torch.randn(18944, 3584, dtype=torch.float16, device='cpu')  # 129.5 MB (Qwen2.5 7B, up_proj)
# model_weight = torch.randn(14336, 4096, dtype=torch.float16, device='cpu')  # 112.0 MB (Llama 3.1 8B, up_proj)
print("weight memory usage:", model_weight.element_size() * model_weight.nelement() / (1024 ** 2), "MB")

# Pin the tensor and compare system memory before and after.
print(get_free() + "Before pin")
model_weight = model_weight.pin_memory()
print(get_free() + "After pin")

Observed Behavior

Pinning Qwen2.5 7B's up_proj allocates almost double the expected memory (expected: 129.5 MB, actually used: 264 MB):

weight memory usage: 129.5 MB
[used=9306   , shared=108    ] Before pin
[used=9334   , shared=372    ] After pin

Pinning Llama 3.1 8B's up_proj (expected: 112.0 MB, actually used: 136 MB):

weight memory usage: 112.0 MB
[used=9280   , shared=108    ] Before pin
[used=9321   , shared=244    ] After pin

In both runs, the increase in the "shared" column appears to match the tensor size rounded up to the next power of two, plus a constant ~8 MB (264 ≈ 256 + 8; 136 ≈ 128 + 8). Although the additional memory is less noticeable when pinning a single tensor, the overhead scales with the number of tensors pinned and can significantly inflate DRAM usage. For instance, it results in approximately 12 GB of extra memory overhead when pinning the weights of the Qwen2.5 7B model.
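
To illustrate how this accumulates, below is a rough sketch that estimates the total waste across a model's parameters, assuming each parameter is pinned individually and each pinned allocation rounds up to the next power of two (next_pow2 and estimate_pin_overhead are hypothetical helpers, not PyTorch APIs):

import torch

def next_pow2(n: int) -> int:
    # Smallest power of two >= n.
    return 1 << (n - 1).bit_length()

def estimate_pin_overhead(model: torch.nn.Module) -> float:
    # Hypothetical estimate: assumes every parameter is pinned individually
    # and each pinned allocation is rounded up to a power of two.
    wasted = 0
    for p in model.parameters():
        nbytes = p.element_size() * p.nelement()
        wasted += next_pow2(nbytes) - nbytes
    return wasted / 2**30  # GiB

# Toy example: a single up_proj-sized layer wastes ~0.12 GiB under this rule;
# summed over every layer of a 7B model, the waste reaches several GB.
layer = torch.nn.Linear(18944, 3584, bias=False, dtype=torch.float16)
print(f"{estimate_pin_overhead(layer):.3f} GiB wasted")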

Versions

PyTorch version: 2.6.0+cu126

cc @ptrblck @msaroufim @eqy
