
Add ctx manager: caching_allocator_disabled to temporarily disable CCA #177418

Closed

ColinPeppler wants to merge 8 commits into gh/ColinPeppler/6/base from gh/ColinPeppler/6/head

Conversation

@ColinPeppler (Contributor) commented Mar 13, 2026

Why

  • An IMA (illegal memory access) debugging aid to disable CCA for a targeted block of code.
  • Another option is PYTORCH_NO_CUDA_MEMORY_CACHING=1 but that is set globally.

Usually I'd do this:

```
torch.cuda.caching_allocator_enable(False)
try:
    ...
finally:  # make sure to clean up even on exception
    torch.cuda.caching_allocator_enable(True)
```

What

Add a utility that

  • Disables CUDA caching allocator (CCA) when entering the block.
  • Restores the CCA state when exiting the block (even on exceptions).
```
with torch.cuda.caching_allocator_disabled():
    ...
```
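For reference, a save/restore context manager along these lines can be sketched with `contextlib`. The `caching_allocator_*` helpers below are toy stand-ins for the real torch bindings, not PyTorch's actual API:

```python
from contextlib import contextmanager

# Toy stand-in for the allocator's enabled flag; the real context manager
# would query and set this through torch's C bindings instead.
_cca_enabled = True

def caching_allocator_enable(value: bool) -> None:
    global _cca_enabled
    _cca_enabled = value

def caching_allocator_is_enabled() -> bool:
    return _cca_enabled

@contextmanager
def caching_allocator_disabled():
    # Save the current state, disable, then restore on exit; the
    # try/finally guarantees restoration even if the block raises.
    prev = caching_allocator_is_enabled()
    caching_allocator_enable(False)
    try:
        yield
    finally:
        caching_allocator_enable(prev)
```

Saving the previous state (rather than unconditionally re-enabling) also makes nested uses behave sensibly.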

Stack from ghstack (oldest at bottom):

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @jataylo


pytorch-bot bot commented Mar 13, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/177418

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 47570c7 with merge base a345892:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.


pytorch-bot bot commented Mar 13, 2026

This PR needs a release notes: label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

ColinPeppler added a commit that referenced this pull request Mar 13, 2026
@ColinPeppler changed the title from "Refactor caching allocator disable/enable into context manager" to "Add ctx manager: caching_allocator_disabled to temporarily disable CCA" Mar 13, 2026
@ColinPeppler ColinPeppler requested review from eee4017 and ezyang March 13, 2026 20:13
@ColinPeppler added the topic: not user facing label Mar 13, 2026
ColinPeppler added a commit that referenced this pull request Mar 14, 2026
@ColinPeppler (Contributor Author)

Hi @eee4017, can I get your review whenever you get a chance? Thanks!

ColinPeppler added a commit that referenced this pull request Mar 18, 2026
@eee4017 (Collaborator) left a comment


Since caching_allocator_enable already exists, and allocate() bakes the correct deleter (uncached_delete vs local_raw_delete) into each pointer at allocation time, tensors allocated inside the disabled region will always free correctly regardless of whether the allocator is re-enabled. This PR just adds a context manager that is a straightforward save/restore wrapper.
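The allocation-time binding can be illustrated with a toy model. Names like `uncached_delete` mirror the comment above, but the code is purely illustrative, not PyTorch internals:

```python
# Toy model of the point above: each allocation captures its deleter at
# allocation time, so toggling the allocator later cannot change how an
# already-allocated pointer is freed.
cache_enabled = True
freed_via = []  # records which deleter each pointer was freed with

def uncached_delete(ptr):
    freed_via.append(("uncached", ptr))

def cached_delete(ptr):
    freed_via.append(("cached", ptr))

class Block:
    def __init__(self, ptr):
        self.ptr = ptr
        # The correct deleter is baked in at allocation time.
        self.deleter = cached_delete if cache_enabled else uncached_delete

    def free(self):
        self.deleter(self.ptr)

cache_enabled = False   # allocator "disabled"
a = Block(ptr=1)        # allocated while disabled
cache_enabled = True    # allocator re-enabled
a.free()                # still frees via uncached_delete
```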

@ColinPeppler (Contributor Author)

@pytorchbot merge

@pytorch-bot added the ciflow/trunk label Mar 19, 2026
@pytorchmergebot (Collaborator)

Merge failed

Reason: Approvers from one of the following sets are needed:

  • superuser (pytorch/metamates)
  • Core Reviewers (mruberry, lezcano, Skylion007, ngimel, peterbell10, ...)
  • Core Maintainers (soumith, gchanan, ezyang, malfet, albanD, ...)
Details for Dev Infra team (raised by workflow job)

Failing merge rule: Core Maintainers

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

ColinPeppler added a commit that referenced this pull request Apr 2, 2026
ColinPeppler added a commit that referenced this pull request Apr 3, 2026
@ColinPeppler (Contributor Author)

@pytorchbot merge

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@pytorchmergebot (Collaborator)

Merge failed

Reason: 1 mandatory check(s) failed. The first few are:

Dig deeper by viewing the failures on hud

Details for Dev Infra team (raised by workflow job)

Failing merge rule: Core Maintainers

ColinPeppler added a commit that referenced this pull request Apr 6, 2026
ColinPeppler added a commit that referenced this pull request Apr 6, 2026
ColinPeppler added a commit that referenced this pull request Apr 6, 2026
@ColinPeppler (Contributor Author)

@pytorchbot merge

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

nklshy-aws pushed a commit to nklshy-aws/pytorch that referenced this pull request Apr 7, 2026
Pull Request resolved: pytorch#177418
Approved by: https://github.com/eee4017, https://github.com/laithsakka
ghstack dependencies: pytorch#177308
etaf added a commit that referenced this pull request Apr 8, 2026
…agnostic

8 AOTInductor tests fail on XPU because `caching_allocator_disabled()` (introduced by #177418) from `torch.cuda.memory` calls `torch._C._cuda_cudaCachingAllocator_is_enabled()`, which doesn't exist in XPU-only builds.

Replace the direct import of `torch.cuda.caching_allocator_disabled` with a
device-aware wrapper that delegates to the CUDA implementation on CUDA builds
and acts as a no-op on other GPU backends (XPU, etc.).


ghstack-source-id: 81004c9
Pull-Request: #179659
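One way such a device-aware wrapper could look. The function name here is hypothetical, and the `getattr` guard is an extra assumption so the sketch degrades to a no-op whenever the CUDA context manager is absent:

```python
from contextlib import nullcontext

def caching_allocator_disabled_or_noop():
    # Hypothetical wrapper: delegate to torch.cuda.caching_allocator_disabled()
    # when torch, CUDA, and the new context manager are all available;
    # otherwise act as a no-op so XPU-only or CPU-only builds never touch
    # CUDA-specific bindings.
    try:
        import torch
        cca_ctx = getattr(torch.cuda, "caching_allocator_disabled", None)
        if torch.cuda.is_available() and cca_ctx is not None:
            return cca_ctx()
    except ImportError:
        pass
    return nullcontext()

with caching_allocator_disabled_or_noop():
    result = "allocator-sensitive work runs here"
```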