
[ROCm] Reland: Enable expandable segments (#173330)#177974

Closed
haoyuz wants to merge 1 commit into pytorch:main from haoyuz:export-D97211385

Conversation

Contributor

@haoyuz haoyuz commented Mar 20, 2026

Summary:
Original pull request: #173330
Fixes #168737.
Fixes #168736.

The original diff enabled expandable segments for ROCm by adding `#ifdef USE_ROCM`
guards throughout CUDACachingAllocator.cpp so that HIP APIs (`hipMemAddressReserve`,
`hipMemCreate`, `hipMemMap`, etc.) are used instead of the CUDA driver APIs when
building for ROCm.

Root cause: in HIP/ROCm 6.2.1, the field name for memory allocation properties is
`requestedHandleType` (singular), not `requestedHandleTypes` (plural) as in CUDA.
Additionally, `hipMemHandleTypeFabric` does not exist in HIP, so the
`CU_MEM_HANDLE_TYPE_FABRIC` assignment must be skipped on ROCm.

Fix applied on top of the original diff (from D96652342):

- Use `prop.requestedHandleType = hipMemHandleTypePosixFileDescriptor` under
  `#ifdef USE_ROCM` (singular field name, HIP constant)
- Use `prop.requestedHandleTypes = CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR` for
  CUDA (plural field name, CUDA constant)
- Skip the `CU_MEM_HANDLE_TYPE_FABRIC` assignment entirely on ROCm under
  `#ifndef USE_ROCM`, as `hipMemHandleTypeFabric` does not exist in HIP

Co-authored-by: Prachi Gupta prachi.gupta@amd.com
Co-authored-by: Jeff Daily jeff.daily@amd.com
Co-authored-by: moonshadow-25 moonshadow-25@users.noreply.github.com
Co-authored-by: Vighanesh Sharma vighaneshsharma@gmail.com

Test Plan:
```
fbpkg build //aps_models/ads/ecosystem/eval/cogwheel_tests/amd:cogwheel_aps_ads_icvr_kd_eval_amd_test_harness --build-remote
```

https://www.internalfb.com/sandcastle/workflow/1049338713192153464

Differential Revision: D97211385

cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @jataylo @hongxiayang @naromero77amd @pragupta @jerrymannil @xinyazhang @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @kadeng @chauhang @amjames @Lucaskabela


pytorch-bot bot commented Mar 20, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/177974

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (3 Unrelated Failures)

As of commit a250f65 with merge base 47ae16a:

BROKEN TRUNK - The following jobs failed but were also present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

UNSTABLE - The following jobs are marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the ci-no-td, ciflow/inductor, ciflow/rocm-mi300, ciflow/torchtitan, module: dynamo, and module: rocm labels Mar 20, 2026

pytorch-bot bot commented Mar 20, 2026

This PR needs a release notes: label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.


meta-codesync bot commented Mar 20, 2026

@haoyuz has exported this pull request. If you are a Meta employee, you can view the originating Diff in D97211385.

@pytorch-bot pytorch-bot bot added the ciflow/trunk label Mar 20, 2026
@facebook-github-tools

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

@pytorchmergebot
Collaborator

Merge failed

Reason: This PR needs a release notes: label
If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Details for Dev Infra team Raised by workflow job


@jeffdaily jeffdaily added the release notes: rocm label Mar 23, 2026
@jeffdaily
Collaborator

@pytorchbot merge

(Initiating merge automatically since Phabricator Diff has merged)

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team


pragupta pushed a commit to pragupta/pytorch that referenced this pull request Mar 26, 2026

Pull Request resolved: pytorch#177974
Approved by: https://github.com/jeffdaily, https://github.com/echen4096

(cherry picked from commit 5792701)

pragupta pushed a commit to pragupta/pytorch that referenced this pull request Mar 26, 2026
pragupta added a commit to ROCm/pytorch (#3106) that referenced this pull request Mar 27, 2026
Alkaid-Benetnash pushed a commit to Alkaid-Benetnash/pytorch that referenced this pull request Mar 28, 2026
Alkaid-Benetnash pushed a commit to Alkaid-Benetnash/pytorch that referenced this pull request Mar 28, 2026
AaronWang04 pushed a commit to AaronWang04/pytorch that referenced this pull request Mar 31, 2026
nklshy-aws pushed a commit to nklshy-aws/pytorch that referenced this pull request Apr 7, 2026

Labels

ci-no-td (Do not run TD on this PR), ciflow/inductor, ciflow/rocm-mi300 (Trigger "default" config CI on ROCm MI300), ciflow/torchtitan (Run TorchTitan integration tests), ciflow/trunk (Trigger trunk jobs on your pull request), fb-exported, Merged, meta-exported, module: dynamo, module: rocm (AMD GPU support for Pytorch), release notes: rocm (mandatorylabel)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Test: TestMemPool.test_mempool_expandable TestModule: test/test_cuda.py::TestMemPool

4 participants