[ROCm] Enable expandable segments #173330

Closed

pragupta wants to merge 12 commits into main from rocm_expandable_segments
Conversation

pragupta (Collaborator) commented Jan 25, 2026

pytorch-bot bot commented Jan 25, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/173330

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 13 Unrelated Failures

As of commit 7a4a90a with merge base 8be2451:

NEW FAILURE - The following job has failed:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

UNSTABLE - The following jobs are marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot bot added the ciflow/rocm-mi300 and module: rocm labels Jan 25, 2026
pragupta added the ciflow/trunk label Jan 25, 2026
github-actions bot (Contributor) commented:

This PR needs a release notes: label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

linux-foundation-easycla bot commented Jan 26, 2026

CLA Signed

The committers listed above are authorized under a signed CLA.

pragupta force-pushed the rocm_expandable_segments branch from 3b3ffe3 to e4e0d36 on January 26, 2026
bdhirsh added the triaged label Jan 26, 2026
pragupta marked this pull request as draft January 27, 2026
jeffdaily added the release notes: rocm label Feb 3, 2026
jeffdaily (Collaborator) commented:

We have found that for unit tests to fully pass, we need this HIP patch: ROCm/rocm-systems#3023.

jeffdaily (Collaborator) commented:

@pytorchbot rebase

pytorchmergebot (Collaborator) commented:

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

pytorchmergebot (Collaborator) commented:

Successfully rebased rocm_expandable_segments onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout rocm_expandable_segments && git pull --rebase)

pytorchmergebot force-pushed the rocm_expandable_segments branch from e4e0d36 to d6fed57 on February 3, 2026
jeffdaily marked this pull request as ready for review February 11, 2026
jeffdaily (Collaborator) commented:

@pytorchbot rebase

pytorchmergebot (Collaborator) commented:

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

pytorchmergebot (Collaborator) commented:

Successfully rebased rocm_expandable_segments onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout rocm_expandable_segments && git pull --rebase)

pytorchmergebot force-pushed the rocm_expandable_segments branch from 78869d4 to 922a0d4 on February 20, 2026
jeffdaily (Collaborator) commented:

@pytorchbot merge

pytorch-bot bot commented Feb 25, 2026

This PR needs to be approved by an authorized maintainer before merge.

jeffdaily (Collaborator) commented:

@pytorchbot merge -f "need to use force merge due to unrelated blocking failure, all other flaky CI is known; reason for revert has been addressed"

pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here

yangw-dev (Contributor) commented:

@pytorchbot revert -m "reverted internally, original:D96556656, revert diff: D96725665" -c ghfirst

pytorchmergebot (Collaborator) commented:

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

pytorchmergebot added a commit that referenced this pull request Mar 19, 2026
This reverts commit 088c5a7.

Reverted #173330 on behalf of https://github.com/yangw-dev due to: reverted internally, original: D96556656, revert diff: D96725665
pytorchmergebot (Collaborator) commented:

@pragupta your PR has been successfully reverted.

huydhn (Contributor) commented Mar 19, 2026

Let me import to reland this

meta-codesync bot commented Mar 19, 2026

@huydhn has imported this pull request. If you are a Meta employee, you can view this in D97339294.

darren-amd added a commit to ROCm/TheRock that referenced this pull request Mar 19, 2026
## Motivation

Fixes #3962

- The `rocprofiler-sdk` shared library is not being preloaded, causing
`librocprofiler-sdk.so.1` to be missing at runtime. This is because the
PyTorch `kineto` submodule was bumped which switched from `roctracer` to
`rocprofiler-sdk`: pytorch/pytorch#177101
- `test_mempool_expandable` was enabled on ROCm by
pytorch/pytorch#173330. This test was failing because it requires the
rocm[devel] packages, and it was crashing the runner:
https://github.com/ROCm/TheRock/actions/runs/23164829934/job/67321547840.
The test is already skipped for other torch versions.
- Also skip `test_mempool_empty_cache_inactive`,
`test_mempool_limited_memory_with_allocator`,
`test_deleted_mempool_not_used_on_oom`, and
`test_mempool_ctx_multithread` as these also require building
`dummy_allocator` and are skipped in other torch versions.

## Technical Details

- Adds `rocprofiler-sdk` to `LINUX_LIBRARY_PRELOADS` in
`build_prod_wheels.py` so that `librocprofiler-sdk.so` is loaded
- Registers `rocprofiler-sdk` as a `LibraryEntry` in `_dist_info.py` so
the `rocm_sdk` package can resolve the name to the actual `.so` file.

## Test Plan

- Verify that ROCm builds, that the nightly smoke tests pass, and that
running the torch tests does not crash the runner

## Test Result

- ROCm builds successfully:
https://github.com/ROCm/TheRock/actions/runs/23152017500
- Smoke tests pass for torch nightly and the runner is not crashing:
https://github.com/ROCm/TheRock/actions/runs/23253453219

## Submission Checklist

- [x] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
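
As an aside on what "preloading" buys here: loading the library with RTLD_GLOBAL before torch is imported means the dynamic loader can later resolve the `librocprofiler-sdk.so.1` SONAME from the already-loaded object instead of searching for the file. A minimal C++ illustration of the idea; the library name is taken from the commit above, but everything else is a hypothetical sketch, not TheRock's actual logic (which lives in `build_prod_wheels.py` and `_dist_info.py`):

```
// Sketch: why preloading a shared library fixes a missing-SONAME crash.
// Build: g++ preload_sketch.cpp -o preload_sketch -ldl
#include <dlfcn.h>
#include <cstdio>

int main() {
  // SONAME from the commit message; on a real system rocm_sdk resolves
  // the full path to the .so for us.
  const char* lib = "librocprofiler-sdk.so.1";

  // RTLD_GLOBAL makes the library's symbols (and its SONAME) visible to
  // every object loaded afterwards, e.g. torch's kineto bindings.
  void* handle = dlopen(lib, RTLD_NOW | RTLD_GLOBAL);
  if (!handle) {
    fprintf(stderr, "preload failed: %s\n", dlerror());
    return 1;
  }
  printf("preloaded %s; later dlopen()s can now resolve it\n", lib);
  return 0;
}
```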
pytorch-bot bot pushed a commit that referenced this pull request Mar 20, 2026
Summary:
Original pull request: #173330
Fixes #168737.
Fixes #168736.

The original diff enabled expandable segments for ROCm by adding `#ifdef USE_ROCM`
guards throughout CUDACachingAllocator.cpp to use HIP APIs (hipMemAddressReserve,
hipMemCreate, hipMemMap, etc.) instead of CUDA driver APIs when building for ROCm.

Root cause: In HIP/ROCm 6.2.1, the field name for memory allocation properties is
`requestedHandleType` (singular), not `requestedHandleTypes` (plural) as in CUDA.
Additionally, `hipMemHandleTypeFabric` does not exist in HIP, so the
`CU_MEM_HANDLE_TYPE_FABRIC` assignment must be skipped on ROCm.

Fix applied on top of the original diff (from D96652342):
- Use `prop.requestedHandleType = hipMemHandleTypePosixFileDescriptor` under
  `#ifdef USE_ROCM` (singular field name, HIP constant)
- Use `prop.requestedHandleTypes = CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR` for
  CUDA (plural field name, CUDA constant)
- Skip the `CU_MEM_HANDLE_TYPE_FABRIC` assignment entirely on ROCm under
  `#ifndef USE_ROCM`, as `hipMemHandleTypeFabric` does not exist in HIP

Co-authored-by: Prachi Gupta <prachi.gupta@amd.com>
Co-authored-by: Jeff Daily <jeff.daily@amd.com>
Co-authored-by: moonshadow-25 <moonshadow-25@users.noreply.github.com>
Co-authored-by: Vighanesh Sharma <vighaneshsharma@gmail.com>

Test Plan:
```
fbpkg build //aps_models/ads/ecosystem/eval/cogwheel_tests/amd:cogwheel_aps_ads_icvr_kd_eval_amd_test_harness --build-remote
```

https://www.internalfb.com/sandcastle/workflow/1049338713192153464

Differential Revision: D97211385
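
For reference, a minimal compilable sketch of the guard described in the commit message above. The helper function and the `useFabric` flag are illustrative, not from the PR; the real code is inlined in CUDACachingAllocator.cpp:

```
// Sketch of the ROCm/CUDA split from the commit message. Define USE_ROCM
// to take the HIP path.
#ifdef USE_ROCM
#include <hip/hip_runtime.h>
using AllocProp = hipMemAllocationProp;
#else
#include <cuda.h>
using AllocProp = CUmemAllocationProp;
#endif

static void setHandleType(AllocProp& prop, bool useFabric) {
#ifdef USE_ROCM
  // HIP (as of ROCm 6.2.1) names the field in the singular and has no
  // fabric handle type, so the fabric assignment is skipped entirely.
  prop.requestedHandleType = hipMemHandleTypePosixFileDescriptor;
  (void)useFabric; // hipMemHandleTypeFabric does not exist in HIP
#else
  // CUDA names the field in the plural.
  prop.requestedHandleTypes = CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR;
  if (useFabric) { // illustrative flag; requires CUDA 12.3+
    prop.requestedHandleTypes = CU_MEM_HANDLE_TYPE_FABRIC;
  }
#endif
}
```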
pytorchmergebot pushed a commit that referenced this pull request Mar 23, 2026
(Commit message identical to the one above.)

Pull Request resolved: #177974
Approved by: https://github.com/jeffdaily, https://github.com/echen4096
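
For readers unfamiliar with the mechanism: an expandable segment reserves a large virtual address range once, then commits and maps physical memory into it piece by piece, so the segment can grow without copying or re-fragmenting. A standalone sketch of that flow using the HIP virtual-memory APIs named in the commit message (hipMemAddressReserve, hipMemCreate, hipMemMap); this illustrates the API sequence under simplifying assumptions, not the allocator's actual logic:

```
// Sketch of the reserve/commit/map flow behind expandable segments.
// Build (ROCm): hipcc vmm_sketch.cpp -o vmm_sketch
#include <hip/hip_runtime.h>
#include <cstdio>
#include <cstdlib>

#define HIP_CHECK(x) do { hipError_t e = (x); if (e != hipSuccess) { \
  fprintf(stderr, "%s failed: %s\n", #x, hipGetErrorString(e)); \
  exit(1); } } while (0)

int main() {
  int device = 0;
  HIP_CHECK(hipSetDevice(device));

  hipMemAllocationProp prop{};
  prop.type = hipMemAllocationTypePinned;
  prop.location.type = hipMemLocationTypeDevice;
  prop.location.id = device;
  prop.requestedHandleType = hipMemHandleTypePosixFileDescriptor; // singular on HIP

  size_t granularity = 0;
  HIP_CHECK(hipMemGetAllocationGranularity(&granularity, &prop,
                                           hipMemAllocationGranularityMinimum));

  // Reserve a large virtual range up front (64 granules of address space,
  // no physical memory yet); 0 = default alignment.
  size_t reserved = 64 * granularity;
  void* base = nullptr;
  HIP_CHECK(hipMemAddressReserve(&base, reserved, 0, nullptr, 0));

  // "Grow" the segment by one granule: create physical memory, map it into
  // the reserved range, and enable read/write access from this device.
  hipMemGenericAllocationHandle_t handle;
  HIP_CHECK(hipMemCreate(&handle, granularity, &prop, 0));
  HIP_CHECK(hipMemMap(base, granularity, /*offset=*/0, handle, 0));

  hipMemAccessDesc access{};
  access.location.type = hipMemLocationTypeDevice;
  access.location.id = device;
  access.flags = hipMemAccessFlagsProtReadWrite;
  HIP_CHECK(hipMemSetAccess(base, granularity, &access, 1));

  // Touch the mapped portion to prove it is usable.
  HIP_CHECK(hipMemsetD8(reinterpret_cast<hipDeviceptr_t>(base), 0, granularity));

  // Teardown in reverse: unmap, release physical memory, free the reservation.
  HIP_CHECK(hipMemUnmap(base, granularity));
  HIP_CHECK(hipMemRelease(handle));
  HIP_CHECK(hipMemAddressFree(base, reserved));
  printf("mapped and released %zu bytes within a %zu-byte reservation\n",
         granularity, reserved);
  return 0;
}
```

On CUDA builds the same sequence uses the cuMem* driver equivalents with the plural `requestedHandleTypes` field, which is exactly the divergence the reland patches around.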
chiranjeevipattigidi pushed a commit to ROCm/TheRock that referenced this pull request Mar 23, 2026
(Commit message identical to the ROCm/TheRock commit above.)
pragupta (Collaborator, Author) commented:

Closing this one as it was relanded here: #177974

pragupta closed this Mar 24, 2026
pragupta pushed a commit to pragupta/pytorch that referenced this pull request Mar 26, 2026
…77974)

(Commit message identical to the reland above; cherry picked from commit 5792701.)
pragupta pushed a commit to pragupta/pytorch that referenced this pull request Mar 26, 2026
…77974)

(Commit message identical to the reland above; cherry picked from commit 5792701.)
pragupta added a commit to ROCm/pytorch that referenced this pull request Mar 27, 2026
…77974) (#3106)

(Commit message identical to the reland above; cherry picked from commit 5792701.)

Co-authored-by: Haoyu Zhang <haoyuz@meta.com>
Alkaid-Benetnash pushed a commit to Alkaid-Benetnash/pytorch that referenced this pull request Mar 28, 2026
…77974)

(Commit message identical to the reland above.)
Alkaid-Benetnash pushed a second commit to Alkaid-Benetnash/pytorch that referenced this pull request Mar 28, 2026, with the same commit message.
EmanueleCoradin pushed a commit to EmanueleCoradin/pytorch that referenced this pull request Mar 30, 2026
Pull Request resolved: pytorch#173330
Approved by: https://github.com/jeffdaily

Co-authored-by: Jeff Daily <jeff.daily@amd.com>

Labels

ci-no-td, ciflow/inductor, ciflow/rocm-mi300, ciflow/torchtitan, ciflow/trunk, Merged, module: dynamo, module: rocm, open source, release notes: rocm, Reverted, triaged


Development

Successfully merging this pull request may close these issues.

Test: TestMemPool.test_mempool_expandable (test/test_cuda.py::TestMemPool)