[c10] Move P2P access logic from ATen to c10#174582
ngimel wants to merge 2 commits into gh/ngimel/1/base
Conversation
Refactor PeerToPeerAccess by moving the core implementation from aten/src/ATen/cuda to c10/cuda. This makes P2P and fabric access queries available at the c10 layer without requiring ATen dependencies. The ATen layer now provides thin wrappers that ensure CUDA lazy initialization before forwarding to c10. This separation allows lower-level CUDA code to query P2P capabilities without pulling in ATen context machinery.

Differential Revision: [D92675476](https://our.internmc.facebook.com/intern/diff/D92675476/)
```cpp
TORCH_CHECK(
    num_devices_ >= 0,
    "p2p access cache not initialized. "
    "Ensure c10::cuda::detail::init_p2p_access_cache() is called first.");
```
Some of these should TORCH_CHECK_VALUE but that should be another PR
@pytorchbot merge
Merge failed. Reason: Not merging any PRs at the moment because there is a merge blocking [ci: sev](https://github.com/pytorch/pytorch/labels/ci:%20sev) issue open at:
Details for Dev Infra team: raised by workflow job.
@pytorchmergebot merge

Merge failed. Reason: Not merging any PRs at the moment because there is a merge blocking [ci: sev](https://github.com/pytorch/pytorch/labels/ci:%20sev) issue open at:
Details for Dev Infra team: raised by workflow job.
@pytorchbot merge
Merge failed. Reason: This PR has internal changes and must be landed via Phabricator! Please try reimporting/re-exporting the PR!
Details for Dev Infra team: raised by workflow job.
@pytorchbot merge -i
(Initiating merge automatically since the Phabricator diff has merged; merging with -i because OSS signals were bypassed internally.)
Merge failed. Reason: This PR has internal changes and must be landed via Phabricator! Please try reimporting/re-exporting the PR!
Details for Dev Infra team: raised by workflow job.
@pytorchbot merge -f 'merged internally'
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
@pytorchbot revert -m "Reverted automatically by pytorch's autorevert, to avoid this behaviour add the tag autorevert: disable" -c autorevert

This PR is attributed to have caused a regression in: Please investigate and fix the issues.
Claude finished @pytorch-auto-revert[bot]'s task in 42s (View job)

Analyzing Revert Request

Diagnosis: The revert is legitimate. The CI failure is a simple lint error caused by trailing whitespace.

File:

```
    ("c10/cuda/PeerToPeerAccess.h", "c10/hip/PeerToPeerAccess.h"),    <- trailing spaces here
```

The linter flagged these trailing spaces.

How to fix: This is a trivial fix; just remove the trailing spaces and the PR should pass CI.
@pytorchbot successfully started a revert job. Check the current status here. |
This reverts commit c1bc0e9. Reverted #174582 on behalf of https://github.com/pytorch-auto-revert due to Reverted automatically by pytorch's autorevert, to avoid this behaviour add the tag autorevert: disable ([comment](#174582 (comment)))
@ngimel your PR has been successfully reverted.
@pytorchbot merge -i
(Initiating merge automatically since the Phabricator diff has merged; merging with -i because OSS signals were bypassed internally.)
Merge started. Your change will be merged while ignoring the following 3 checks: Lint / lintrunner-noclang-all / linux-job, trunk / macos-py3-arm64 / build, Meta Internal-Only Changes Check. Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Original diff by @minsii: pytorch#173571. Differential Revision: [D92675476](https://our.internmc.facebook.com/intern/diff/D92675476/). Pull Request resolved: pytorch#174582. Approved by: https://github.com/Skylion007
Stack from ghstack (oldest at bottom):
Original diff by @minsii #173571
Differential Revision: D92675476