[coor-targets] Accept ProcessGroup directly in functional collective op schemas #172795
aorenste wants to merge 13 commits into gh/aorenste/184/base
…ring or ProcessGroup inputs [ghstack-poisoned]
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/172795
Note: Links to docs will display an error until the docs builds have been completed.
⏳ 1 Pending, 1 Unrelated Failure
As of commit 2080c58 with merge base 79546f5:
FLAKY - The following job failed but was likely due to flakiness present on trunk: …
This comment was automatically generated by Dr. CI and updates every 15 minutes.
…ake both string or ProcessGroup inputs" [ghstack-poisoned]
…collective op schemas" Change _c10d_functional and _dtensor op schemas from `str group_name` to `Any group` so callers can pass either a string name (resolved via the group registry) or a ProcessGroup object directly, avoiding an unnecessary registry round-trip. Each C++ collective gets a ProcessGroup-typed overload (the real implementation) and a string-typed overload (thin wrapper that resolves and delegates). New dispatch functions handle the runtime `Any` by checking whether the IValue is a string or capsule. On the Python side, `_resolve_group()` replaces `_resolve_group_name()` and returns ProcessGroup objects when possible. The AOTInductor proxy executor gains AnyType argument handling. This is intended to be used by compile-on-one-rank to avoid turning ProcessGroups into constant strings. [ghstack-poisoned]
…op schemas. Authored with Claude. ghstack-source-id: 527fedc Pull Request resolved: #172795
…op schemas. Authored with Claude. ghstack-source-id: 04a2d91 Pull Request resolved: #172795
…op schemas. Authored with Claude. ghstack-source-id: b906563 Pull Request resolved: #172795
…op schemas. Authored with Claude. ghstack-source-id: 05d9dbf Pull Request resolved: #172795
…ring or ProcessGroup inputs ghstack-source-id: ae449aa Pull Request resolved: pytorch/pytorch#172795
…op schemas. Authored with Claude. ghstack-source-id: 94fd9b1 Pull Request resolved: #172795
Claude finished @pytorch-auto-revert[bot]'s task in 11m 10s
Diagnosis of Revert for PR #172795
Failing Tests
All failures are in …
Root Cause
These tests use …
# From test_compiled_isend_graph (line 213):
tensor: "f32[0]" = torch.ops._c10d_functional.isend(l_tensor_, 1, 0, '0')
The PR changed op schemas from … The critical code path is: …
The PR introduced … The most likely issue is that the PR changed something in how the P2P compilation path passes the group argument, either: …
Note: I was unable to access the actual CI logs (the jobs show …)
Is the Revert Legitimate?
Yes, this is a legitimate revert. The PR broke 5 existing P2P compilation tests that were not updated to match the new graph output format. These tests verify the compiled graph structure for distributed P2P operations, and graph correctness is critical.
How to Fix
The author should: …
The cleanest fix is likely option (2): keep P2P ops using …
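The diagnosis above could not consult the CI logs, so the following is only an illustration of the failure class it describes: a test that compares a printed graph line against a fixed expected string is sensitive to how the group argument is rendered. `render_call`, `Name`, and `FakeGroup` are hypothetical helpers written for this sketch, not PyTorch APIs, and the sketch does not claim to reproduce the actual cause of the revert.

```python
class Name:
    """Hypothetical placeholder for a graph input; prints as a bare identifier."""

    def __init__(self, text):
        self.text = text

    def __repr__(self):
        return self.text


class FakeGroup:
    """Hypothetical stand-in for a ProcessGroup object (illustration only)."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"<FakeGroup {self.name}>"


def render_call(target, args):
    """Render a call as `target(arg, arg, ...)` using repr() for each argument."""
    return f"{target}({', '.join(repr(a) for a in args)})"


# Expected text taken from the graph line quoted in the diagnosis.
EXPECTED = "torch.ops._c10d_functional.isend(l_tensor_, 1, 0, '0')"

TARGET = "torch.ops._c10d_functional.isend"
with_name = render_call(TARGET, [Name("l_tensor_"), 1, 0, "0"])
with_object = render_call(TARGET, [Name("l_tensor_"), 1, 0, FakeGroup("0")])

print(with_name == EXPECTED)    # the string-named group matches the expectation
print(with_object == EXPECTED)  # an object-valued group renders differently
```

If the group reaches the traced graph as anything other than the string `'0'`, an exact-match assertion of this kind fails even when the runtime behaviour is unchanged.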
@pytorchbot successfully started a revert job.
@aorenste your PR has been successfully reverted.
…lective op schemas (#172795)" This reverts commit 71352fa. Reverted #172795 on behalf of https://github.com/pytorch-auto-revert due to Reverted automatically by pytorch's autorevert, to avoid this behaviour add the tag autorevert: disable ([comment](#172795 (comment)))
@pytorchbot merge
Merge started
Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Pull Request resolved: #177446 Approved by: https://github.com/Skylion007 ghstack dependencies: #172795
…op schemas (pytorch#172795) Change _c10d_functional and _dtensor op schemas from `str group_name` to `Any group` so callers can pass either a string name (resolved via the group registry) or a ProcessGroup object directly, avoiding an unnecessary registry round-trip. Each C++ collective gets a ProcessGroup-typed overload (the real implementation) and a string-typed overload (thin wrapper that resolves and delegates). New dispatch functions handle the runtime `Any` by checking whether the IValue is a string or capsule. On the Python side, `_resolve_group()` replaces `_resolve_group_name()` and returns ProcessGroup objects when possible. The AOTInductor proxy executor gains AnyType argument handling. This is intended to be used by compile-on-one-rank to avoid turning ProcessGroups into constant strings. Pull Request resolved: pytorch#172795 Approved by: https://github.com/angelayi
Pull Request resolved: pytorch#177446 Approved by: https://github.com/Skylion007 ghstack dependencies: pytorch#172795
…lective op schemas (pytorch#172795)" This reverts commit 71352fa. Reverted pytorch#172795 on behalf of https://github.com/pytorch-auto-revert due to Reverted automatically by pytorch's autorevert, to avoid this behaviour add the tag autorevert: disable ([comment](pytorch#172795 (comment)))
Change _c10d_functional and _dtensor op schemas from `str group_name` to `Any group` so callers can pass either a string name (resolved via the group registry) or a ProcessGroup object directly, avoiding an unnecessary registry round-trip.
Each C++ collective gets a ProcessGroup-typed overload (the real implementation) and a string-typed overload (thin wrapper that resolves and delegates). New dispatch functions handle the runtime `Any` by checking whether the IValue is a string or capsule.
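The overload-plus-dispatch arrangement described above can be modelled in a few lines of Python. This is a sketch of the pattern only, not the C++ code from the PR: `FakeProcessGroup`, `GROUP_REGISTRY`, and the three `all_reduce_*` functions are hypothetical names, and the "reduction" is a toy that multiplies by the group size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeProcessGroup:
    """Hypothetical stand-in for a ProcessGroup; not the real c10d class."""
    name: str
    size: int


# Hypothetical registry mapping group names to group objects.
GROUP_REGISTRY = {}


def register_group(group):
    GROUP_REGISTRY[group.name] = group


def all_reduce_pg(values, group):
    """Group-typed overload: the real implementation lives here.

    Toy behaviour: pretend every rank contributed the same values.
    """
    return [v * group.size for v in values]


def all_reduce_str(values, group_name):
    """String-typed overload: resolve the name, then delegate."""
    return all_reduce_pg(values, GROUP_REGISTRY[group_name])


def all_reduce_any(values, group):
    """Dispatch on the runtime type of the `Any`-typed group argument."""
    if isinstance(group, str):
        return all_reduce_str(values, group)
    if isinstance(group, FakeProcessGroup):
        return all_reduce_pg(values, group)
    raise TypeError(f"expected a group name or group object, got {type(group).__name__}")


pg = FakeProcessGroup(name="0", size=4)
register_group(pg)
print(all_reduce_any([1, 2], "0"))  # goes through the registry lookup
print(all_reduce_any([1, 2], pg))   # skips the registry entirely
```

Keeping the logic in the group-typed function means the string path adds only a lookup, which is the round-trip that passing the object directly avoids.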
On the Python side, `_resolve_group()` replaces `_resolve_group_name()` and returns ProcessGroup objects when possible. The AOTInductor proxy executor gains AnyType argument handling.
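To make the difference between the two resolvers concrete, here is a minimal sketch under stated assumptions. `FakeGroup`, `REGISTRY`, `resolve_group_name_sketch`, and `resolve_group_sketch` are hypothetical and deliberately not named after the real helpers. In particular, falling back to the unchanged string when a name is not registered is this sketch's guess at what "when possible" means, not confirmed behaviour.

```python
class FakeGroup:
    """Hypothetical stand-in for ProcessGroup (illustration only)."""

    def __init__(self, name):
        self.group_name = name


# Hypothetical name -> group registry.
REGISTRY = {}


def resolve_group_name_sketch(group):
    """Old-style resolution: always collapse the input to a string name."""
    if isinstance(group, FakeGroup):
        return group.group_name
    if isinstance(group, str):
        return group
    raise TypeError(f"unsupported group type: {type(group).__name__}")


def resolve_group_sketch(group):
    """New-style resolution: prefer returning the group object itself."""
    if isinstance(group, FakeGroup):
        return group
    if isinstance(group, str):
        # Assumption: an unregistered name is passed through unchanged.
        return REGISTRY.get(group, group)
    raise TypeError(f"unsupported group type: {type(group).__name__}")


pg = FakeGroup("0")
REGISTRY["0"] = pg

print(resolve_group_name_sketch(pg))      # the object is reduced to its name
print(resolve_group_sketch(pg) is pg)     # the object is kept as-is
print(resolve_group_sketch("0") is pg)    # a registered name yields the object
```

Returning the object rather than its name is what lets a caller hand the group straight to an op whose schema accepts `Any group`.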
This is intended to be used by compile-on-one-rank to avoid turning
ProcessGroups into constant strings.
Stack from ghstack (oldest at bottom):