Implement c10d::reduce_scatter_ (list API) in LocalTensorMode #175710
ezyang wants to merge 5 commits into gh/ezyang/3274/base
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/175710
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure, 3 Unrelated Failures as of commit d1a08d4 with merge base c15f33d.
NEW FAILURE - The following job has failed:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
The list-based reduce_scatter_ op was the only reduce_scatter variant missing from LocalTensorMode's dispatch table. This is the variant used by `dist.reduce_scatter(output, input_list)`, which is the only API that supports uneven split sizes across ranks. Authored with Claude. ghstack-source-id: 7b294c1 Pull-Request: #175710
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: 1 mandatory check(s) failed. Dig deeper by viewing the failures on hud.
@claude fix lint and trigger merge
Claude finished @ezyang's task in 2m 44s — Fixing lint and triggering merge
Force-pushed from 636e102 to c6435b8.
The list-based reduce_scatter_ op was the only reduce_scatter variant missing from LocalTensorMode's dispatch table. This is the variant used by `dist.reduce_scatter(output, input_list)`, which is the only API that supports uneven split sizes across ranks. Authored with Claude. ghstack-source-id: 7e81d41 Pull-Request: #175710
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
@pytorchbot revert -m "Reverted automatically by pytorch's autorevert, to avoid this behaviour add the tag autorevert: disable" -c autorevert

This PR is attributed to have caused regression in:

Please investigate and fix the issues.
Claude finished @pytorch-auto-revert[bot]'s task in 1m 1s — Revert Diagnosis
Root Cause

The lint failure is RUFF S101 ("Use of `assert` detected").

Is the revert legitimate?

Yes, this is a legitimate revert. The lint failure is a real issue: the rest of the file consistently uses the `if ... raise AssertionError` pattern instead of bare `assert` statements.

How to fix

Replace the bare `assert` statements:

```python
# Lines 619-620
assert len(output_tensors) == 1
assert len(input_tensors) == 1
```

with:

```python
if len(output_tensors) != 1:
    raise AssertionError
if len(input_tensors) != 1:
    raise AssertionError
```

```python
# Line 628
assert isinstance(output_tensor, LocalTensor), "Output tensor must be a LocalTensor"
```

with:

```python
if not isinstance(output_tensor, LocalTensor):
    raise AssertionError("Output tensor must be a LocalTensor")
```

```python
# Lines 629-631
assert len(input_list) == len(ranks), (
    f"Number of input chunks ({len(input_list)}) must match number of ranks ({len(ranks)})"
)
```

with:

```python
if len(input_list) != len(ranks):
    raise AssertionError(
        f"Number of input chunks ({len(input_list)}) must match number of ranks ({len(ranks)})"
    )
```

```python
# Line 641
assert isinstance(chunk, LocalTensor), "Input chunk must be a LocalTensor"
```

with:

```python
if not isinstance(chunk, LocalTensor):
    raise AssertionError("Input chunk must be a LocalTensor")
```

This matches the established pattern used by every other function in the file.
@pytorchbot successfully started a revert job. Check the current status here. |
Revert "Implement c10d::reduce_scatter_ (list API) in LocalTensorMode (#175710)"

This reverts commit 90035af. Reverted #175710 on behalf of https://github.com/pytorch-auto-revert due to: Reverted automatically by pytorch's autorevert, to avoid this behaviour add the tag autorevert: disable ([comment](#175710 (comment)))
@ezyang your PR has been successfully reverted. |
The list-based reduce_scatter_ op was the only reduce_scatter variant missing from LocalTensorMode's dispatch table. This is the variant used by `dist.reduce_scatter(output, input_list)`, which is the only API that supports uneven split sizes across ranks. Authored with Claude. ghstack-source-id: 55246e5 Pull-Request: #175710
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: 1 job has failed, first few of them are: trunk / linux-jammy-rocm-py3.10 / test (default, 5, 6, linux.rocm.gpu.gfx950.1). Details for Dev Infra team: Raised by workflow job.
Implement c10d::reduce_scatter_ (list API) in LocalTensorMode (pytorch#175710)

The list-based reduce_scatter_ op was the only reduce_scatter variant missing from LocalTensorMode's dispatch table. This is the variant used by `dist.reduce_scatter(output, input_list)`, which is the only API that supports uneven split sizes across ranks. Authored with Claude. Pull Request resolved: pytorch#175710 Approved by: https://github.com/dzmitry-huba
Revert "Implement c10d::reduce_scatter_ (list API) in LocalTensorMode (pytorch#175710)"

This reverts commit 90035af. Reverted pytorch#175710 on behalf of https://github.com/pytorch-auto-revert due to: Reverted automatically by pytorch's autorevert, to avoid this behaviour add the tag autorevert: disable ([comment](pytorch#175710 (comment)))
Stack from ghstack (oldest at bottom):
The list-based reduce_scatter_ op was the only reduce_scatter variant missing from LocalTensorMode's dispatch table. This is the variant used by `dist.reduce_scatter(output, input_list)`, which is the only API that supports uneven split sizes across ranks.

Authored with Claude.
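The uneven-split semantics described above can be illustrated with a single-process sketch (a hypothetical simulation of what the collective computes, not the actual torch.distributed implementation): rank r's output is the elementwise sum, over all source ranks, of each source's r-th input chunk, and chunk sizes may differ per destination rank.

```python
def simulated_reduce_scatter(input_lists):
    """Simulate the list-based reduce_scatter over plain lists of numbers.

    input_lists[src][dst] is the chunk that rank `src` contributes toward
    rank `dst`. Returns outputs where outputs[dst] is the elementwise sum
    of every rank's dst-th chunk.
    """
    world_size = len(input_lists)
    outputs = []
    for dst in range(world_size):
        chunks = [input_lists[src][dst] for src in range(world_size)]
        # All contributions to one destination must have equal length,
        # but lengths may differ across destinations (uneven split).
        outputs.append([sum(vals) for vals in zip(*chunks)])
    return outputs


# Two ranks, uneven split: rank 0 receives a 1-element chunk,
# rank 1 receives a 3-element chunk.
inputs = [
    [[1], [2, 3, 4]],      # rank 0's contributions
    [[10], [20, 30, 40]],  # rank 1's contributions
]
print(simulated_reduce_scatter(inputs))  # [[11], [22, 33, 44]]
```

The single-tensor `reduce_scatter_tensor` variant cannot express this case, since it splits one flat input into equal-sized chunks.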