handle multiple reductions in node splits & read/write normalization #168013
eellison wants to merge 4 commits into gh/eellison/872/base
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/168013
Note: links to docs will display an error until the docs builds have been completed. ✅ You can merge normally! (9 unrelated failures.) As of commit 9a2a344 with merge base 015826f. FLAKY: the following jobs failed but were likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot rebase

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict.

Rebase failed. Raised by https://github.com/pytorch/pytorch/actions/runs/19477738026
@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.

Merge failed. Reason: 1 job has failed: inductor / inductor-test / test (inductor_timm, 1, 2, linux.g5.4xlarge.nvidia.gpu). Details for Dev Infra team: raised by workflow job.
ghstack-source-id: 5d2ab07 Pull Request resolved: pytorch/pytorch#168013
@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.

Merge failed. Reason: 2 jobs have failed, first few of them are: trunk / linux-jammy-rocm-py3.10 / test (default, 3, 6, linux.rocm.gpu.gfx942.1), trunk / linux-jammy-rocm-py3.10 / test (default, 1, 6, linux.rocm.gpu.gfx942.1). Details for Dev Infra team: raised by workflow job.
@pytorchbot merge -i

Merge started. Your change will be merged while ignoring the following 2 checks: trunk / linux-jammy-rocm-py3.10 / test (default, 3, 6, linux.rocm.gpu.gfx942.1), trunk / linux-jammy-rocm-py3.10 / test (default, 1, 6, linux.rocm.gpu.gfx942.1). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
@pytorchbot merge -i "rocm tests failing"

❌ 🤖 pytorchbot command failed: Try

@pytorchbot merge -i "all rocm tests failing, unrelated"

❌ 🤖 pytorchbot command failed: Try

@pytorchbot merge -i
Pull Request resolved: pytorch#168013 Approved by: https://github.com/shunting314
Stack from ghstack (oldest at bottom):
Another partial fix to #166653:
We had not yet handled multiple reduction vars in tiling splits, which led to the coalesce analysis not seeing the vars as coalesced. See the updated P2043574063, which shows the kernel with correct coalescing by rblock (private link because the model is private).
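To illustrate the failure mode, here is a minimal sketch of why an analysis that tracks only a single reduction variable misses coalescing after a tiling split. All names (`split_reduction_var`, `is_coalesced`, `r_outer`, `r_inner`) are hypothetical and do not reflect Inductor's actual API; the assumption is that a tiling split rewrites one reduction variable `r` into an outer and an inner variable, and an access is coalesced by the reduction block when some reduction variable moves the address with stride 1.

```python
# Hypothetical sketch, not PyTorch Inductor's real implementation.
# A linear index is represented as a dict {var_name: stride coefficient}.

def split_reduction_var(coeffs, var, inner_size):
    """Split `var` into (var_outer, var_inner): the term coeff * var
    becomes coeff * inner_size * var_outer + coeff * var_inner."""
    coeff = coeffs.pop(var)
    coeffs[f"{var}_outer"] = coeff * inner_size
    coeffs[f"{var}_inner"] = coeff
    return coeffs

def is_coalesced(coeffs, reduction_vars):
    """Coalesced by the reduction block if *some* reduction variable
    advances the address with stride 1. Checking only one designated
    reduction var would miss accesses that are contiguous in the other."""
    return any(coeffs.get(v) == 1 for v in reduction_vars)

# Index x * 128 + r, contiguous in the reduction var r.
index = {"x": 128, "r": 1}
# Tiling split: r (size 128) -> r_outer (size 16) * 8 + r_inner (size 8).
index = split_reduction_var(index, "r", inner_size=8)

print(is_coalesced(index, ["r_outer", "r_inner"]))  # True: r_inner has stride 1
print(is_coalesced(index, ["r_outer"]))             # False: the split is invisible
```

The fix described above corresponds to the first call: once all reduction variables produced by the split are considered, the stride-1 inner variable is found and the access is correctly reported as coalesced by rblock.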
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @jataylo @chenyang78