Revert "[Helion + torch.compile] Fix MultiOutput write deps to eliminate fusion workarounds (#177062)" #177359
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/177359
Note: Links to docs will display an error until the docs builds have been completed.
⏳ No Failures, 68 Pending
As of commit 83daeaf with merge base b989649.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot merge -f "revert PR"
Merge started
Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).
Questions? Feedback? Please reach out to the PyTorch DevX Team
… score matching (#177302)

Give MultiOutput proper MemoryDep writes derived from its own FixedLayout instead of inheriting StarDep from InputsKernel. This removes the hack in FusedSchedulerNode.fuse() that copied index expressions from the template parent.

Extend score_fusion_memory to use name-based dep matching for templates so that views/reshapes between template outputs and epilogues do not block fusion.

This PR also now contains the changes from #177359, with proper fixes to avoid breaking internal tests.

Pull Request resolved: #177302
Approved by: https://github.com/shunting314, https://github.com/jansel
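To illustrate why name-based matching matters in the commit message above, here is a minimal toy sketch. It is not Inductor code: `ToyDep`, `score_exact`, and `score_by_name` are hypothetical names made up for this example, index expressions are plain strings, and the "score" is a simple count rather than whatever `score_fusion_memory` actually computes. The assumption being illustrated is only that a view/reshape can change the index expression an epilogue uses to read a template output, so a comparison on full deps finds nothing shared while a comparison on buffer names still does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyDep:
    """Hypothetical stand-in for a memory dependency: buffer name plus index expression."""
    name: str
    index: str  # index expression kept as a plain string for illustration


def score_exact(writes, reads):
    """Count deps that agree on both buffer name and index expression."""
    return len(set(writes) & set(reads))


def score_by_name(writes, reads):
    """Count buffers present on both sides, ignoring the index expression."""
    return len({d.name for d in writes} & {d.name for d in reads})


# The template writes buf1 with a 2D index; the epilogue reads the same
# buffer through a reshaped view, so its index expression differs.
template_writes = {ToyDep("buf1", "x0 + 64*x1")}
epilogue_reads = {ToyDep("buf1", "x2")}

exact = score_exact(template_writes, epilogue_reads)
by_name = score_by_name(template_writes, epilogue_reads)
print(exact, by_name)
```

In this toy model the exact comparison reports no shared dependency, which would make the pair look unrelated to a fusion heuristic, while the name-based comparison still sees that both nodes touch `buf1`.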
…ate fusion workarounds (pytorch#177062)" (pytorch#177359) This reverts commit 648a664. This should land after pytorch#177360. Pull Request resolved: pytorch#177359 Approved by: https://github.com/huydhn
This reverts commit 648a664.
This should land after #177360.
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @jataylo