[Helion + torch.compile] Refactor TemplateBuffer as extensible base class#177063
yf225 wants to merge 5 commits into gh/yf225/135/base
Move common fields and methods up from TritonTemplateBuffer to TemplateBuffer so that all template subclasses (Triton, CuteDSL, external backends) share them:

- Add mutated_inputs, allowed_prologue_inps to TemplateBuffer.__init__
- Move mutation_outputs setup from TritonTemplateBuffer to base class
- Move get_outputs(), get_allowed_prologue_inps() up
- Extract _read_deps_from_inputs() helper from extract_read_writes()
- Remove can_fuse_multi_output_epilogue() (unused)
- Simplify TritonTemplateBuffer to delegate to super().__init__()
- Remove redundant self.outputs from CppTemplateBuffer

[ghstack-poisoned]
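The shape of this refactor can be illustrated with a small, self-contained sketch. It does not use the real Inductor classes: `Buf`, `TemplateBufferSketch`, `TritonTemplateBufferSketch`, and `ExternalTemplateBufferSketch` are hypothetical stand-ins, and the string records in `mutation_outputs` are placeholders for whatever IR nodes the actual code builds. The only point shown is that the base class owns the mutation and prologue bookkeeping, so a subclass constructor can simply delegate to `super().__init__()`.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Buf:
    """Hypothetical stand-in for an IR buffer; only the name is modeled."""

    name: str


class TemplateBufferSketch:
    """Toy base class (an assumption, not the real TemplateBuffer)."""

    def __init__(
        self,
        name: str,
        inputs: Iterable[Buf],
        mutated_inputs: Optional[Iterable[Buf]] = None,
        allowed_prologue_inps: Optional[Iterable[str]] = None,
    ) -> None:
        self.name = name
        self.inputs = list(inputs)
        self.mutated_inputs = list(mutated_inputs or [])
        self.allowed_prologue_inps = set(allowed_prologue_inps or ())
        # Built once in the base class: one placeholder record per mutated input.
        self.mutation_outputs = [
            f"{name}.mutates({buf.name})" for buf in self.mutated_inputs
        ]

    def get_allowed_prologue_inps(self) -> set[str]:
        return self.allowed_prologue_inps


class TritonTemplateBufferSketch(TemplateBufferSketch):
    """Subclass that only forwards its arguments to the base constructor."""

    def __init__(self, name, inputs, mutated_inputs=None, allowed_prologue_inps=None):
        super().__init__(
            name,
            inputs,
            mutated_inputs=mutated_inputs,
            allowed_prologue_inps=allowed_prologue_inps,
        )


class ExternalTemplateBufferSketch(TemplateBufferSketch):
    """Hypothetical external backend: inherits the shared bookkeeping unchanged."""


x, y = Buf("x"), Buf("y")
triton_buf = TritonTemplateBufferSketch(
    "buf0", [x, y], mutated_inputs=[y], allowed_prologue_inps={"x"}
)
external_buf = ExternalTemplateBufferSketch("buf1", [x])
print(triton_buf.mutation_outputs, external_buf.mutation_outputs)
```

Under these assumptions, a backend that never mutates its inputs gets an empty `mutation_outputs` list without writing any constructor code of its own.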
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/177063
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 Unrelated Failure)
As of commit ddfde89 with merge base 59b048f:
UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
…lass

Move common fields and methods from TritonTemplateBuffer up to TemplateBuffer so that external template backends (e.g. Helion) can reuse the same mutation-tracking and prologue-fusion infrastructure:

- Add mutated_inputs, allowed_prologue_inps params to TemplateBuffer.__init__
- Build mutation_outputs list in base class (parallel to ExternKernel.mutation_outputs)
- Move get_allowed_prologue_inps() to base class
- Extract _read_deps_from_inputs() helper from extract_read_writes()
- Remove can_fuse_multi_output_epilogue() (always returned False, unused)
- Simplify TritonTemplateBuffer.__init__() to delegate to super()

get_outputs() stays on TritonTemplateBuffer since it is the only subclass that currently passes mutated_inputs; other subclasses (CppTemplateBuffer, CuteDSLTemplateBuffer, etc.) manage their own output lists independently.

ghstack-source-id: 91e9bd0
Pull Request resolved: #177063
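Two points in this revised description lend themselves to a short illustration: the `_read_deps_from_inputs()` helper split out of `extract_read_writes()`, and `get_outputs()` staying on the Triton subclass. The sketch below is a toy model under stated assumptions, not the Inductor implementation. All class names ending in `Sketch` are hypothetical, inputs are plain strings, reads are modeled as a de-duplicated list of input names, and `ManualOutputsTemplateBufferSketch` stands in for any subclass that keeps its own output list.

```python
from __future__ import annotations


class TemplateBufferSketch:
    """Toy base class; not the real torch._inductor TemplateBuffer."""

    def __init__(
        self,
        name: str,
        inputs: list[str],
        mutated_inputs: list[str] | None = None,
    ) -> None:
        self.name = name
        self.inputs = list(inputs)
        self.mutated_inputs = list(mutated_inputs or [])
        # Placeholder records; the real code builds IR nodes here.
        self.mutation_outputs = [
            f"{name}.mutates({inp})" for inp in self.mutated_inputs
        ]

    def _read_deps_from_inputs(self) -> list[str]:
        # Extracted helper: one read per distinct input, in first-seen order.
        # (De-duplication is an assumption of this sketch.)
        return list(dict.fromkeys(self.inputs))

    def extract_read_writes(self) -> dict[str, list[str]]:
        # Subclasses can override this and still reuse the helper above.
        return {"reads": self._read_deps_from_inputs(), "writes": [self.name]}


class TritonTemplateBufferSketch(TemplateBufferSketch):
    def get_outputs(self) -> list[str]:
        # The template's own output plus one entry per mutated input.
        return [self.name, *self.mutation_outputs]


class ManualOutputsTemplateBufferSketch(TemplateBufferSketch):
    """Hypothetical subclass that manages its own output list."""

    def __init__(self, name: str, inputs: list[str], outputs: list[str]) -> None:
        super().__init__(name, inputs)
        self.outputs = list(outputs)

    def get_outputs(self) -> list[str]:
        return self.outputs


triton_buf = TritonTemplateBufferSketch("buf0", ["x", "y", "x"], mutated_inputs=["y"])
manual_buf = ManualOutputsTemplateBufferSketch("buf1", ["x"], outputs=["buf1", "buf1_aux"])
print(triton_buf.extract_read_writes(), triton_buf.get_outputs(), manual_buf.get_outputs())
```

Keeping `get_outputs()` off the base class in this model means a subclass with its own output list is never forced to reconcile it with `mutation_outputs`.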
@pytorchbot merge -f "unrelated failures"
Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).
Move common fields and methods from TritonTemplateBuffer up to TemplateBuffer so that external template backends (e.g. Helion) can reuse the same mutation-tracking and prologue-fusion infrastructure:

- Add mutated_inputs, allowed_prologue_inps params to TemplateBuffer.__init__
- Build mutation_outputs list in base class (parallel to ExternKernel.mutation_outputs)
- Move get_allowed_prologue_inps() to base class
- Extract _read_deps_from_inputs() helper from extract_read_writes()
- Remove can_fuse_multi_output_epilogue() (always returned False, unused)
- Simplify TritonTemplateBuffer.__init__() to delegate to super()

get_outputs() stays on TritonTemplateBuffer since it is the only subclass that currently passes mutated_inputs; other subclasses (CppTemplateBuffer, CuteDSLTemplateBuffer, etc.) manage their own output lists independently.

ghstack-source-id: 64bafb1
Pull Request resolved: pytorch#177063
…e base class (#177063)" (#177360)

This reverts commit f72b01e.

Pull Request resolved: #177360
Approved by: https://github.com/huydhn
…ble base class (#177367)

This is a reland of #177063.

Move common fields and methods from TritonTemplateBuffer up to TemplateBuffer so that external template backends (e.g. Helion) can reuse the same mutation-tracking and prologue-fusion infrastructure:

- Add mutated_inputs, allowed_prologue_inps params to TemplateBuffer.__init__
- Build mutation_outputs list in base class (parallel to ExternKernel.mutation_outputs)
- Move get_allowed_prologue_inps() to base class
- Extract _read_deps_from_inputs() helper from extract_read_writes()
- Remove can_fuse_multi_output_epilogue() (always returned False, unused)
- Simplify TritonTemplateBuffer.__init__() to delegate to super()

get_outputs() stays on TritonTemplateBuffer since it is the only subclass that currently passes mutated_inputs; other subclasses (CppTemplateBuffer, CuteDSLTemplateBuffer, etc.) manage their own output lists independently.

Pull Request resolved: #177367
Approved by: https://github.com/shunting314
ghstack dependencies: #177302
…lass (pytorch#177063)

Move common fields and methods up from TritonTemplateBuffer to TemplateBuffer so that all template subclasses (Triton, CuteDSL, external backends) share them:

- Add mutated_inputs, allowed_prologue_inps to TemplateBuffer.__init__
- Move mutation_outputs setup from TritonTemplateBuffer to base class
- Move get_outputs(), get_allowed_prologue_inps() up
- Extract _read_deps_from_inputs() helper from extract_read_writes()
- Remove can_fuse_multi_output_epilogue() (unused)
- Simplify TritonTemplateBuffer to delegate to super().__init__()
- Remove redundant self.outputs from CppTemplateBuffer

Pull Request resolved: pytorch#177063
Approved by: https://github.com/jansel
ghstack dependencies: pytorch#177062
Stack from ghstack (oldest at bottom):
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @jataylo