[inductor] Use custom triton kernel subclass when available#167456
kundaMwiza wants to merge 22 commits into pytorch:main
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/167456
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 Unrelated Failure) As of commit b342108 with merge base ed18b31. FLAKY - The following job failed but was likely due to flakiness present on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot label "topic: not user facing"
jansel left a comment:
Failing tests?
Is there a test we could add to check this new behavior?
Force-pushed: e7d7329 to 04cc3ca
```python
filename=None,
inductor_meta=None,
custom_kernel=False,
caching_autotuner_cls: type[CachingAutotuner] = CachingAutotuner,
```

Allows custom heuristics modules to pass in their subclasses.
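A minimal sketch of how this hook could be used. The class and parameter names mirror the diff, but the implementations are stand-ins, not the real inductor code:

```python
# Stand-in for torch._inductor.runtime.triton_heuristics.CachingAutotuner;
# the real class also handles compilation, benchmarking, and caching.
class CachingAutotuner:
    def __init__(self, inductor_meta):
        self.inductor_meta = inductor_meta

# A hypothetical backend-specific subclass that understands extra
# inductor_meta keys ("npu_device_hint" is illustrative, not a real key).
class NPUCachingAutotuner(CachingAutotuner):
    def __init__(self, inductor_meta):
        super().__init__(inductor_meta)
        self.device_hint = inductor_meta.get("npu_device_hint", "default")

# Sketch of the factory: it instantiates whichever autotuner class the
# caller's heuristics module passed in, defaulting to the base class.
def cached_autotune(configs, inductor_meta=None,
                    caching_autotuner_cls=CachingAutotuner):
    return caching_autotuner_cls(inductor_meta or {})

tuner = cached_autotune([], inductor_meta={"npu_device_hint": "vector"},
                        caching_autotuner_cls=NPUCachingAutotuner)
print(type(tuner).__name__)  # NPUCachingAutotuner
```

The key point is that the default keeps existing callers unchanged, while out-of-tree heuristics modules opt in by passing their subclass.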
```python
def define_subgraph_launcher_fn(self, name: str, subgraph_code):
    self.subgraph_definitions.splice(subgraph_code.value)
```
```python
@classmethod
```

I couldn't put the kernel type on the class because there would be an import cycle. Happy to know of other alternatives to this.
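A minimal sketch of why a classmethod sidesteps the cycle: a module-level class attribute would force the kernel's module to be imported at import time, which can be circular, while a classmethod defers the import to call time. Module and class names below are illustrative stand-ins, not the real inductor layout:

```python
# Hypothetical stand-in for a wrapper-codegen class that needs to name a
# kernel type defined in a module that (circularly) imports this one.
class WrapperCodegen:
    @classmethod
    def kernel_cls(cls):
        # Deferred import: executed only when the method is called, after
        # both modules have finished loading, so there is no circular
        # import at module-import time. OrderedDict stands in for the
        # kernel class purely so this sketch is runnable.
        from collections import OrderedDict as KernelStandIn
        return KernelStandIn

print(WrapperCodegen.kernel_cls().__name__)  # OrderedDict
```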
```python
@unittest.skipIf(IS_FBCODE, "cpp_extension doesn't work in fbcode right now")
@test_torchinductor.skip_if_cpp_wrapper(
)
```

This decorator only works on test methods, so this test class is currently not discoverable on main.
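The discoverability problem can be reproduced in miniature: a decorator written for test methods returns a wrapper function, so applying it to a class replaces the class with a plain function, and unittest discovery (which looks for `TestCase` subclasses) skips it entirely. The decorator below is a hypothetical stand-in, not the real `skip_if_cpp_wrapper`:

```python
import unittest

# A method-only skip decorator: it wraps a callable and returns a function.
def skip_if_flag(reason):
    def decorator(fn):
        def wrapper(self, *args, **kwargs):
            raise unittest.SkipTest(reason)
        return wrapper
    return decorator

# Applying it to a class hands the *class object* to `decorator`, which
# returns `wrapper` -- so the name MyTests is now bound to a function.
@skip_if_flag("method-only decorator")
class MyTests(unittest.TestCase):
    def test_something(self):
        pass

# MyTests is no longer a class, let alone a TestCase subclass, so test
# discovery will not collect it:
print(isinstance(MyTests, type))  # False
```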
```python
@requires_cuda_and_triton
def test_codegen_with_custom_heuristics_module(self):
```

```python
    Construct @triton.heuristics() based on size_hints.
    """
    configs = [triton_heuristics.Config({"XBLOCK": 32})]
    return triton_heuristics.cached_autotune(
```
Example out of tree backends with their own heuristic modules:

- Ascend NPU: https://github.com/Ascend/pytorch/blob/045a034dbcec287a5997aa13fd129a1cd6b1e215/torch_npu/_inductor/npu_triton_heuristics.py#L4
- Intel XPU: https://github.com/intel/intel-extension-for-pytorch/blob/5dcc9d57e5422cf295e1a1ee97896d6b6a554a85/intel_extension_for_pytorch/_inductor/xpu/triton_ops/autotune.py
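A rough sketch of what such a heuristics module looks like, building a config from `size_hints` as in the test above. `Config` and the heuristic function are simplified stand-ins for `torch._inductor.runtime.triton_heuristics`, not the real API:

```python
# Simplified stand-in for triton_heuristics.Config: it just records the
# kwargs that would be passed to the Triton kernel launch.
class Config:
    def __init__(self, kwargs):
        self.kwargs = kwargs

# A hypothetical backend heuristic: choose XBLOCK from the size hints,
# capped at 32 as in the test's Config({"XBLOCK": 32}).
def pointwise_heuristic(size_hints):
    xblock = min(32, size_hints["x"])
    return [Config({"XBLOCK": xblock})]

configs = pointwise_heuristic({"x": 1024})
print(configs[0].kwargs)  # {'XBLOCK': 32}
```

An out-of-tree backend would return such configs from its own module and hand them to its autotuner subclass.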
@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
@pytorchbot revert -m "failed internal test Diff D87660150, error: ModuleNotFoundError: No module named 'extension_backends'" -c ghfirst
@pytorchbot successfully started a revert job. Check the current status here.
@jansel The failing jobs are also broken on trunk - they are just the dynamic shapes variants of the tests. I had to rebase this PR onto main rather than viable/strict because of merge conflicts with main.
Use classmethod instead of staticmethod
Force-pushed: 289a5a3 to 7fe953c
@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
This refactor replaces direct uses of `TritonKernel` in cases where a subclass type is available, since out of tree / custom backends can:

- supply custom `inductor_meta` via a `TritonKernel` subclass for the autotuner to handle

Example out of tree backends with their own heuristic modules:

- Ascend NPU: https://github.com/Ascend/pytorch/blob/045a034dbcec287a5997aa13fd129a1cd6b1e215/torch_npu/_inductor/npu_triton_heuristics.py#L4
- Intel XPU: https://github.com/intel/intel-extension-for-pytorch/blob/5dcc9d57e5422cf295e1a1ee97896d6b6a554a85/intel_extension_for_pytorch/_inductor/xpu/triton_ops/autotune.py

It also adds a `triton_meta_common` method, analogous to `inductor_meta_common`, that is overridable, so that compile options can be directly provided.

Test plan: Added unit tests to test_triton_extension_backend.py
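A minimal sketch of the overridable common-meta pattern the description refers to: a base kernel class exposes a classmethod that builds shared metadata, and a backend subclass overrides it to inject its own compile options. Class names and metadata keys below are illustrative stand-ins, not the real inductor API:

```python
# Stand-in for the TritonKernel base class with a triton_meta_common hook.
class TritonKernelBase:
    @classmethod
    def triton_meta_common(cls):
        # Metadata shared by every kernel this class generates.
        return {"debug": False}

# A hypothetical backend subclass supplying compile options directly,
# without patching call sites ("grf_mode" is an illustrative key).
class XPUTritonKernel(TritonKernelBase):
    @classmethod
    def triton_meta_common(cls):
        meta = super().triton_meta_common()
        meta["grf_mode"] = "large"
        return meta

print(XPUTritonKernel.triton_meta_common())
# {'debug': False, 'grf_mode': 'large'}
```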
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @jataylo @chenyang78