[Inductor] Delay codegen for fallback arguments and improve typing#154371
benjaminglass1 wants to merge 12 commits into gh/benjaminglass1/85/base from
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/154371
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 Unrelated Failure)
As of commit 1d86eec with merge base 3819584: FLAKY - The following job failed but was likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
|
@pytorchbot merge |
Merge started
Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
|
@pytorchbot revert -c nosignal -m "Appears to have broken main" |
|
@pytorchbot successfully started a revert job. Check the current status here. |
…I C-shim dispatching (#154371)" This reverts commit 6169ca0. Reverted #154371 on behalf of https://github.com/benjaminglass1 due to Appears to have broken main ([comment](#154371 (comment)))
|
@benjaminglass1 your PR has been successfully reverted. |
|
@henrylhtsang When CI finishes running (I fully believe it will pass, but just to be sure), I'm ready for you to test this PR internally. I've addressed the one plausible memory leak I can see by delaying code generation on some arguments that could theoretically get leaked. |
|
@henrylhtsang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
|
umm, do I need to re-import?
|
@henrylhtsang Yes, sorry, I caught a bug locally. Should be gtg now. |
|
@henrylhtsang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
|
Yeah LGTM cc @desertfire |
|
umm it seems to regress the latency by 1%. Any possible ideas? |
|
@henrylhtsang Which latency? Compile-time latency? |
runtime latency |
|
@henrylhtsang I've sent some questions to you offline; I'll put any conclusions we come to in this PR. |
|
LGTM, false alarm |
|
@henrylhtsang Excellent! Once you reimport the (cosmetic) changes I made, we should be GTG! |
|
@henrylhtsang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
|
@pytorchbot merge |
Merge started
Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
Stack from ghstack (oldest at bottom):
Delays code generation for arguments to fallback ops. This is inspired by #155642, and likely fixes similar memory leaks.
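As a rough illustration of what "delaying code generation" for arguments can mean, here is a minimal sketch. It is not Inductor's actual code: `WrapperCode`, `DelayedFallbackCall`, `arg_to_str`, and the kernel name are all hypothetical names invented for this example. The idea shown is only that the node stores the raw arguments and renders them to strings when the call is emitted, instead of rendering (and holding on to) the generated strings at construction time.

```python
from dataclasses import dataclass
from typing import Sequence


class WrapperCode:
    """Hypothetical stand-in for a wrapper code generator (not an Inductor class)."""

    def __init__(self) -> None:
        self.lines: list[str] = []
        self.arg_codegen_calls = 0  # counts how often an argument is rendered

    def arg_to_str(self, arg: object) -> str:
        # Assumption: rendering an argument is the step worth delaying.
        self.arg_codegen_calls += 1
        return repr(arg)


@dataclass
class DelayedFallbackCall:
    """Hypothetical fallback-op node that keeps raw args until emission."""

    kernel: str
    raw_args: Sequence[object]

    def codegen(self, wrapper: WrapperCode) -> None:
        # Arguments are rendered here, at emission time, not in __init__.
        rendered = ", ".join(wrapper.arg_to_str(a) for a in self.raw_args)
        wrapper.lines.append(f"{self.kernel}({rendered})")


wrapper = WrapperCode()
call = DelayedFallbackCall("aten_fallback_op", ["buf0", 1.5, None])
calls_before_emit = wrapper.arg_codegen_calls  # nothing rendered yet
call.codegen(wrapper)
```

In this sketch, constructing the node does no argument rendering at all; the generated text exists only once `codegen` runs. How this maps onto the specific leak discussed in #155642 is not stated in this PR, so treat the connection as an assumption.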
Additionally, prepare for the next PR in the stack by tightening up typing on a cpp_wrapper interface that's only used in one (well-typed) place, as well as downstream effects of that change. In particular, this enabled:

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov