[inductor][addmm] incorporate into new get_mm_configs properly #161534
coconutruben wants to merge 33 commits into gh/coconutruben/55/base
Conversation
# why

- addmm aten running with an expanded version of bias vs the regular bias sometimes causes numerics differences
- to avoid this for now, we make addmm aten use inp vs inp_expanded depending on whether we're in max-autotune or not, matching the previous logic

# what

- expand KernelInputs to also store views of specific nodes, by name
- use that view (inp, the unexpanded version) in the heuristics and adjust it depending on whether we're in max-autotune or not

# testing

```
python3 -bb -m pytest test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_codegen_cpu
```

[ghstack-poisoned]
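The bias-view selection described above can be sketched as a small helper. Note this is a simplified illustration, not the actual Inductor code: `choose_bias_node` and its arguments are hypothetical names standing in for IR nodes.

```python
def choose_bias_node(inp, inp_expanded, is_max_autotune):
    """Pick which bias view a backend should receive.

    Mirrors the logic described above: the ATen addmm kernel gets the
    unexpanded bias (inp) unless max-autotune is enabled, in which case
    the expanded view (inp_expanded) is used, matching the previous
    behavior. (Hypothetical sketch, not the real heuristics API.)
    """
    return inp_expanded if is_max_autotune else inp


# Example with placeholder stand-ins for the two IR nodes:
bias = "inp"                    # unexpanded bias
bias_expanded = "inp_expanded"  # bias broadcast to the output shape
print(choose_bias_node(bias, bias_expanded, is_max_autotune=False))  # → inp
print(choose_bias_node(bias, bias_expanded, is_max_autotune=True))   # → inp_expanded
```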
@coconutruben has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
…erly"

# why

- addmm aten running with an expanded version of bias vs the regular bias sometimes causes numerics differences
- to avoid this for now, we make addmm aten use inp vs inp_expanded depending on whether we're in max-autotune or not, matching the previous logic

# what

- pass the unexpanded bias (inp)
- let the template heuristics expand it where needed (ATen when not in max-autotune, Triton always)

# testing

```
python3 -bb -m pytest test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_codegen_cpu
```

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov

Differential Revision: [D81520581](https://our.internmc.facebook.com/intern/diff/D81520581)

[ghstack-poisoned]
# why

- addmm aten running with an expanded version of bias vs the regular bias sometimes causes numerics differences
- to avoid this for now, we make addmm aten use inp vs inp_expanded depending on whether we're in max-autotune or not, matching the previous logic

# what

- remove the view from inp when not in max-autotune for addmm aten

# testing

```
python3 -bb -m pytest test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_codegen_cpu
```

ghstack-source-id: 4399549
Pull Request resolved: #161534
# why

- addmm aten running with an expanded version of bias vs the regular bias sometimes causes numerics differences
- to avoid this for now, we make addmm aten use inp vs inp_expanded depending on whether we're in max-autotune or not, matching the previous logic

# what

- remove the view from inp when not in max-autotune for addmm aten

# testing

```
python3 -bb -m pytest test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_codegen_cpu
```

ghstack-source-id: 208c906
Pull Request resolved: #161534
Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as stale.
# why

- addmm aten running with an expanded version of bias vs the regular bias sometimes causes numerics differences
- to avoid this for now, we make addmm aten use inp vs inp_expanded depending on whether we're in max-autotune or not, matching the previous logic

# what

- expand KernelInputs to also store views of specific nodes, by name
- use that view (inp, the unexpanded version) in the heuristics and adjust it depending on whether we're in max-autotune or not

# testing

```
python3 -bb -m pytest test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_codegen_cpu
```

ghstack-source-id: 1182f39
Pull Request resolved: pytorch/pytorch#161534
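The "store views of specific nodes, by name" idea can be sketched as follows. This is a hypothetical simplification: the class name `KernelInputs` comes from the PR text, but `add_view`/`get_view` and their signatures are illustrative, not the actual torch/_inductor API.

```python
class KernelInputs:
    """Simplified sketch: holds a kernel's input nodes plus named
    alternate views of specific nodes (e.g. the unexpanded bias stored
    under "inp"), so heuristics can later pick whichever view they
    need. Hypothetical, not the real Inductor class."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self._views = {}

    def add_view(self, name, node):
        # Register an alternate view of an input (e.g. unexpanded bias).
        self._views[name] = node

    def get_view(self, name, default=None):
        # Heuristics fall back to the default node when no view exists.
        return self._views.get(name, default)


# Placeholder strings stand in for IR nodes:
ki = KernelInputs(["inp_expanded", "mat1", "mat2"])
ki.add_view("inp", "inp_unexpanded")
print(ki.get_view("inp"))              # → inp_unexpanded
print(ki.get_view("missing", "mat1"))  # → mat1
```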
Stack from ghstack (oldest at bottom):
# why

- addmm aten running with an expanded version of bias vs the regular bias sometimes causes numerics differences
- to avoid this for now, we make addmm aten use inp vs inp_expanded depending on whether we're in max-autotune or not, matching the previous logic

# what

- expand KernelInputs to also store views of specific nodes, by name
- use that view (inp, the unexpanded version) in the heuristics and adjust it depending on whether we're in max-autotune or not

# testing

```
python3 -bb -m pytest test/inductor/test_aot_inductor.py::AOTInductorTestABICompatibleCpu::test_aoti_debug_printer_codegen_cpu
```
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @jataylo @chenyang78
Differential Revision: D81520581