[xpu][ut] Fix XPU CI failures #176057
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/176057
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (2 Unrelated Failures) As of commit 1ede193 with merge base 5a6d6b3:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot rebase
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.
@pytorchbot rebase -b main
@pytorchbot started a rebase job onto refs/remotes/origin/main. Check the current status here.
Rebase failed due to a command error. Raised by https://github.com/pytorch/pytorch/actions/runs/22558967383
The failure is unrelated to this PR. See the log at https://github.com/pytorch/pytorch/actions/runs/22565777807/job/65406923173?pr=176057#step:14:1414 for the fixed UT.
@pytorchbot drci
```python
# can be resolved when the compiled kernel is unpickled from the
# compile subprocess back into the parent process.
if path_to_ext_heuristics not in sys.path:
    sys.path.append(path_to_ext_heuristics)
```
If we mutate sys.path we need to clean up the change after the test finishes. It also seems like the "custom imports" code below is now redundant.
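A minimal sketch of the cleanup being asked for, assuming a unittest-style test; the test class name, path value, and method body here are hypothetical stand-ins, not the PR's actual code:

```python
import sys
import unittest


class TestCustomHeuristics(unittest.TestCase):
    def test_codegen_with_custom_heuristics_module(self):
        # Hypothetical path to the generated extension heuristics module.
        path_to_ext_heuristics = "/tmp/ext_heuristics"
        if path_to_ext_heuristics not in sys.path:
            sys.path.append(path_to_ext_heuristics)
            # The registered cleanup runs even if the test fails, so the
            # sys.path mutation never leaks into later tests.
            self.addCleanup(sys.path.remove, path_to_ext_heuristics)
        # ... exercise the compile path that unpickles the kernel ...
```

`addCleanup` is one way to satisfy the request; a `try`/`finally` around the test body would work equally well.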
Hi @jansel, all XPU UTs passed; the current failure occurs because … We are currently investigating why this issue started occurring recently, as it did not fail previously. Once we narrow down the root cause, we will file a separate PR to address it.
Hi @jansel, just wanted to check if you have any additional comments or concerns.
Thanks!
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Pull Request resolved: pytorch#176057
Approved by: https://github.com/jansel
Stack from ghstack (oldest at bottom):
Motivation
This PR aims to fix the following CI failures on XPU:
- `test_mm_plus_mm3` seems to only fail on CUDA; already fixed in [Inductor][CUDA][test] Fix test_mm_plus_mm3_dynamic_shapes_gpu_wrapper on CUDA #175569.
- `test_codegen_with_custom_heuristics_module` fails on XPU with `ModuleNotFoundError: No module named 'extension_triton_heuristics'`. On CUDA CI, `is_parallel` is `False`, while on XPU CI it is `True` due to a race condition. We therefore add the extension path to `sys.path` in the parent process so that the `ExtensionCachingAutotuner` class can be resolved whatever `is_parallel` is, making the UT more robust; see the sketch after this list.
- Skip `test_circular_dependencies` because it is flaky on XPU; see DISABLED test_circular_dependencies (__main__.TestImports) #110040.

Relevant code: https://github.com/pytorch/pytorch/blob/a88bb129e9d9e7572bc3a830ad5d148d74a63c48/torch/_inductor/async_compile.py#L385
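To make the unpickling constraint concrete, here is a minimal, self-contained illustration (not the PR's code; the temp directory and the stub class body are invented for the demo). pickle stores a class instance as a reference to the class's module and qualified name, so the process that unpickles it must be able to import that module:

```python
# Minimal illustration (not the PR's code): pickle records a class by its
# module path, so the parent process can only unpickle an object whose
# defining module is importable, i.e. whose directory is on sys.path.
import os
import pickle
import subprocess
import sys
import tempfile

# Stand-in for the extension heuristics module the test generates.
ext_dir = tempfile.mkdtemp()
with open(os.path.join(ext_dir, "extension_triton_heuristics.py"), "w") as f:
    f.write("class ExtensionCachingAutotuner:\n    pass\n")

# A "compile subprocess" pickles an instance of the extension class.
payload = subprocess.run(
    [
        sys.executable,
        "-c",
        "import pickle, sys; sys.path.insert(0, sys.argv[1]); "
        "import extension_triton_heuristics as m; "
        "sys.stdout.buffer.write(pickle.dumps(m.ExtensionCachingAutotuner()))",
        ext_dir,
    ],
    capture_output=True,
    check=True,
).stdout

# Without ext_dir on the parent's sys.path, the next line would raise
# ModuleNotFoundError: No module named 'extension_triton_heuristics'.
sys.path.append(ext_dir)
print(type(pickle.loads(payload)).__name__)  # ExtensionCachingAutotuner
```

This is the same failure mode as the `ModuleNotFoundError` above: with `is_parallel` set to `True`, the kernel is pickled in a compile subprocess, and the parent process can resolve `ExtensionCachingAutotuner` only if the extension's directory is already on its own `sys.path`.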
Additional Context
fix #173473
fix #173344
fix #173916
fix #110040
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @jataylo