Nightlies fail on a simple torch.linalg.solve() call with "undefined symbol: sgebak_" #94751
Closed
Labels: high priority, module: cuda, module: linear algebra, triage review
Description
🐛 Describe the bug
The latest PyTorch nightly fails the TorchBench drq train CUDA test:
https://github.com/pytorch/benchmark/actions/runs/4164233087/jobs/7205604182
PyTorch version:
$ python -c "import torch; print(torch.__version__)"
2.0.0.dev20230213+cu117
Minimal reproduction:
$ python -c "import torch;A = torch.randn(200,200).cuda(); B = torch.randn(200).cuda(); torch.linalg.solve(A, B)"
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: Error in dlopen: /home/ubuntu/miniconda3/envs/torchbench-v2-nightly-ci/lib/python3.8/site-packages/torch/lib/libtorch_cuda_linalg.so: undefined symbol: sgebak_
Can we add a unit test for this?
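As a starting point for such a test, here is a minimal sketch based on the reproduction above. It is not taken from the PyTorch test suite: the class and test names are hypothetical, the diagonal shift is added only to keep the system well conditioned for a float32 residual check, and the tolerances are a guess. The call to torch.linalg.solve on CUDA tensors is what would surface the dlopen error reported here.

```python
import unittest

try:
    import torch
except ImportError:  # keep the sketch importable where PyTorch is absent
    torch = None

# True only when PyTorch is installed and a CUDA device is usable.
HAS_CUDA = torch is not None and torch.cuda.is_available()


@unittest.skipUnless(HAS_CUDA, "requires PyTorch with a working CUDA device")
class TestLinalgSolveCuda(unittest.TestCase):  # hypothetical name
    def test_solve_runs_on_cuda(self):
        torch.manual_seed(0)
        n = 200
        # Assumption: shift the diagonal so the random matrix is well
        # conditioned and the float32 residual check below is stable.
        A = torch.randn(n, n, device="cuda") + n * torch.eye(n, device="cuda")
        B = torch.randn(n, device="cuda")

        # This call forces the CUDA linear algebra library to be loaded,
        # which is the step that failed in the report above.
        X = torch.linalg.solve(A, B)

        self.assertEqual(X.shape, B.shape)
        # Assumed tolerances; loose enough for float32 arithmetic.
        torch.testing.assert_close(A @ X, B, rtol=1e-3, atol=1e-3)
```

Saved to a file, this can be run with python -m unittest; on a machine without CUDA the test is skipped rather than failed.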
Versions
Broken: 2.0.0.dev20230213+cu117
The previous nightly build (2.0.0.dev20230212+cu117) is good.
cc @ezyang @gchanan @zou3519 @ngimel @jianyuh @nikitaved @pearu @mruberry @walterddr @IvanYashchuk @xwang233 @lezcano