Nightlies fail on a simple torch.linalg.solve() call with "undefined symbol: sgebak_" #94751

@xuzhao9

🐛 Describe the bug

The latest PyTorch nightly fails the TorchBench drq train CUDA test:
https://github.com/pytorch/benchmark/actions/runs/4164233087/jobs/7205604182

PyTorch version:

$ python -c "import torch; print(torch.__version__)"
2.0.0.dev20230213+cu117

Minimal reproduction:

$ python -c "import torch;A = torch.randn(200,200).cuda(); B = torch.randn(200).cuda(); torch.linalg.solve(A, B)"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: Error in dlopen: /home/ubuntu/miniconda3/envs/torchbench-v2-nightly-ci/lib/python3.8/site-packages/torch/lib/libtorch_cuda_linalg.so: undefined symbol: sgebak_

Can we add a unit test for this?
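
A rough sketch of what such a test could look like, built from the reproduction above. The class name, test name, seed, and tolerances are placeholders and do not come from the PyTorch test suite. The matrix is shifted toward the identity (an assumption added here, not part of the original repro) so that the residual check stays numerically stable. The test is skipped when torch or a CUDA device is unavailable.

```python
import unittest

try:
    import torch
except ImportError:  # let the file be imported on machines without torch
    torch = None

HAS_CUDA = torch is not None and torch.cuda.is_available()


@unittest.skipUnless(HAS_CUDA, "requires a CUDA build of PyTorch and a GPU")
class TestLinalgSolveCuda(unittest.TestCase):  # hypothetical test class name
    def test_solve_runs_and_is_accurate(self):
        torch.manual_seed(0)
        n = 200
        # Same shapes as the repro; adding n * I keeps A well conditioned
        # (an assumption made for this sketch, not part of the original report).
        A = torch.randn(n, n, device="cuda") + n * torch.eye(n, device="cuda")
        B = torch.randn(n, device="cuda")

        # In the broken nightly this call raised
        # "RuntimeError: Error in dlopen: ... undefined symbol: sgebak_".
        X = torch.linalg.solve(A, B)

        self.assertEqual(X.shape, B.shape)
        # The solution should satisfy A @ X == B up to float32 round-off.
        torch.testing.assert_close(A @ X, B, rtol=1e-3, atol=1e-3)
```

Saved to a file, this can be run with `python -m unittest <file>`. It would only catch the regression on a CI job that installs the nightly binary and has a GPU available.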

Versions

Broken: 2.0.0.dev20230213+cu117

The previous nightly build (2.0.0.dev20230212+cu117) is good.

cc @ezyang @gchanan @zou3519 @ngimel @jianyuh @nikitaved @pearu @mruberry @walterddr @IvanYashchuk @xwang233 @lezcano

Labels

high priority, module: cuda, module: linear algebra, triage review
