Build broken on ubuntu due to MKL symbols not found #50211

@antocuni

🐛 Bug

On my system, PyTorch builds fine, but calling an MKL-backed function fails at runtime:

$ python -c 'import torch; torch.tensor([[1.0, 2], [3, 4]]).svd()'
INTEL MKL ERROR: /home/antocuni/.conda/envs/pytorch-cuda-dev/lib/libmkl_def.so: undefined symbol: mkl_sparse_optimize_bsr_trsm_i8.
Intel MKL FATAL ERROR: Cannot load libmkl_def.so.

To Reproduce

Steps to reproduce the behavior:

  1. python setup.py develop
  2. python -c 'import torch; torch.tensor([[1.0, 2], [3, 4]]).svd()'

The undefined symbol varies depending on which MKL operation is used.

Expected behavior

Not to crash :)

Environment

I think that the relevant env variable is LDFLAGS:

$ echo $LDFLAGS
-Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,-rpath,/home/antocuni/.conda/envs/pytorch-cuda-dev/lib -Wl,-rpath-link,/home/antocuni/.conda/envs/pytorch-cuda-dev/lib -L/home/antocuni/.conda/envs/pytorch-cuda-dev/lib -Wl,-rpath-link,/usr/local/cuda-11.0.3/lib64 -L/usr/local/cuda-11.0.3/lib64

Note that it contains --as-needed.

I bisected the commits, and the culprit seems to be 12ee7b6, added by #50080. The problem is that if LDFLAGS contains --as-needed, it overrides the --no-as-needed that is added by the if.
An easy workaround is to manually append --no-as-needed to the end of my LDFLAGS, but since the if clearly tries to handle this case, I would argue that the PR is wrong.
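
For anyone hitting the same error, the workaround above could be sketched roughly as follows. This is only a sketch: the helper name append_no_as_needed is hypothetical (it is not part of PyTorch), and the -Wl, prefix is an assumption based on how the other linker flags in my LDFLAGS are written.

```shell
# Hypothetical helper, not part of PyTorch: append -Wl,--no-as-needed to
# LDFLAGS when it contains --as-needed and does not already end with the
# override. The -Wl, prefix is an assumption that mirrors the existing flags.
append_no_as_needed() {
    case "${LDFLAGS:-}" in
        *"-Wl,--no-as-needed") ;;  # already ends with the override: nothing to do
        *"--as-needed"*) export LDFLAGS="${LDFLAGS} -Wl,--no-as-needed" ;;
    esac
}

append_no_as_needed
echo "LDFLAGS=${LDFLAGS:-}"
```

It has to run in the same shell before python setup.py develop, and an existing build directory may need to be cleaned so that the new flags are actually picked up.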

cc @malfet @seemethere @walterddr

Labels: module: build, triaged
