GNU and LLVM OpenMP conflict when using MKL and KINETO under conda build #51026

@pearu

Description

🐛 Bug

When building PyTorch with MKL and KINETO enabled in a conda environment, importing torch fails with:

$ python -c 'import torch'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/pearu/git/pytorch/pytorch/torch/__init__.py", line 192, in <module>
    from torch._C import *
ImportError: /home/pearu/miniconda3/envs/pytorch-cuda-dev/lib/libgomp.so.1: version `OACC_2.0' not found (required by /home/pearu/git/pytorch/pytorch/torch/lib/libtorch_cpu.so)

To Reproduce

Steps to reproduce the behavior:

  1. Set up a conda environment using conda env create --file=pytorch-cuda-dev.yaml, where pytorch-cuda-dev.yaml is:
Details
name: pytorch-cuda-dev
channels:
  - conda-forge
  - pytorch
  - defaults
dependencies:
  - python=3
  - numpy
  - ninja
  - pyyaml
  - mkl
  - mkl-include
  - setuptools
  - cmake
  - cffi
  - typing
  - pytest
  - compilers
  - flake8
  - psutil
  - hypothesis
  - nvcc_linux-64=11.0
  - magma-cuda110
  - mypy
  - clang-tools
  2. Build PyTorch using python setup.py develop
  3. Try import torch

Expected behavior

import torch should succeed.

Environment

Output of collect_env.py:

Details
Collecting environment information...
PyTorch version: N/A
Is debug build: N/A
CUDA used to build PyTorch: N/A (pearu: it is 11.0.3)
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.4 LTS (x86_64)
GCC version: (crosstool-NG 1.24.0.133_b0863d8_dirty) 9.3.0
Clang version: 11.0.1 (https://github.com/conda-forge/clangdev-feedstock 9fbb64b62ff49e9d206a06e62453b27557b3ed73)
CMake version: version 3.19.3

Python version: 3.9 (64-bit runtime)
Is CUDA available: N/A
CUDA runtime version: Could not collect
GPU models and configuration: 
GPU 0: GeForce RTX 2060 SUPER
GPU 1: GeForce RTX 2060 SUPER

Nvidia driver version: 460.27.04
cuDNN version: Probably one of the following:
/usr/local/cuda-10.1.243/targets/x86_64-linux/lib/libcudnn.so.7
/usr/local/cuda-10.2.89/targets/x86_64-linux/lib/libcudnn.so.7
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.19.5
[pip3] torch==1.8.0a0+unknown
[conda] magma-cuda110             2.5.2                         1    pytorch
[conda] mkl                       2020.4             h726a3e6_304    conda-forge
[conda] mkl-include               2020.4             h726a3e6_304    conda-forge
[conda] numpy                     1.19.5           py39hdbf815f_1    conda-forge
[conda] torch                     1.8.0a0+unknown           dev_0    <develop>

The details of the conda environment are described in conda-forge/ctng-compilers-feedstock#49.

Additional context

Note that the MKL conda package requires the Intel/LLVM OpenMP runtime, while the PyTorch KINETO component links against ${CUDA_SOURCE_DIR}/extras/CUPTI/lib64/libcupti_static.a, see

set(CUDA_cupti_LIBRARY "${CUDA_SOURCE_DIR}/extras/CUPTI/lib64/libcupti_static.a")

The static CUPTI library references acc_get_device_type@@OACC_2.0, which is defined only in GNU OpenMP.
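A quick runtime probe (a sketch; which libraries are installed varies by machine, and ctypes cannot see the OACC_2.0 version tag itself, only whether the symbol exists) shows which OpenMP runtime exports this OpenACC entry point — GNU libgomp does, LLVM libomp/libiomp5 does not:

```python
import ctypes
import ctypes.util

def exports_symbol(libname, symbol):
    """Return True/False depending on whether `libname` exports `symbol`,
    or None when the library cannot be located at all."""
    path = ctypes.util.find_library(libname)
    if path is None:
        return None
    lib = ctypes.CDLL(path)
    # ctypes resolves attributes via dlsym; a missing symbol raises AttributeError
    return hasattr(lib, symbol)

# GNU OpenMP defines the OpenACC entry point; LLVM OpenMP does not.
for name in ("gomp", "omp", "iomp5"):
    print(name, exports_symbol(name, "acc_get_device_type"))
```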

Workaround

As a workaround, one can build an importable torch by disabling KINETO (USE_KINETO=0). Building without MKL also produces an importable torch.

Possible solution

To keep KINETO enabled, one could set USE_CUPTI_SO=1. However, when the static CUPTI library exists, USE_CUPTI_SO is ignored, see

if(EXISTS ${CUDA_SOURCE_DIR}/extras/CUPTI/lib64/libcupti_static.a)
  set(CUDA_cupti_LIBRARY "${CUDA_SOURCE_DIR}/extras/CUPTI/lib64/libcupti_static.a")
elseif(EXISTS ${CUDA_SOURCE_DIR}/lib64/libcupti_static.a)
  set(CUDA_cupti_LIBRARY "${CUDA_SOURCE_DIR}/lib64/libcupti_static.a")
elseif(USE_CUPTI_SO)
  if(EXISTS ${CUDA_SOURCE_DIR}/extras/CUPTI/lib64/libcupti.so)
    set(CUDA_cupti_LIBRARY "${CUDA_SOURCE_DIR}/extras/CUPTI/lib64/libcupti.so")
  elseif(EXISTS ${CUDA_SOURCE_DIR}/lib64/libcupti.so)
    set(CUDA_cupti_LIBRARY "${CUDA_SOURCE_DIR}/lib64/libcupti.so")
  endif()
endif()

With

export USE_CUPTI_SO=1
export LDFLAGS="${LDFLAGS} -Wl,-rpath-link,${CUDA_HOME}/extras/CUPTI/lib64 -L${CUDA_HOME}/extras/CUPTI/lib64"
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${CUDA_HOME}/extras/CUPTI/lib64

and applying the following patch:

Details
diff --git a/cmake/Dependencies.cmake b/cmake/Dependencies.cmake
index e138a86c61..d9ab896bfa 100644
--- a/cmake/Dependencies.cmake
+++ b/cmake/Dependencies.cmake
@@ -1812,16 +1812,16 @@ if(USE_KINETO)
   message(STATUS "  KINETO_LIBRARY_TYPE = ${KINETO_LIBRARY_TYPE}")
   message(STATUS "  CUDA_SOURCE_DIR = ${CUDA_SOURCE_DIR}")
 
-  if(EXISTS ${CUDA_SOURCE_DIR}/extras/CUPTI/lib64/libcupti_static.a)
-    set(CUDA_cupti_LIBRARY "${CUDA_SOURCE_DIR}/extras/CUPTI/lib64/libcupti_static.a")
-  elseif(EXISTS ${CUDA_SOURCE_DIR}/lib64/libcupti_static.a)
-    set(CUDA_cupti_LIBRARY "${CUDA_SOURCE_DIR}/lib64/libcupti_static.a")
-  elseif(USE_CUPTI_SO)
+  if(USE_CUPTI_SO)
     if(EXISTS ${CUDA_SOURCE_DIR}/extras/CUPTI/lib64/libcupti.so)
       set(CUDA_cupti_LIBRARY "${CUDA_SOURCE_DIR}/extras/CUPTI/lib64/libcupti.so")
     elseif(EXISTS ${CUDA_SOURCE_DIR}/lib64/libcupti.so)
       set(CUDA_cupti_LIBRARY "${CUDA_SOURCE_DIR}/lib64/libcupti.so")
     endif()
+  elseif(EXISTS ${CUDA_SOURCE_DIR}/extras/CUPTI/lib64/libcupti_static.a)
+    set(CUDA_cupti_LIBRARY "${CUDA_SOURCE_DIR}/extras/CUPTI/lib64/libcupti_static.a")
+  elseif(EXISTS ${CUDA_SOURCE_DIR}/lib64/libcupti_static.a)
+    set(CUDA_cupti_LIBRARY "${CUDA_SOURCE_DIR}/lib64/libcupti_static.a")
   endif()
 
   if(EXISTS ${CUDA_SOURCE_DIR}/extras/CUPTI/include)

With the above, I was able to build an importable PyTorch with both the MKL and KINETO components enabled.

cc @ezyang @gchanan @zou3519 @bdhirsh @jbschlosser @malfet @seemethere @walterddr

Labels

high priority, module: build (build system issues), module: mkl (related to our MKL support), module: openmp (related to OpenMP support in PyTorch), open source, triaged
