Closed
Description
Bug summary
The build system fails to find libamdhip64.so even though ROCM_PATH and ROCM_ROOT are set. It looks for /opt/rocm/lib/libamdhip64.so, whereas ROCM_ROOT is /opt/rocm-6.0.0. I cannot create a symlink at /opt/rocm.
Additionally, it looks like the PyTorch installed by pip ships its own libamdhip64.so. I am not sure whether that copy should be preferred.
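For reference, here is a minimal sketch for listing which copies of libamdhip64.so are actually present. It checks `<root>/lib` under ROCM_PATH, ROCM_ROOT, /opt/rocm, and the installed torch package directory. The helper names are hypothetical, and the assumption that the pip wheel keeps its copy in `torch/lib` is mine, not something the build log confirms.

```python
import importlib.util
import os
from pathlib import Path


def find_hip_runtime(roots, name="libamdhip64.so"):
    """Return existing <root>/lib/<name> paths, skipping empty and duplicate roots."""
    found = []
    for root in dict.fromkeys(r for r in roots if r):  # order-preserving dedupe
        candidate = Path(root) / "lib" / name
        if candidate.exists():
            found.append(candidate)
    return found


def candidate_roots():
    """Collect likely ROCm roots from the environment and the torch install."""
    roots = [os.environ.get("ROCM_PATH"), os.environ.get("ROCM_ROOT"), "/opt/rocm"]
    # Locate torch without importing it; assumption: the wheel bundles
    # its HIP runtime under torch/lib.
    spec = importlib.util.find_spec("torch")
    if spec is not None and spec.origin:
        roots.append(str(Path(spec.origin).parent))
    return roots


if __name__ == "__main__":
    for path in find_hip_runtime(candidate_roots()):
        print(path)
```

On the system described here, I would expect this to list the copy under /opt/rocm-6.0.0 and, if the assumption holds, the one bundled with PyTorch, but nothing under /opt/rocm.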
DeePMD-kit Version
tag v3.0.0b1
Backend and its version
PyTorch 2.4.0+rocm6.0
How did you download the software?
Built from source
Input Files, Running Commands, Error Log, etc.
pip install .
Processing /autofs/home1/akashi/sources/deepmd-kit
Installing build dependencies ... done
Getting requirements to build wheel ... done
Installing backend dependencies ... done
Preparing metadata (pyproject.toml) ... done
Collecting numpy (from deepmd-kit==3.0.0b1)
Using cached numpy-2.0.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (60 kB)
Collecting scipy (from deepmd-kit==3.0.0b1)
Using cached scipy-1.14.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (60 kB)
Collecting pyyaml (from deepmd-kit==3.0.0b1)
Using cached PyYAML-6.0.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.1 kB)
Collecting dargs>=0.4.7 (from deepmd-kit==3.0.0b1)
Using cached dargs-0.4.8-py3-none-any.whl.metadata (11 kB)
Collecting h5py (from deepmd-kit==3.0.0b1)
Using cached h5py-3.11.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (2.5 kB)
Collecting wcmatch (from deepmd-kit==3.0.0b1)
Using cached wcmatch-8.5.2-py3-none-any.whl.metadata (4.8 kB)
Collecting packaging (from deepmd-kit==3.0.0b1)
Using cached packaging-24.1-py3-none-any.whl.metadata (3.2 kB)
Collecting ml_dtypes (from deepmd-kit==3.0.0b1)
Using cached ml_dtypes-0.4.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (20 kB)
Collecting mendeleev (from deepmd-kit==3.0.0b1)
Using cached mendeleev-0.17.0-py3-none-any.whl.metadata (20 kB)
Collecting array-api-compat (from deepmd-kit==3.0.0b1)
Using cached array_api_compat-1.7.1-py3-none-any.whl.metadata (1.5 kB)
Collecting typeguard>=4 (from dargs>=0.4.7->deepmd-kit==3.0.0b1)
Using cached typeguard-4.3.0-py3-none-any.whl.metadata (3.7 kB)
Collecting Pygments<3.0.0,>=2.11.2 (from mendeleev->deepmd-kit==3.0.0b1)
Using cached pygments-2.18.0-py3-none-any.whl.metadata (2.5 kB)
Collecting SQLAlchemy>=1.4.0 (from mendeleev->deepmd-kit==3.0.0b1)
Using cached SQLAlchemy-2.0.31-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (9.6 kB)
Collecting colorama<0.5.0,>=0.4.6 (from mendeleev->deepmd-kit==3.0.0b1)
Using cached colorama-0.4.6-py2.py3-none-any.whl.metadata (17 kB)
Collecting numpy (from deepmd-kit==3.0.0b1)
Using cached numpy-1.26.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)
Collecting pandas<3.0,>=2.1 (from mendeleev->deepmd-kit==3.0.0b1)
Using cached pandas-2.2.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (19 kB)
Collecting pyfiglet<0.9,>=0.8.post1 (from mendeleev->deepmd-kit==3.0.0b1)
Using cached pyfiglet-0.8.post1-py2.py3-none-any.whl.metadata (1.3 kB)
Collecting bracex>=2.1.1 (from wcmatch->deepmd-kit==3.0.0b1)
Using cached bracex-2.4-py3-none-any.whl.metadata (3.6 kB)
Collecting python-dateutil>=2.8.2 (from pandas<3.0,>=2.1->mendeleev->deepmd-kit==3.0.0b1)
Using cached python_dateutil-2.9.0.post0-py2.py3-none-any.whl.metadata (8.4 kB)
Collecting pytz>=2020.1 (from pandas<3.0,>=2.1->mendeleev->deepmd-kit==3.0.0b1)
Using cached pytz-2024.1-py2.py3-none-any.whl.metadata (22 kB)
Collecting tzdata>=2022.7 (from pandas<3.0,>=2.1->mendeleev->deepmd-kit==3.0.0b1)
Using cached tzdata-2024.1-py2.py3-none-any.whl.metadata (1.4 kB)
Requirement already satisfied: typing-extensions>=4.6.0 in /lustre/world-share/stf218/akashi/miniconda3/envs/uq4mat_pt/lib/python3.12/site-packages (from SQLAlchemy>=1.4.0->mendeleev->deepmd-kit==3.0.0b1) (4.9.0)
Collecting greenlet!=0.4.17 (from SQLAlchemy>=1.4.0->mendeleev->deepmd-kit==3.0.0b1)
Using cached greenlet-3.0.3-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.metadata (3.8 kB)
Collecting typing-extensions>=4.6.0 (from SQLAlchemy>=1.4.0->mendeleev->deepmd-kit==3.0.0b1)
Using cached typing_extensions-4.12.2-py3-none-any.whl.metadata (3.0 kB)
Collecting six>=1.5 (from python-dateutil>=2.8.2->pandas<3.0,>=2.1->mendeleev->deepmd-kit==3.0.0b1)
Using cached six-1.16.0-py2.py3-none-any.whl.metadata (1.8 kB)
Using cached dargs-0.4.8-py3-none-any.whl (26 kB)
Using cached array_api_compat-1.7.1-py3-none-any.whl (37 kB)
Using cached h5py-3.11.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.4 MB)
Using cached mendeleev-0.17.0-py3-none-any.whl (367 kB)
Using cached numpy-1.26.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.0 MB)
Using cached ml_dtypes-0.4.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.2 MB)
Using cached packaging-24.1-py3-none-any.whl (53 kB)
Using cached PyYAML-6.0.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (724 kB)
Using cached scipy-1.14.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (40.8 MB)
Using cached wcmatch-8.5.2-py3-none-any.whl (39 kB)
Using cached bracex-2.4-py3-none-any.whl (11 kB)
Using cached colorama-0.4.6-py2.py3-none-any.whl (25 kB)
Using cached pandas-2.2.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.7 MB)
Using cached pyfiglet-0.8.post1-py2.py3-none-any.whl (865 kB)
Using cached pygments-2.18.0-py3-none-any.whl (1.2 MB)
Using cached SQLAlchemy-2.0.31-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.2 MB)
Using cached typeguard-4.3.0-py3-none-any.whl (35 kB)
Using cached greenlet-3.0.3-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (625 kB)
Using cached python_dateutil-2.9.0.post0-py2.py3-none-any.whl (229 kB)
Using cached pytz-2024.1-py2.py3-none-any.whl (505 kB)
Using cached typing_extensions-4.12.2-py3-none-any.whl (37 kB)
Using cached tzdata-2024.1-py2.py3-none-any.whl (345 kB)
Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Building wheels for collected packages: deepmd-kit
Building wheel for deepmd-kit (pyproject.toml) ... error
error: subprocess-exited-with-error
× Building wheel for deepmd-kit (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [74 lines of output]
*** scikit-build-core 0.8.2 using CMake 3.30.1 (wheel)
*** Configuring CMake...
2024-07-23 12:36:44,357 - scikit_build_core - WARNING - libdir/ldlibrary: /lustre/world-share/stf218/akashi/miniconda3/envs/uq4mat_pt/lib/libpython3.12.a is not a real file!
2024-07-23 12:36:44,357 - scikit_build_core - WARNING - Can't find a Python library, got libdir=/lustre/world-share/stf218/akashi/miniconda3/envs/uq4mat_pt/lib, ldlibrary=libpython3.12.a, multiarch=x86_64-linux-gnu, masd=None
loading initial cache file build/py37-none-manylinux_2_31_x86_64/CMakeInit.txt
-- Cray Programming Environment 2.7.31 C
-- Cray Programming Environment 2.7.31 CXX
-- Supported model version: 1.1
-- Will not build nv GPU support
-- The HIP compiler identification is Clang 17.0.0
-- Found ROCM in /opt/rocm-6.0.0, build AMD GPU support
/opt/rocm-6.0.0/bin/rocm_agent_enumerator:95: SyntaxWarning: invalid escape sequence '\w'
@staticVars(search_name=re.compile("gfx[0-9a-fA-F]+(:[-+:\w]+)?"))
/opt/rocm-6.0.0/bin/rocm_agent_enumerator:152: SyntaxWarning: invalid escape sequence '\A'
line_search_term = re.compile("\A\s+Name:\s+(amdgcn-amd-amdhsa--gfx\d+)")
/opt/rocm-6.0.0/bin/rocm_agent_enumerator:154: SyntaxWarning: invalid escape sequence '\A'
line_search_term = re.compile("\A\s+Name:\s+(gfx\d+)")
/opt/rocm-6.0.0/bin/rocm_agent_enumerator:175: SyntaxWarning: invalid escape sequence '\w'
target_search_term = re.compile("1002:\w+")
Building PyTorch for GPU arch: gfx90a
HIP VERSION: 6.0.32830-d62f6a171
-- Caffe2: Header version is: 6.0.0
***** ROCm version from rocm_version.h ****
ROCM_VERSION_DEV: 6.0.0
ROCM_VERSION_DEV_MAJOR: 6
ROCM_VERSION_DEV_MINOR: 0
ROCM_VERSION_DEV_PATCH: 0
ROCM_VERSION_DEV_INT: 60000
HIP_VERSION_MAJOR: 6
HIP_VERSION_MINOR: 0
TORCH_HIP_VERSION: 600
***** Library versions from dpkg *****
***** Library versions from cmake find_package *****
hip VERSION: 6.0.23494
hsa-runtime64 VERSION: 1.12.60000
amd_comgr VERSION: 2.6.0
rocrand VERSION: 2.10.17
hiprand VERSION: 2.10.16
rocblas VERSION: 4.0.0
hipblas VERSION: 2.0.0
hipblaslt VERSION: 0.6.0
miopen VERSION: 3.00.0
hipfft VERSION: 1.0.12
hipsparse VERSION: 3.0.0
rccl VERSION: 2.18.3
rocprim VERSION: 3.0.0
hipcub VERSION: 3.0.0
rocthrust VERSION: 3.0.0
hipsolver VERSION: 2.0.0
HIP is using new type enums
CMake Warning at /lustre/world-share/stf218/akashi/miniconda3/envs/uq4mat_pt/lib/python3.12/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
static library kineto_LIBRARY-NOTFOUND not found.
Call Stack (most recent call first):
/lustre/world-share/stf218/akashi/miniconda3/envs/uq4mat_pt/lib/python3.12/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:120 (append_torchlib_if_found)
CMakeLists.txt:189 (find_package)
-- PyTorch CXX11 ABI: 0
-- Enabled backends:
-- - PyTorch
-- HIP major version is 6
-- Configuring done (3.8s)
-- Generating done (0.1s)
-- Build files have been written to: /home/akashi/sources/deepmd-kit/build/py37-none-manylinux_2_31_x86_64
*** Building project with Ninja...
ninja: error: '/opt/rocm/lib/libamdhip64.so', needed by 'op/pt/libdeepmd_op_pt.so', missing and no known rule to make it
*** CMake build failed
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for deepmd-kit
Failed to build deepmd-kit
ERROR: Could not build wheels for deepmd-kit, which is required to install pyproject.toml-based projects
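The ninja error above names /opt/rocm/lib/libamdhip64.so even though CMake reported ROCm in /opt/rocm-6.0.0. As a rough diagnostic sketch, the generated build files can be scanned for the unversioned prefix to see which rules carry it. The function name is hypothetical, and the choice of files to scan (build.ninja and CMakeCache.txt) is an assumption about where the path would show up.

```python
import sys
from pathlib import Path


def find_hardcoded_refs(build_dir, needle="/opt/rocm/",
                        names=("build.ninja", "CMakeCache.txt")):
    """Return (file, line number, line) for every line containing `needle`.

    The trailing slash in the default needle keeps versioned prefixes such as
    /opt/rocm-6.0.0/ from matching.
    """
    hits = []
    for name in names:
        for path in sorted(Path(build_dir).rglob(name)):
            if not path.is_file():
                continue
            text = path.read_text(errors="replace")
            for lineno, line in enumerate(text.splitlines(), start=1):
                if needle in line:
                    hits.append((path, lineno, line.strip()))
    return hits


if __name__ == "__main__":
    # Default to the build directory used in the log above.
    build_dir = sys.argv[1] if len(sys.argv) > 1 else "build"
    for path, lineno, line in find_hardcoded_refs(build_dir):
        print(f"{path}:{lineno}: {line[:120]}")
```

Run from the source tree after a failed build. Any hit in build.ninja points at the link rule that still uses the unversioned path.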
Steps to Reproduce
With ROCm installed anywhere other than /opt/rocm, build DeePMD-kit from source with the PyTorch backend on an AMD system, as described here: https://docs.deepmodeling.com/projects/deepmd/en/v3.0.0b1/install/install-from-source.html
Further Information, Files, and Links
No response