
Commit f4448ca

Shaoting-Feng and claude committed
[ROCm] Make bare-host ROCm install self-sufficient
Moves GPU-vendor-specific runtime deps out of common.txt into requirements/cuda_core.txt and requirements/rocm_core.txt. setup.py reads common.txt plus whichever core file matches BUILD_WITH_HIP, so `pip install -e .` Just Works on both CUDA and ROCm hosts.

- Drop cupy-cuda12x and nixl from common.txt (both are CUDA-only on PyPI; the nixl meta-package unconditionally pulls nixl-cu12, which installs nixl_ep/ and breaks the ROCm runtime).
- cuda.txt now includes -r cuda_core.txt, so the Dockerfile's `pip install -r cuda.txt` still pulls the same set.
- Remove the [tool.setuptools.dynamic] dependencies block from pyproject.toml; install_requires is driven by setup.py now.
- Add a second "Without vLLM docker base image" subsection to the ROCm install docs, mirroring the CUDA from-source flow line-for-line (uv venv -> -r build.txt -> torch from ROCm wheel index -> build). The existing rocm/vllm-dev flow stays as-is.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu>
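In practice the split means the same editable install works on both vendors; a minimal sketch of the two paths (dependency selection only; the full ROCm build also sets CXX=hipcc and friends, see the docs diff below):

    # CUDA host: BUILD_WITH_HIP unset, so setup.py appends
    # requirements/cuda_core.txt (cupy-cuda12x, nixl) to common.txt
    pip install -e .

    # ROCm host: BUILD_WITH_HIP=1 selects requirements/rocm_core.txt
    # (cupy-rocm-7-0) instead, so no CUDA-only wheels are pulled in
    BUILD_WITH_HIP=1 pip install -e .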
1 parent 69787b8 commit f4448ca

7 files changed: 74 additions & 11 deletions


docs/source/getting_started/installation.rst (40 additions & 4 deletions)
@@ -203,10 +203,13 @@ You can get the nightly build of latest code of LMcache and vLLM as follows:
 
 
 LMCache on ROCm
------------------
+---------------
+
+With vLLM docker base image
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 Get started through using vLLM docker image as base image
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 The `AMD Infinity hub <https://hub.docker.com/r/rocm/vllm-dev>`__ for vLLM offers a prebuilt, optimized docker image designed for validating inference performance on the AMD Instinct™ MI300X accelerator.
 The image is based on the latest vLLM v1. Please check `LLM inference performance validation on AMD Instinct MI300X <https://rocm.docs.amd.com/en/latest/how-to/rocm-for-ai/inference/benchmark-docker/vllm.html?model=pyt_vllm_llama-3.1-8b>`__ for instructions on how to use this prebuilt docker image.
@@ -235,7 +238,7 @@ As of the date of writing, the steps are validated on the following environment:
    bash
 
 Install Latest LMCache from Source for ROCm
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 To install from source, clone the repository and install in editable mode.
 
@@ -255,4 +258,37 @@ Example on MI300X (gfx942):
    TORCH_DONT_CHECK_COMPILER_ABI=1 \
    CXX=hipcc \
    BUILD_WITH_HIP=1 \
-   python3 -m pip install --no-build-isolation -e .
+   python3 -m pip install --no-build-isolation -e .
+
+
+On a bare ROCm host
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Install Latest LMCache from Source for ROCm
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To install from source on a bare ROCm host (no ``rocm/vllm-dev`` base image),
+torch must be installed from the ROCm wheel index before building LMCache.
+This mirrors the CUDA from-source flow above, with the ROCm wheel index and
+HIP build flags in place of their CUDA equivalents.
+
+.. code-block:: bash
+
+   git clone https://github.com/LMCache/LMCache.git
+   cd LMCache
+
+   uv venv --python 3.12
+   source .venv/bin/activate
+
+   # Need to install these packages manually to avoid build isolation
+   uv pip install -r requirements/build.txt
+
+   # Install torch from the ROCm wheel index
+   uv pip install torch torchvision --index-url https://download.pytorch.org/whl/rocm7.0
+
+   # Build LMCache. BUILD_WITH_HIP=1 makes setup.py pick cupy-rocm-7-0 automatically.
+   PYTORCH_ROCM_ARCH="gfx942" \
+   TORCH_DONT_CHECK_COMPILER_ABI=1 \
+   CXX=hipcc \
+   BUILD_WITH_HIP=1 \
+   uv pip install -e . --no-build-isolation
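Not part of the diff, but a quick sanity check after the bare-host build might look like this (illustrative; `torch.version.hip` is set only by ROCm builds of torch):

    # Confirm the ROCm torch wheel and the editable LMCache install
    python -c "import torch; print(torch.version.hip)"
    python -c "import lmcache; print(lmcache.__file__)"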

pyproject.toml (0 additions & 3 deletions)
@@ -57,9 +57,6 @@ version_file = "lmcache/_version.py"
 # do not include +gREV local version, required for Test PyPI upload
 local_scheme = "no-local-version"
 
-[tool.setuptools.dynamic]
-dependencies = { file = ["requirements/common.txt"] }
-
 [tool.setuptools.packages.find]
 where = [""]
 include = ["lmcache", "lmcache*"]

requirements/common.txt (0 additions & 4 deletions)
@@ -7,8 +7,6 @@ cufile-python
 fastapi
 httpx
 msgspec
-# if nixl decides to support >=3.13 in the future, we can remove this constraint
-nixl; python_version < "3.13"
 # nixl uses numba which requires numpy<=2.2.6
 numpy<=2.2.6
 numba
@@ -42,5 +40,3 @@ torch
 transformers >= 4.51.1
 uvicorn
 httptools
-# Right now we are using cuda 12.x to align with serving engines
-cupy-cuda12x

requirements/cuda.txt (2 additions & 0 deletions)
@@ -1,5 +1,7 @@
 # Common project dependencies
 -r common.txt
+# Vendor-specific runtime deps (cupy, nixl) baked into install_requires
+-r cuda_core.txt
 
 # Dependencies for NVIDIA GPUs
 ray >= 2.9
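The -r chain can be checked without installing anything, confirming that the Docker path still resolves the vendor deps; for example (illustrative; `--dry-run` needs pip >= 22.2):

    # cuda.txt -> common.txt + cuda_core.txt + ray and the other CUDA extras
    pip install --dry-run -r requirements/cuda.txt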

requirements/cuda_core.txt (9 additions & 0 deletions)
@@ -0,0 +1,9 @@
+# Vendor-specific runtime deps baked into install_requires by setup.py
+# when building for CUDA (i.e. BUILD_WITH_HIP is unset).
+# Kept separate from cuda.txt so `pip install -e .` stays lightweight
+# (no ray/xformers/torchvision) while Docker's `pip install -r cuda.txt`
+# still pulls these through the -r chain.
+
+cupy-cuda12x
+# nixl on PyPI is a meta-package that pulls nixl-cu12 (CUDA-only).
+nixl; python_version < "3.13"
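The trailing environment marker is evaluated by pip at resolve time: on Python 3.12 the nixl requirement installs normally, while on 3.13+ pip skips it and reports that the markers don't match the environment. The same behavior can be reproduced standalone (illustrative):

    pip install 'nixl; python_version < "3.13"'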

requirements/rocm_core.txt (9 additions & 0 deletions)
@@ -0,0 +1,9 @@
+# Vendor-specific runtime deps baked into install_requires by setup.py
+# when building for ROCm (BUILD_WITH_HIP=1).
+# cupy-rocm-7-0 is the AMD analogue of cupy-cuda12x.
+# torch/torchvision are NOT listed here because ROCm wheels live on a
+# non-PyPI index (https://download.pytorch.org/whl/rocm7.0) that pip
+# cannot be told about via install_requires; users install them manually
+# per the "LMCache on ROCm / Without vLLM docker base image" docs.
+
+cupy-rocm-7-0

setup.py (14 additions & 0 deletions)
@@ -21,6 +21,15 @@
 ENABLE_CXX11_ABI = os.environ.get("ENABLE_CXX11_ABI", "1") == "1"
 
 
+def _read_requirements(path: Path) -> list[str]:
+    reqs: list[str] = []
+    for raw in path.read_text().splitlines():
+        line = raw.strip()
+        if line and not line.startswith("#"):
+            reqs.append(line)
+    return reqs
+
+
 def hipify_wrapper() -> None:
     # Third Party
     from torch.utils.hipify.hipify_python import hipify
@@ -299,11 +308,16 @@ def source_dist_extension() -> tuple[list, dict]:
 
 ext_modules, cmdclass = get_extension()
 
+install_requires = _read_requirements(ROOT_DIR / "requirements" / "common.txt")
+core_file = "rocm_core.txt" if BUILD_WITH_HIP else "cuda_core.txt"
+install_requires += _read_requirements(ROOT_DIR / "requirements" / core_file)
+
 setup(
     packages=find_packages(
         exclude=("csrc",)
     ),  # Ensure csrc is excluded if it only contains sources
     ext_modules=ext_modules,
     cmdclass=cmdclass,
     include_package_data=True,
+    install_requires=install_requires,
 )
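For intuition, the new `_read_requirements` helper keeps exactly the non-blank, non-comment lines of a requirements file; a rough shell equivalent (illustrative only, ignoring the per-line strip):

    # Roughly what setup.py reads from the ROCm core file
    grep -vE '^[[:space:]]*(#|$)' requirements/rocm_core.txt
    # -> cupy-rocm-7-0

On a CUDA host the same filter over cuda_core.txt yields cupy-cuda12x plus the nixl marker line.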
