Skip to content

[ROCm] Make bare-host ROCm install self-sufficient#3070

Merged
ApostaC merged 1 commit intoLMCache:devfrom
Shaoting-Feng:rocm-install-self-sufficient
Apr 18, 2026
Merged

[ROCm] Make bare-host ROCm install self-sufficient#3070
ApostaC merged 1 commit intoLMCache:devfrom
Shaoting-Feng:rocm-install-self-sufficient

Conversation

@Shaoting-Feng
Copy link
Copy Markdown
Contributor

@Shaoting-Feng Shaoting-Feng commented Apr 17, 2026

Moves GPU-vendor-specific runtime deps out of common.txt into requirements/cuda_core.txt and requirements/rocm_core.txt. setup.py reads common.txt plus whichever core file matches BUILD_WITH_HIP so pip install -e . Just Works on both CUDA and ROCm hosts.

  • Drop cupy-cuda12x and nixl from common.txt (both are CUDA-only on PyPI; the nixl meta-package unconditionally pulls nixl-cu12, which installs nixl_ep/ and breaks ROCm runtime).
  • cuda.txt now -r cuda_core.txt so Dockerfile's pip install -r cuda.txt still pulls the same set.
  • Remove the [tool.setuptools.dynamic] dependencies block from pyproject.toml; install_requires is driven by setup.py now.
  • Add a second "Without vLLM docker base image" subsection to the ROCm install docs, mirroring the CUDA from-source flow line-for-line (uv venv -> -r build.txt -> torch from ROCm wheel index -> build). The existing rocm/vllm-dev flow stays as-is.

What this PR does / why we need it:

Special notes for your reviewers:

If applicable:

  • this PR contains user facing changes - docs added
  • this PR contains unit tests

Note

Medium Risk
Changes Python packaging metadata and install dependency resolution, which can break installs across CUDA/ROCm or sdist/wheel paths if the new requirement selection logic is wrong.

Overview
Makes LMCache installs choose the correct GPU-vendor runtime dependencies at build/install time instead of hardcoding CUDA-only packages for everyone.

Vendor-specific deps are split into new requirements/cuda_core.txt and requirements/rocm_core.txt, and setup.py now builds install_requires by reading requirements/common.txt plus the appropriate core file based on BUILD_WITH_HIP. cuda.txt is updated to include cuda_core.txt, and ROCm installation docs add a new bare host flow (install ROCm torch from the wheel index, then build with BUILD_WITH_HIP=1).

Reviewed by Cursor Bugbot for commit f4448ca. Bugbot is set up for automated code reviews on this repo. Configure here.

Moves GPU-vendor-specific runtime deps out of common.txt into
requirements/cuda_core.txt and requirements/rocm_core.txt. setup.py
reads common.txt plus whichever core file matches BUILD_WITH_HIP so
`pip install -e .` Just Works on both CUDA and ROCm hosts.

- Drop cupy-cuda12x and nixl from common.txt (both are CUDA-only on PyPI;
  the nixl meta-package unconditionally pulls nixl-cu12, which installs
  nixl_ep/ and breaks ROCm runtime).
- cuda.txt now -r cuda_core.txt so Dockerfile's `pip install -r cuda.txt`
  still pulls the same set.
- Remove the [tool.setuptools.dynamic] dependencies block from
  pyproject.toml; install_requires is driven by setup.py now.
- Add a second "Without vLLM docker base image" subsection to the
  ROCm install docs, mirroring the CUDA from-source flow line-for-line
  (uv venv -> -r build.txt -> torch from ROCm wheel index -> build).
  The existing rocm/vllm-dev flow stays as-is.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Shaoting Feng <shaotingf@uchicago.edu>
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the dependency management system to support both CUDA and ROCm environments by dynamically setting install_requires in setup.py. It introduces vendor-specific requirement files and updates the ROCm installation documentation. Feedback includes correcting non-existent package versions and URLs for ROCm 7.0, and improving the robustness of the requirements parser to handle inline comments.

# cannot be told about via install_requires; users install them manually
# per the "LMCache on ROCm / Without vLLM docker base image" docs.

cupy-rocm-7-0
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The package cupy-rocm-7-0 does not appear to exist on PyPI. As of the current ROCm releases, the versions are typically in the 6.x range (e.g., cupy-rocm-6-0 or cupy-rocm-6-2). Using a non-existent version will cause the installation to fail. Please verify the correct ROCm version and update this requirement.

uv pip install -r requirements/build.txt

# Install torch from the ROCm wheel index
uv pip install torch torchvision --index-url https://download.pytorch.org/whl/rocm7.0
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The ROCm wheel index URL https://download.pytorch.org/whl/rocm7.0 seems incorrect. ROCm 7.0 has not been released yet; the current latest versions are in the 6.x range (e.g., rocm6.2). This URL will likely return a 404 error, breaking the installation instructions for users.

Comment thread setup.py
Comment on lines +24 to +30
def _read_requirements(path: Path) -> list[str]:
reqs: list[str] = []
for raw in path.read_text().splitlines():
line = raw.strip()
if line and not line.startswith("#"):
reqs.append(line)
return reqs
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The _read_requirements function is currently fragile as it does not handle inline comments (e.g., package >= 1.0 # comment). If such a comment is added to the requirements files, the entire string including the comment will be passed to install_requires, which will cause setuptools or pip to fail. It is safer to strip inline comments before processing.

Suggested change
def _read_requirements(path: Path) -> list[str]:
reqs: list[str] = []
for raw in path.read_text().splitlines():
line = raw.strip()
if line and not line.startswith("#"):
reqs.append(line)
return reqs
def _read_requirements(path: Path) -> list[str]:
reqs: list[str] = []
for raw in path.read_text().splitlines():
line = raw.split("#")[0].strip()
if line:
reqs.append(line)
return reqs

Copy link
Copy Markdown
Contributor

@ApostaC ApostaC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@ApostaC ApostaC enabled auto-merge (squash) April 17, 2026 22:37
@github-actions github-actions Bot added the full Run comprehensive tests on this PR label Apr 17, 2026
@ApostaC ApostaC merged commit cae196c into LMCache:dev Apr 18, 2026
67 of 69 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

full Run comprehensive tests on this PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants