[Feature] Adding pip install Support for sgl-kernel for ROCm by RohitNagraj · Pull Request #14684 · sgl-project/sglang

RohitNagraj · 2025-12-09T01:08:43Z

Motivation

This PR builds and hosts the sgl-kernel wheel for ROCm, which is a prerequisite for building the sglang wheel for ROCm.

Modifications

Added sgl-kernel/CMakeLists_rocm.txt that would be used by CMake for building wheel for ROCm, similar to NVIDIA's CMakeLists.txt. We use CMake over setup_rocm.py for building the wheel since the wheel built with CMake is Python version agnostic, whereas a wheel built with setup_rocm.py will only work with the Python version used to build it.
Added sgl-kernel/build_rocm.sh, similar to NVIDIA's sgl-kernel/build.sh, which builds the ROCm wheel inside a Docker container (used by GitHub Workflows).
Added sgl-kernel/rename_wheels_rocm.sh, similar to NVIDIA's existing sgl-kernel/rename_wheels.sh, to rename wheels to the standard format. This script detects the ROCm version used to build the wheel and renames it to the following template: sgl_kernel-<sglang_version>+rocm<rocm_version>-cp310-abi3-manylinux2014_x86_64.whl.
Added sgl-kernel/rocm_hipify.py, which hipifies the sources using PyTorch's built-in hipify module. The CMake build requires this, since it expects HIP files to compile.
Updated .github/workflows/release-whl-kernel.yml to build and push ROCm 7.0 wheels to SGLang's kernel index.
Added ROCm support to scripts/ci/update_kernel_whl_index.py to update sgl-kernel wheel index.
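
As a rough illustration of the naming template above, the following Python sketch fills in the two placeholders. The function name and the example version strings are hypothetical and are not taken from rename_wheels_rocm.sh (which is a shell script); in particular, how the ROCm version is spelled inside the filename is an assumption here.

```python
def rocm_wheel_name(sglang_version: str, rocm_version: str) -> str:
    """Fill in the wheel filename template quoted in the PR description.

    Both arguments are inserted verbatim; the fixed suffix
    (cp310-abi3-manylinux2014_x86_64) is copied from the template.
    """
    return (
        f"sgl_kernel-{sglang_version}+rocm{rocm_version}"
        "-cp310-abi3-manylinux2014_x86_64.whl"
    )


# Hypothetical values, for illustration only.
print(rocm_wheel_name("0.0.0", "700"))
```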

Dependencies

This PR builds and releases sgl-kernel for ROCm 7.0, ensuring the package is available in the Releases section of github.com/sgl-project/whl.

However, for the wheel to show up fully on the SGLang kernel wheel index, a small companion PR that adds the index file ([AMD] Adding index for rocm700, sgl-project/whl#12) needs to be merged. This should not disrupt any functionality, though, as we only consume the package URL from the Releases section of github.com/sgl-project/whl.

Usage Instructions

This PR adds support for building and releasing sgl-kernel to the SGLang Kernel Index.

Users are not expected to install sgl-kernel themselves, since it would be installed as a dependency by the SGLang wheel built with #14802. Users are only expected to pip install SGLang using the wheel built with #14802 and that should automatically install sgl-kernel as part of it.

However, if the user chooses to, they can install sgl-kernel on ROCm by choosing the specific ROCm version from SGLang kernel index and running the install command:

pip install sgl-kernel --index-url https://docs.sglang.io/whl/rocm700

Note: The sgl-kernel package is built with Torch 2.10.0 Nightly for ROCm 7.0, and the user must install the same torch version if they choose to install sgl-kernel using the pre-built wheel. Refer to #14802 for the specific torch version.
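
One way to sanity-check the note above before installing the pre-built wheel is to compare the installed torch build against the expected Torch and ROCm versions. This is only a sketch: the helper name is hypothetical, the prefix matching is a simplification, and the exact nightly version string to expect is not stated in this PR (see #14802). It relies on `torch.__version__` and `torch.version.hip`, the latter being `None` on non-ROCm builds.

```python
from typing import Optional


def is_matching_rocm_torch(
    torch_version: str,
    hip_version: Optional[str],
    expected_torch: str,
    expected_rocm: str,
) -> bool:
    """Return True if torch looks like the expected ROCm build.

    hip_version is None for CPU-only or CUDA builds of torch.
    Matching is by prefix only, which is a deliberate simplification.
    """
    if hip_version is None:
        return False
    return torch_version.startswith(expected_torch) and hip_version.startswith(
        expected_rocm
    )


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed")
    else:
        ok = is_matching_rocm_torch(
            str(torch.__version__), torch.version.hip, "2.10.0", "7.0"
        )
        print("torch build matches the sgl-kernel wheel:", ok)
```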

Maintenance

New kernel: When a new kernel is added, similar to how we currently update sources in sgl-kernel/setup_rocm.py, the sources will need to be updated in sgl-kernel/rocm_hipify.py and sgl-kernel/CMakeLists_rocm.txt.
New Torch Version: If we choose to update the Torch version used, the following changes are required:
1. Update build_rocm.sh: This file determines the torch version used to build sgl-kernel.
2. Update pyproject_rocm.toml from [Feature] Add pyproject_rocm.toml for end-to-end ROCm pip installation support #14802: The torch version specified in pyproject_rocm.toml is the version installed when a user installs SGLang using the wheel. The versions for torchvision and pytorch-triton-rocm also need to be updated. To determine these, manually install the desired torch version, which pulls in compatible versions of torchvision and pytorch-triton-rocm, and note those versions. Then replace the versions of torch, torchvision, and pytorch-triton-rocm with the new ones.
New ROCm version: If we want to update the Torch version and the new Torch version is built for a new ROCm version, a new wheel must be built by modifying .github/workflows/release-whl-kernel.yml, sgl-kernel/rename_wheels_rocm.sh, and sgl-kernel/build_rocm.sh.
New Architecture: To add support for a new architecture in the future, the AMDGPU_TARGET variable needs to be updated in sgl-kernel/build_rocm.sh. Any compiler flags specific to the architecture can be set in sgl-kernel/CMakeLists_rocm.txt.

Testing

Process

To test sgl-kernel, we build the sgl-kernel wheel using the desired ROCm and Torch versions, then install sglang with the same ROCm and Torch versions. Next, we install sgl-kernel from the built wheel and run the following suite of tests (taken directly from the pr-test-amd.yml GitHub workflow):

python3 -m pytest test_moe_align.py test_moe_topk_softmax.py speculative/test_eagle_utils.py test_apply_token_bitmask_inplace.py test_activation.py test_kvcacheio.py

Environments

The above tests were run on the following matrix of environments:
ROCm Versions: [7.0]
Python Versions: [3.10, 3.11, 3.12]
Hardware: [MI300x, MI350x]
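
For reference, the matrix above expands to six combinations. A minimal sketch of that expansion (the variable and function names are illustrative and not part of the PR's CI configuration):

```python
from itertools import product

# Values copied from the environment matrix listed above.
ROCM_VERSIONS = ["7.0"]
PYTHON_VERSIONS = ["3.10", "3.11", "3.12"]
HARDWARE = ["MI300x", "MI350x"]


def build_matrix():
    """Return every (ROCm, Python, hardware) combination as a dict."""
    return [
        {"rocm": rocm, "python": py, "hardware": hw}
        for rocm, py, hw in product(ROCM_VERSIONS, PYTHON_VERSIONS, HARDWARE)
    ]


for env in build_matrix():
    print(env)
```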

Results

✅ All unit tests pass.

Checklist

Format your code according to the "Format code with pre-commit" guide.
Add code support to build ROCm wheels for sgl-kernel.
Test pip install functionality for ROCm 6.3 and ROCm 7.0 on MI300x and MI350x.
Add wheel release as part of the GitHub Workflow.
Test functionality on Python 3.10, 3.11, 3.12 with install from TestPyPI.
Update documentation according to the "Write documentations" guide.

gemini-code-assist · 2025-12-09T01:08:46Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

HaiShaw · 2025-12-09T04:38:07Z

@RohitNagraj any readme on howto instructions?

HaiShaw · 2025-12-09T04:39:07Z

/tag-and-rerun-ci 12/10

RohitNagraj · 2025-12-09T04:58:13Z

> @RohitNagraj any readme on howto instructions?

This PR builds only the sgl-kernel subpackage. Users are not expected to install sgl-kernel as it is a requirement of sglang itself.

On the other hand, for building sgl-kernel, the current method based on setup.py is easier for users.

yichiche · 2025-12-09T11:00:52Z

> > @RohitNagraj any readme on howto instructions?
>
> This PR builds only the sgl-kernel subpackage. Users are not expected to install sgl-kernel as it is a requirement of sglang itself.
>
> On the other hand, for building sgl-kernel, the current method we have, using setup.py is easier for users.

@RohitNagraj In what cases would we build the sgl-kernel subpackage on its own instead of using setup.py? If we modify any CUDA code or compiled scripts inside sgl-kernel, would running this Makefile be faster than performing a usual setup.py install?

And after upgrading the image in the future, under what circumstances would we need to update this Makefile?

RohitNagraj · 2025-12-10T05:42:21Z

> > > @RohitNagraj any readme on howto instructions?
>
> > This PR builds only the sgl-kernel subpackage. Users are not expected to install sgl-kernel as it is a requirement of sglang itself.
> > On the other hand, for building sgl-kernel, the current method we have, using setup.py is easier for users.
>
> @RohitNagraj In what cases would we build the sgl-kernel subpackage on its own instead of using setup.py? If we modify any CUDA code or compiled scripts inside sgl-kernel, would running this Makefile be faster than performing a usual setup.py install?

This is a great question. Building sgl-kernel subpackage and hosting it here would enable us to install sgl-kernel as a dependency inside pyproject.

setup.py can also be used to build the wheel and host the same. However, this would require building one wheel for every python version (which is a valid approach used by many packages). On the other hand, using a CMake based build creates a python version agnostic wheel (which is also what NVIDIA uses for SGLang).
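
To make the difference concrete, here is a small sketch of how the compatibility tags in a wheel filename determine which CPython versions can install it. It assumes the standard wheel filename layout (name-version[-build]-python-abi-platform) and the usual meaning of the abi3 tag, namely compatibility with the tagged CPython 3 minor version and later ones. The helper names and the example filenames are hypothetical, and compressed tag sets such as cp310.cp311 are not handled.

```python
from typing import Tuple


def parse_wheel_tags(filename: str) -> Tuple[str, str, str]:
    """Return (python_tag, abi_tag, platform_tag) from a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: " + filename)
    parts = filename[: -len(".whl")].split("-")
    # name-version-python-abi-platform, with an optional build tag.
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename layout: " + filename)
    return parts[-3], parts[-2], parts[-1]


def supports_cpython(filename: str, minor: int) -> bool:
    """Check whether CPython 3.<minor> can use the wheel (simplified)."""
    py_tag, abi_tag, _ = parse_wheel_tags(filename)
    if not py_tag.startswith("cp3"):
        return False
    built_minor = int(py_tag[3:])
    if abi_tag == "abi3":
        # Stable ABI: usable on the tagged minor version and newer ones.
        return minor >= built_minor
    # Version-specific ABI: usable only on the exact minor version.
    return minor == built_minor


# Hypothetical filenames, for illustration only.
abi3_wheel = "sgl_kernel-0.0.0+rocm700-cp310-abi3-manylinux2014_x86_64.whl"
per_version_wheel = "sgl_kernel-0.0.0-cp311-cp311-manylinux2014_x86_64.whl"

for minor in (10, 11, 12):
    print(minor, supports_cpython(abi3_wheel, minor),
          supports_cpython(per_version_wheel, minor))
```

With a version-specific ABI tag, one wheel per Python minor version would be needed to cover 3.10 through 3.12, whereas the single abi3 wheel covers all three.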

> And after upgrading the image in the future, under what circumstances would we need to update this Makefile?

The sources in the CMakeLists_rocm.txt and in rocm_hipify.py will have to be updated the same way we currently do it in setup_rocm.py.
CMakeLists_rocm.txt will need to be updated if we add a new architecture support or add new compilation flags.

akao-amd · 2025-12-12T05:30:00Z

Hi @RohitNagraj

(Together with [Feature] Add pyproject_rocm.toml for end-to-end ROCm pip installation support #14802) what is the difference between this and manually running pip uninstall sglang sgl-kernel && cd sgl-kernel && python setup_rocm.py install && cd .. && pip install -e "python[all_hip]" --no-deps myself? That is how most internal developers work, so I wonder about the impact and benefit of these changes.
Do you expect to host the wheel packages on https://download.pytorch.org/whl/nightly/rocm7.0?

hubertlu-tw · 2025-12-12T06:25:16Z

You probably also need to modify the following scripts used in https://github.com/sgl-project/sglang/blob/main/.github/workflows/pr-test-amd.yml

scripts/ci/amd_ci_start_container.sh
scripts/ci/amd_ci_install_dependency.sh
Otherwise, our upstream CI will not be able to test your changes in the PR.

CC: @saienduri

RohitNagraj · 2025-12-12T20:06:53Z

> Hi @RohitNagraj
>
> (Together with [Feature] Add pyproject_rocm.toml for end-to-end ROCm pip installation support #14802) what is the difference between this and manually running pip uninstall sglang sgl-kernel && cd sgl-kernel && python setup_rocm.py install && cd .. && pip install -e "python[all_hip]" --no-deps myself? That is how most internal developers work, so I wonder about the impact and benefit of these changes.
>
> Do you expect to host the wheel packages on https://download.pytorch.org/whl/nightly/rocm7.0?

The setup you mentioned builds sgl-kernel from source and then installs it, and for sglang it does not install Torch. With this PR and [Feature] Add pyproject_rocm.toml for end-to-end ROCm pip installation support #14802, on the other hand, we'd have a pre-built wheel that pulls in all the required dependencies.
https://download.pytorch.org/whl/nightly/rocm7.0 is PyTorch's index, and we don't have permission to host there. sgl-kernel will be hosted on the SGLang Kernel Index. We have yet to decide where to host SGLang itself.

akao-amd · 2025-12-18T22:22:56Z

I attached aggregated test summaries from MI300 and MI355 runs. PIP Install Test Results.xlsx

Result: this change does not introduce new errors/regressions. Existing sgl-kernel tests appear unrelated to this PR, but CI errors should be fixed first.

Follow-ups (separate work items):

Validate and automate publishing of sgl-kernel ROCm wheels.
Improve dependency management in rocm.Dockerfile.
Continue driving [Feature] Add pyproject_rocm.toml for end-to-end ROCm pip installation support #14802 to enable pip install sglang for ROCm platforms.

Remove python/pyproject_rocm.toml and adjust docs/platforms/amd_gpu.md. These files were accidentally included from draft sgl-project#14802 and cause unnecessary cross-platform CI runs.

akao-amd · 2025-12-22T09:19:27Z

@RohitNagraj Would you help to drop this PR? #15627 is meant to replace this one.

akao-amd · 2026-01-09T01:39:33Z

As #15627 has been merged, I suggest closing this PR.

RohitNagraj requested review from BBuf, FlamingoPg, Fridge003, HaiShaw, Kangyan-Zhou, ispobock, merrymercy, yizhang2077 and zhyncs as code owners December 9, 2025 01:08

github-actions Bot added documentation Improvements or additions to documentation amd sgl-kernel labels Dec 9, 2025

github-actions Bot added the run-ci label Dec 9, 2025

sogalin reviewed Dec 9, 2025


Comment thread sgl-kernel/build_rocm.sh Outdated

This was referenced Dec 10, 2025

[Feature] Add pyproject_rocm.toml for end-to-end ROCm pip installation support #14802

Draft

[AMD] Adding index for rocm700 sgl-project/whl#12

Closed

RohitNagraj force-pushed the rocm-pip-install-dev branch from 14e0c8c to 4e676e6 Compare December 12, 2025 04:57

akao-amd force-pushed the rocm-pip-install-dev branch 2 times, most recently from 8106ed7 to c50f1c2 Compare December 18, 2025 22:19

github-actions Bot added the dependencies Pull requests that update a dependency file label Dec 18, 2025

RohitNagraj and others added 18 commits December 19, 2025 15:13

Added ROCm wheel build files for sgl-kernel

b349c28

Updated torch version used for rocm sgl-kernel wheel

69431d9

Removed redundant code

13a66ce

Added AMDGPU_TARGET env variable

e0b687b

Fixed indentation

20f4c0b

Renamed rocm630 to rocm640

a84f587

Updated wheel name from rocm 6.3 to 6.4

6f04705

Fixed merge conflict

2a2239c

Updated sources to match latest

f0e9d98

Updated env variable name to match the new change

aa26e9d

Added logic to create multiple directories for sglang/whl index

8ae26bc

Removed rocm640 support

79944d8

Added pyproject_rocm.toml

5a4ee3d

Updated docs

423bae9

Updated docs to remove PREBUILD_KERNELS=1 as default

4a3971d

Removed rocm640 support

9858cba

Align to torch/torchvision in current docker images

ed2e83c

Revert unintended ROCm packaging changes

3607d66

Remove python/pyproject_rocm.toml and adjust docs/platforms/amd_gpu.md. These files were accidentally included from draft sgl-project#14802 and cause unnecessary cross-platform CI runs.

akao-amd force-pushed the rocm-pip-install-dev branch from 8fc198a to 3607d66 Compare December 19, 2025 07:14

Merge branch 'main' into rocm-pip-install-dev

db17719

akao-amd mentioned this pull request Dec 22, 2025

[AMD] Add pip install / wheel build support for ROCm sgl-kernel #15627

Merged

6 tasks

HaiShaw closed this Jan 9, 2026

RohitNagraj wants to merge 19 commits into sgl-project:main from RohitNagraj:rocm-pip-install-dev


6 participants
