[Feature] Adding pip install Support for sgl-kernel for ROCm #14684
RohitNagraj wants to merge 19 commits into sgl-project:main
Conversation
|
@RohitNagraj is there a README with how-to instructions? |
|
/tag-and-rerun-ci 12/10 |
This PR builds only the sgl-kernel subpackage. Users are not expected to install sgl-kernel themselves, as it is a requirement of sglang itself. For building sgl-kernel, on the other hand, the current method, using setup.py, is easier for users. |
@RohitNagraj In what cases would we build the sgl-kernel subpackage on its own instead of using setup.py? If we modify any CUDA code or compiled scripts inside sgl-kernel, would running this Makefile be faster than performing a usual setup.py install? And after upgrading the image in the future, under what circumstances would we need to update this Makefile? |
This is a great question. Building
|
14e0c8c to 4e676e6
|
Hi @RohitNagraj
|
|
You probably also need to modify the following scripts used in https://github.com/sgl-project/sglang/blob/main/.github/workflows/pr-test-amd.yml
CC: @saienduri |
|
8106ed7 to c50f1c2
|
I attached aggregated test summaries from MI300 and MI355 runs: PIP Install Test Results.xlsx
Result: this change does not introduce new errors/regressions. Existing sgl-kernel test failures appear unrelated to this PR, but CI errors should be fixed first.
Follow-ups (separate work items):
|
Remove python/pyproject_rocm.toml and adjust docs/platforms/amd_gpu.md. These files were accidentally included from draft sgl-project#14802 and cause unnecessary cross-platform CI runs.
8fc198a to 3607d66
|
@RohitNagraj Would you help to drop this PR? #15627 is meant to replace this one. |
|
As #15627 has been merged, I suggest closing this PR. |
Motivation
This PR aims to build and host the sgl-kernel wheel for ROCm, which is a prerequisite for building the sglang wheel for ROCm.
Modifications
sgl-kernel/CMakeLists_rocm.txt: used by CMake to build the wheel for ROCm, similar to NVIDIA's CMakeLists.txt. We use CMake over setup_rocm.py for building the wheel since a wheel built with CMake is Python-version agnostic, whereas a wheel built with setup_rocm.py only works with the Python version used to build it.
sgl-kernel/build_rocm.sh: similar to NVIDIA's sgl-kernel/build.sh; builds the ROCm wheel inside a Docker container (used by GitHub workflows).
sgl-kernel/rename_wheels_rocm.sh: similar to NVIDIA's existing sgl-kernel/rename_wheels.sh; renames wheels to the standard format. This script detects the ROCm version used to build the wheel and renames it to the following template: sgl_kernel-<sglang_version>+rocm<rocm_version>-cp310-abi3-manylinux2014_x86_64.whl.
sgl-kernel/rocm_hipify.py: hipifies the sources using PyTorch's built-in hipify module. This is required for the CMake build, as CMake expects HIP files to compile.
.github/workflows/release-whl-kernel.yml: builds and pushes ROCm 7.0 wheels to SGLang's kernel index.
scripts/ci/update_kernel_whl_index.py: updates the sgl-kernel wheel index.
Dependencies
This PR builds and releases sgl-kernel for ROCm 7.0, ensuring the package is available in the Releases section of github.com/sgl-project/whl.
However, for it to show up on the SGLang kernel wheel index completely, this small PR that adds the index file needs to be merged. This should not disrupt any functionality, though, as we only consume the package URL from the Releases section of github.com/sgl-project/whl.
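As an illustration of the wheel naming template listed under Modifications, here is a minimal Python sketch that builds and parses names of that shape. It is not the logic of rename_wheels_rocm.sh (a shell script whose contents are not shown here); the function names, the regular expression, and the placeholder version 0.0.0 are assumptions made for this example, and how the real script formats the ROCm version is not stated in the PR.

```python
import re

# Template taken from the PR description:
#   sgl_kernel-<sglang_version>+rocm<rocm_version>-cp310-abi3-manylinux2014_x86_64.whl
# The regex below is an illustrative assumption, not the project's own parser.
WHEEL_RE = re.compile(
    r"^sgl_kernel-(?P<sglang_version>[^+]+)"
    r"\+rocm(?P<rocm_version>[0-9.]+)"
    r"-cp310-abi3-manylinux2014_x86_64\.whl$"
)


def rocm_wheel_name(sglang_version: str, rocm_version: str) -> str:
    """Fill the template with the given versions (hypothetical helper)."""
    return (
        f"sgl_kernel-{sglang_version}+rocm{rocm_version}"
        "-cp310-abi3-manylinux2014_x86_64.whl"
    )


def parse_rocm_wheel_name(filename: str) -> dict:
    """Split a templated wheel name back into its version fields (hypothetical helper)."""
    match = WHEEL_RE.match(filename)
    if match is None:
        raise ValueError(f"not a ROCm sgl-kernel wheel name: {filename!r}")
    return match.groupdict()


if __name__ == "__main__":
    # "0.0.0" is a placeholder version, not a real release.
    name = rocm_wheel_name("0.0.0", "7.0")
    print(name)
    print(parse_rocm_wheel_name(name))
```

The `+rocm<rocm_version>` part is a local version label, so wheels for different ROCm versions can share the same base version while remaining distinguishable by filename.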
Usage Instructions
This PR adds support for building and releasing sgl-kernel to the SGLang Kernel Index.
Users are not expected to install sgl-kernel themselves, since it would be installed as a dependency by the SGLang wheel built with #14802. Users are only expected to pip install SGLang using the wheel built with #14802, which should automatically install sgl-kernel.
However, users who choose to can install sgl-kernel on ROCm by choosing the specific ROCm version from the SGLang kernel index and running the install command.
Note: The sgl-kernel package is built with Torch 2.10.0 Nightly for ROCm 7.0, and the user must install the same torch version if they choose to install sgl-kernel using the pre-built wheel. Refer to the specific torch version in #14802.
Maintenance
sgl-kernel/build_rocm.py and sgl-kernel/CMakeLists_rocm.txt.
build_rocm.sh: This file determines the torch version used to build sgl-kernel.
pyproject_rocm.toml from #14802 ([Feature] Add pyproject_rocm.toml for end-to-end ROCm pip installation support): The torch version specified in pyproject_rocm.toml is the version installed when a user installs SGLang using the wheel. The versions for torchvision and pytorch-triton-rocm also need to be updated. To determine these, manually install the desired torch version, which installs compatible versions of torchvision and pytorch-triton-rocm, and make a note of those versions. Then replace the versions of torch, torchvision, and pytorch-triton-rocm with the new versions.
.github/workflows/release-whl-kernel.yml, sgl-kernel/rename_wheels_rocm.sh, and sgl-kernel/build_rocm.sh.
The AMDGPU_TARGET variable needs to be updated in sgl-kernel/build_rocm.sh, and any compiler flags specific to the architecture can be set in sgl-kernel/CMakeLists_rocm.txt.
Testing
Process
For testing sgl-kernel, we build the sgl-kernel wheel using the desired ROCm and Torch versions. Then, we install sglang along with the same ROCm and Torch versions. Finally, we install sgl-kernel using the wheel we built and run the suite of tests taken directly from the pr-test-amd.yml GitHub workflow.
Environments
The above tests were run on the following matrix of environments:
ROCm Versions: [7.0]
Python Versions: [3.10, 3.11, 3.12]
Hardware: [MI300x, MI350x]
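To make the size of this matrix concrete, the following Python sketch enumerates every combination of the values listed above. The variable and function names are hypothetical; this is not part of the PR's CI configuration, only a way to see that one ROCm version, three Python versions, and two hardware targets give six environments.

```python
from itertools import product

# Values copied from the environment matrix in the PR description.
ROCM_VERSIONS = ["7.0"]
PYTHON_VERSIONS = ["3.10", "3.11", "3.12"]
HARDWARE = ["MI300x", "MI350x"]


def test_matrix() -> list:
    """Return one dict per (ROCm, Python, hardware) combination (hypothetical helper)."""
    return [
        {"rocm": rocm, "python": python, "hardware": hardware}
        for rocm, python, hardware in product(
            ROCM_VERSIONS, PYTHON_VERSIONS, HARDWARE
        )
    ]


if __name__ == "__main__":
    for env in test_matrix():
        print(f"ROCm {env['rocm']} / Python {env['python']} / {env['hardware']}")
```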
Results
✅ All unit tests that were run passed.
Checklist
sgl-kernel