Split toml files for sglang package #10162
ZailiWang wants to merge 2 commits into sgl-project:main
Conversation
Summary of Changes
Hello @ZailiWang, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request refactors the dependency management for the sglang package by introducing separate pyproject.toml files for different hardware platforms (CPU, HIP, XPU, HPU, NPU). This change aims to simplify the build process for per-device wheels and improve clarity by isolating platform-specific dependencies. It also involves updating relevant build scripts and documentation to align with the new structure.
Highlights
- Modularized Dependency Management: The sglang package's pyproject.toml files have been split into device-specific configurations (e.g., pyproject_cpu.toml, pyproject_hip.toml) to streamline per-device wheel building.
- Simplified Default Configuration: The main pyproject.toml now focuses on CUDA-only builds, with other device-specific dependencies moved to their respective .toml files.
- Flattened Dependencies: Hierarchical [project.optional-dependencies] have been removed, with each .toml file now containing a single, flat dependencies list (excluding test-related packages).
- Documentation and Dockerfile Updates: Installation commands in documentation and Dockerfiles have been updated to reflect the new .toml structure, specifically for CPU, with plans for other devices.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.
You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.
Code Review
This pull request effectively refactors the project's packaging configuration by splitting the pyproject.toml file into device-specific versions. This is a great step towards simplifying the build process for different hardware backends like CUDA, CPU, HIP, etc. The corresponding updates to the Dockerfile and documentation are consistent with these changes. My review includes a critical fix for the CPU installation documentation and several suggestions to improve the maintainability of the new pyproject files by sorting dependencies.
# Install SGLang dependent libs, and build SGLang main package
pip install --upgrade pip setuptools
conda install -y libsqlite==3.48.0 gperftools tbb libnuma numactl
The installation instructions for bare metal are missing the step to install torch. Since torch is not a dependency in pyproject_cpu.toml, pip install . will not install it, and the setup will be incomplete. It's best to install torch after the conda packages and before installing sglang.
conda install -y libsqlite==3.48.0 gperftools tbb libnuma numactl
pip install torch torchvision torchaudio
  cp pyproject_cpu.toml pyproject.toml && \
  pip install . && \
  pip install torch==${VER_TORCH} torchvision==${VER_TORCHVISION} triton==${VER_TRITON} --force-reinstall && \
- cd sgl-kernel && \
+ cd ../sgl-kernel && \
  cp pyproject_cpu.toml pyproject.toml && \
- pip install -v .
+ pip install .
The installation process can be slightly optimized by combining the pip install commands. This reduces the number of layers in the Docker image and can make builds slightly faster.
cp pyproject_cpu.toml pyproject.toml && \
pip install . torch==${VER_TORCH} torchvision==${VER_TORCHVISION} triton==${VER_TRITON} --force-reinstall && \
cd ../sgl-kernel && \
cp pyproject_cpu.toml pyproject.toml && \
pip install .
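The copy-then-install pattern above generalizes to all of the new per-device files. A minimal sketch of the selection step (a hypothetical helper, not part of the PR; the device names come from the PR's file list):

```python
# Hypothetical helper: map a build target to its pyproject file.
# CUDA keeps the default pyproject.toml; the other devices use the
# split files introduced in this PR.
NON_CUDA_DEVICES = {"cpu", "hip", "xpu", "hpu", "npu"}


def pyproject_for(device: str) -> str:
    if device == "cuda":
        return "pyproject.toml"
    if device in NON_CUDA_DEVICES:
        return f"pyproject_{device}.toml"
    raise ValueError(f"unknown build device: {device}")
```

A build script would then cp the returned file over pyproject.toml before running pip install ., as the Dockerfile does for CPU.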
"aiohttp",
"anthropic>=0.20.0",
"blobfile==3.0.0",
"build",
"compressed-tensors",
"datasets",
"decord",
"einops",
"fastapi",
"hf_transfer",
"huggingface_hub",
"interegular",
"IPython",
"llguidance>=0.7.11,<0.8.0",
"modelscope",
"msgspec",
"ninja",
"numpy",
"openai==1.99.1",
"openai-harmony==0.0.4",
"orjson",
"outlines==0.1.11",
"packaging",
"partial_json_parser",
"petit_kernel==0.0.2",
"pillow",
"prometheus-client>=0.20.0",
"psutil",
"pybase64",
"pydantic",
"pynvml",
"python-multipart",
"pyzmq>=25.1.2",
"requests",
"scipy",
"sentencepiece",
"setproctitle",
"soundfile==0.13.1",
"timm==1.0.16",
"tiktoken",
"torch",
"torchao==0.9.0",
"tqdm",
"transformers==4.56.0",
"uvicorn",
"uvloop",
"wave-lang==3.7.0",
"xgrammar==0.1.23",
For better maintainability and to avoid potential issues like duplicate entries, it's a good practice to keep the list of dependencies sorted alphabetically.
"aiohttp",
"anthropic>=0.20.0",
"blobfile==3.0.0",
"build",
"compressed-tensors",
"datasets",
"decord",
"einops",
"fastapi",
"hf_transfer",
"huggingface_hub",
"interegular",
"IPython",
"llguidance>=0.7.11,<0.8.0",
"modelscope",
"msgspec",
"ninja",
"numpy",
"openai==1.99.1",
"openai-harmony==0.0.4",
"orjson",
"outlines==0.1.11",
"packaging",
"partial_json_parser",
"petit_kernel==0.0.2",
"pillow",
"prometheus-client>=0.20.0",
"psutil",
"pybase64",
"pydantic",
"pynvml",
"python-multipart",
"pyzmq>=25.1.2",
"requests",
"scipy",
"sentencepiece",
"setproctitle",
"soundfile==0.13.1",
"tiktoken",
"timm==1.0.16",
"torch",
"torchao==0.9.0",
"tqdm",
"transformers==4.56.0",
"uvicorn",
"uvloop",
"wave-lang==3.7.0",
"xgrammar==0.1.23"
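The sorted-order rule the review asks for is easy to enforce automatically. A hedged sketch of such a check (not part of the PR; it compares case-insensitively and ignores version specifiers):

```python
import re


def deps_sorted(deps: list[str]) -> bool:
    """True when the bare package names are in case-insensitive
    alphabetical order with no duplicates."""
    # Strip specifiers/extras: "llguidance>=0.7.11,<0.8.0" -> "llguidance"
    names = [re.split(r"[<>=!\[ ]", d, maxsplit=1)[0].lower() for d in deps]
    return names == sorted(names) and len(names) == len(set(names))
```

For instance, the list in the diff fails this check because "timm==1.0.16" precedes "tiktoken".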
"aiohttp",
"anthropic>=0.20.0",
"blobfile==3.0.0",
"build",
"compressed-tensors",
"datasets",
"decord",
"einops",
"fastapi",
"hf_transfer",
"huggingface_hub",
"interegular",
"IPython",
"llguidance>=0.7.11,<0.8.0",
"modelscope",
"msgspec",
"ninja",
"numpy",
"openai==1.99.1",
"openai-harmony==0.0.4",
"orjson",
"outlines==0.1.11",
"packaging",
"partial_json_parser",
"pillow",
"prometheus-client>=0.20.0",
"psutil",
"pybase64",
"pydantic",
"pynvml",
"python-multipart",
"pyzmq>=25.1.2",
"requests",
"scipy",
"sentencepiece",
"setproctitle",
"soundfile==0.13.1",
"timm==1.0.16",
"tiktoken",
"torchao==0.9.0",
"tqdm",
"transformers==4.56.0",
"uvicorn",
"uvloop",
"xgrammar==0.1.23",
For better maintainability and to avoid potential issues like duplicate entries, it's a good practice to keep the list of dependencies sorted alphabetically.
"aiohttp",
"anthropic>=0.20.0",
"blobfile==3.0.0",
"build",
"compressed-tensors",
"datasets",
"decord",
"einops",
"fastapi",
"hf_transfer",
"huggingface_hub",
"interegular",
"IPython",
"llguidance>=0.7.11,<0.8.0",
"modelscope",
"msgspec",
"ninja",
"numpy",
"openai==1.99.1",
"openai-harmony==0.0.4",
"orjson",
"outlines==0.1.11",
"packaging",
"partial_json_parser",
"pillow",
"prometheus-client>=0.20.0",
"psutil",
"pybase64",
"pydantic",
"pynvml",
"python-multipart",
"pyzmq>=25.1.2",
"requests",
"scipy",
"sentencepiece",
"setproctitle",
"soundfile==0.13.1",
"tiktoken",
"timm==1.0.16",
"torchao==0.9.0",
"tqdm",
"transformers==4.56.0",
"uvicorn",
"uvloop",
"xgrammar==0.1.23"
"aiohttp",
"anthropic>=0.20.0",
"blobfile==3.0.0",
"build",
"compressed-tensors",
"datasets",
"decord",
"einops",
"fastapi",
"hf_transfer",
"huggingface_hub",
"interegular",
"IPython",
"llguidance>=0.7.11,<0.8.0",
"modelscope",
"msgspec",
"ninja",
"numpy",
"openai==1.99.1",
"openai-harmony==0.0.4",
"orjson",
"outlines==0.1.11",
"packaging",
"partial_json_parser",
"pillow",
"prometheus-client>=0.20.0",
"psutil",
"pybase64",
"pydantic",
"pynvml",
"python-multipart",
"pyzmq>=25.1.2",
"requests",
"scipy",
"sentencepiece",
"setproctitle",
"soundfile==0.13.1",
"timm==1.0.16",
"tiktoken",
"torchao==0.9.0",
"tqdm",
"transformers==4.56.0",
"uvicorn",
"uvloop",
"xgrammar==0.1.23",
]
For better maintainability and to avoid potential issues like duplicate entries, it's a good practice to keep the list of dependencies sorted alphabetically.
"aiohttp",
"anthropic>=0.20.0",
"blobfile==3.0.0",
"build",
"compressed-tensors",
"datasets",
"decord",
"einops",
"fastapi",
"hf_transfer",
"huggingface_hub",
"interegular",
"IPython",
"llguidance>=0.7.11,<0.8.0",
"modelscope",
"msgspec",
"ninja",
"numpy",
"openai==1.99.1",
"openai-harmony==0.0.4",
"orjson",
"outlines==0.1.11",
"packaging",
"partial_json_parser",
"pillow",
"prometheus-client>=0.20.0",
"psutil",
"pybase64",
"pydantic",
"pynvml",
"python-multipart",
"pyzmq>=25.1.2",
"requests",
"scipy",
"sentencepiece",
"setproctitle",
"soundfile==0.13.1",
"tiktoken",
"timm==1.0.16",
"torchao==0.9.0",
"tqdm",
"transformers==4.56.0",
"uvicorn",
"uvloop",
"xgrammar==0.1.23"
It seems to be too complex, with too much duplicated code shared between different files. Do you have other methods to do this?
Hi @hnyls2002, the toml split was requested by @zhyncs. It indeed introduces duplications, but it seems this is inevitable if we need to split the package configs into device-specific ones. I searched official … Let's keep the content of the PR as-is until we get a conclusion on whether this is what we need, or we can find any better approaches.
Will take care of CPU/XPU backends only.
Motivation
Split the .toml files for the sglang package, to better facilitate the planned per-device sgl-whl building.

Modifications
- Kept the main pyproject.toml file for building whls only supporting CUDA; created pyproject_[cpu|hip|xpu|hpu|npu].toml files for other devices.
- Removed the hierarchical [project.optional-dependencies]. Only one dependencies list remains in each .toml file (except test-related packages).

Accuracy Tests
N/A
Benchmarking and Profiling
N/A
Checklist