
[MUSA][1/N] sglang.check_env #16959

Merged
Kangyan-Zhou merged 1 commit into sgl-project:main from yeahdongcn:xd/musa_check_env on Jan 23, 2026

Conversation

@yeahdongcn (Collaborator) commented Jan 12, 2026

Motivation

This PR is the first in a series of pull requests (tracked in #16565) to add full support for Moore Threads GPUs, leveraging MUSA (Meta-computing Unified System Architecture) to accelerate LLM inference.

Modifications

  1. Added is_musa to check the basic runtime environment (a rough sketch of such a check is shown below)
  2. Updated check_env.py to fetch the device info, driver version, topology, etc.
  3. Added bidict to both pyproject_other.toml and pyproject.toml for further handling of cuda_wrapper.py and pynccl_wrapper.py
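
As a rough illustration (not the exact code in this PR), an is_musa() check might look like the sketch below. The torch_musa import and the torch.musa namespace are assumptions based on how the torch_musa extension mirrors torch.cuda:

# Hypothetical sketch only; the actual is_musa() helper in sglang may differ.
from functools import lru_cache

import torch


@lru_cache(maxsize=1)
def is_musa() -> bool:
    """Return True if the MUSA (Moore Threads) runtime is importable and a device is visible."""
    try:
        import torch_musa  # noqa: F401  # assumed to register the torch.musa namespace
    except ImportError:
        return False
    return hasattr(torch, "musa") and torch.musa.is_available()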

Testing Done

Tested in a clean torch_musa container.

root@worker3218:/ws# rm -f python/pyproject.toml && mv python/pyproject_other.toml python/pyproject.toml
root@worker3218:/ws# pip install -e "python[all_musa]"
root@worker3218:/ws# python3 -m sglang.check_env
Python: 3.10.12 (main, Nov  4 2025, 08:48:33) [GCC 11.4.0]
MUSA available: True
GPU 0,1,2,3,4,5,6,7: MTT S5000
GPU 0,1,2,3,4,5,6,7 Compute Capability: 3.1
MUSA_HOME: /usr/local/musa
MCC: mcc version 4.3.4
MUSA Driver Version: 3.3.3-server
PyTorch: 2.7.1
sglang: 0.1.dev8833+g2dadf6356.d20260112
sgl_kernel: Module Not Found
flashinfer_python: Module Not Found
flashinfer_cubin: Module Not Found
flashinfer_jit_cache: Module Not Found
triton: 3.1.0
transformers: 4.57.1
torchao: 0.9.0
numpy: 1.26.4
aiohttp: 3.13.3
fastapi: 0.123.5
hf_transfer: 0.1.9
huggingface_hub: 0.36.0
interegular: 0.3.3
modelscope: 1.33.0
orjson: 3.11.5
outlines: 0.1.11
packaging: 25.0
psutil: 7.2.1
pydantic: 2.12.5
python-multipart: 0.0.21
pyzmq: 27.1.0
uvicorn: 0.38.0
uvloop: 0.22.1
vllm: Module Not Found
xgrammar: 0.1.27
openai: 2.6.1
tiktoken: 0.12.0
anthropic: 0.75.0
litellm: Module Not Found
decord2: 3.0.0
MTHREADS Topology: 
         GPU0     GPU1     GPU2     GPU3     GPU4     GPU5     GPU6     GPU7     NIC0     NIC1     NIC2     NIC3     NIC4     NIC5     NIC6     NIC7     NIC8     NIC9     NIC10    CPU Affinity   NUMA Affinity  
GPU0     X        MT2      MT2      MT2      MT2      MT2      MT2      MT2      MPB      MPB      NODE     NODE     SYS      SYS      SYS      SYS      SYS      SYS      NODE     0-31,64-95     0              
GPU1     MT2      X        MT2      MT2      MT2      MT2      MT2      MT2      NODE     NODE     NODE     NODE     SYS      SYS      SYS      SYS      SYS      SYS      NODE     0-31,64-95     0              
GPU2     MT2      MT2      X        MT2      MT2      MT2      MT2      MT2      NODE     NODE     MPB      MPB      SYS      SYS      SYS      SYS      SYS      SYS      NODE     0-31,64-95     0              
GPU3     MT2      MT2      MT2      X        MT2      MT2      MT2      MT2      NODE     NODE     NODE     NODE     SYS      SYS      SYS      SYS      SYS      SYS      NODE     0-31,64-95     0              
GPU4     MT2      MT2      MT2      MT2      X        MT2      MT2      MT2      SYS      SYS      SYS      SYS      NODE     NODE     MPB      MPB      NODE     NODE     SYS      32-63,96-127   1              
GPU5     MT2      MT2      MT2      MT2      MT2      X        MT2      MT2      SYS      SYS      SYS      SYS      NODE     NODE     NODE     NODE     NODE     NODE     SYS      32-63,96-127   1              
GPU6     MT2      MT2      MT2      MT2      MT2      MT2      X        MT2      SYS      SYS      SYS      SYS      NODE     NODE     NODE     NODE     MPB      MPB      SYS      32-63,96-127   1              
GPU7     MT2      MT2      MT2      MT2      MT2      MT2      MT2      X        SYS      SYS      SYS      SYS      NODE     NODE     NODE     NODE     NODE     NODE     SYS      32-63,96-127   1              
NIC0     MPB      NODE     NODE     NODE     SYS      SYS      SYS      SYS      X        SPB      NODE     NODE     SYS      SYS      SYS      SYS      SYS      SYS      NODE     
NIC1     MPB      NODE     NODE     NODE     SYS      SYS      SYS      SYS      SPB      X        NODE     NODE     SYS      SYS      SYS      SYS      SYS      SYS      NODE     
NIC2     NODE     NODE     MPB      NODE     SYS      SYS      SYS      SYS      NODE     NODE     X        SPB      SYS      SYS      SYS      SYS      SYS      SYS      NODE     
NIC3     NODE     NODE     MPB      NODE     SYS      SYS      SYS      SYS      NODE     NODE     SPB      X        SYS      SYS      SYS      SYS      SYS      SYS      NODE     
NIC4     SYS      SYS      SYS      SYS      NODE     NODE     NODE     NODE     SYS      SYS      SYS      SYS      X        SPB      NODE     NODE     NODE     NODE     SYS      
NIC5     SYS      SYS      SYS      SYS      NODE     NODE     NODE     NODE     SYS      SYS      SYS      SYS      SPB      X        NODE     NODE     NODE     NODE     SYS      
NIC6     SYS      SYS      SYS      SYS      MPB      NODE     NODE     NODE     SYS      SYS      SYS      SYS      NODE     NODE     X        SPB      NODE     NODE     SYS      
NIC7     SYS      SYS      SYS      SYS      MPB      NODE     NODE     NODE     SYS      SYS      SYS      SYS      NODE     NODE     SPB      X        NODE     NODE     SYS      
NIC8     SYS      SYS      SYS      SYS      NODE     NODE     MPB      NODE     SYS      SYS      SYS      SYS      NODE     NODE     NODE     NODE     X        SPB      SYS      
NIC9     SYS      SYS      SYS      SYS      NODE     NODE     MPB      NODE     SYS      SYS      SYS      SYS      NODE     NODE     NODE     NODE     SPB      X        SYS      
NIC10    NODE     NODE     NODE     NODE     SYS      SYS      SYS      SYS      NODE     NODE     NODE     NODE     SYS      SYS      SYS      SYS      SYS      SYS      X        

Legend:
    X = Self
  SYS = Topology path that contains PCIe switches/bridges as well as multiple host bridges across NUMA nodes.
 NODE = Topology path that contains PCIe switches/bridges as well as multiple host bridges within a NUMA node.
  HPB = Topology path that contains PCIe switches/bridges as well as a single host bridge.
  MPB = Topology path that contains multiple PCIe switches/bridges (but no host bridge).
  SPB = Topology path that contains at most one PCIe switch/bridge.
  INT = Topology path that is created internally, for example 2 devices on a single S2000 card.
  MTx = Topology path that is a bonded set of x MTLinks.

NIC Legend:
  NIC0: mlx5_0
  NIC1: mlx5_1
  NIC2: mlx5_2
  NIC3: mlx5_3
  NIC4: mlx5_4
  NIC5: mlx5_5
  NIC6: mlx5_6
  NIC7: mlx5_7
  NIC8: mlx5_8
  NIC9: mlx5_9
 NIC10: mlx5_bond_0


ulimit soft: 1048576
root@worker3218:/ws#

Accuracy Tests

Benchmarking and Profiling

Checklist

Review Process

  1. Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
  4. After green CI and required approvals, ask Merge Oncalls to merge.

github-actions bot added the documentation (Improvements or additions to documentation) and dependencies (Pull requests that update a dependency file) labels on Jan 12, 2026
@gemini-code-assist (Contributor) commented:

Summary of Changes

Hello @yeahdongcn, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request lays the groundwork for integrating Moore Threads GPUs into the SGLang ecosystem by establishing robust environment detection and dependency management for MUSA. It ensures that SGLang can correctly identify and configure itself for MUSA-enabled hardware, which is a critical prerequisite for future work on accelerating large language model inference on these new GPU architectures. The changes are focused on system-level compatibility rather than direct performance optimizations at this stage.

Highlights

  • MUSA GPU Support Initialization: This pull request introduces the foundational elements for supporting Moore Threads (MUSA) GPUs within SGLang, marking the first step in a series to enable LLM inference acceleration on this new hardware platform.
  • Enhanced Environment Checking: The sglang.check_env utility has been significantly updated to detect and report detailed MUSA environment information, including GPU device specifics, the MUSA_HOME path, MCC version, MUSA driver version, and a comprehensive GPU topology (a hedged sketch of gathering such information follows this list).
  • New MUSA Detection Utility: A new is_musa() function has been added to sglang.srt.utils.common.py to provide a programmatic way to determine if a MUSA environment is active and available.
  • Dependency Updates for MUSA: The pyproject.toml files have been modified to include bidict and a new srt_musa dependency group, which specifies essential packages like torch_musa and mthreads-ml-py required for MUSA compatibility.
  • MUSA Installation Documentation: New documentation has been created at docs/platforms/mthreads_gpu.md to guide users through the process of installing SGLang from source with MUSA support.
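
As a rough illustration of how such a device report could be assembled (not the PR's actual implementation), the sketch below assumes torch_musa exposes a torch.musa namespace mirroring torch.cuda; the attribute names are assumptions:

# Hedged sketch of collecting MUSA device info for an environment report.
import os

import torch
import torch_musa  # noqa: F401  # assumed to register torch.musa


def collect_musa_info() -> dict:
    """Gather a few MUSA facts similar to the fields printed by sglang.check_env."""
    info = {"MUSA available": torch.musa.is_available()}
    if info["MUSA available"]:
        count = torch.musa.device_count()
        names = {torch.musa.get_device_name(i) for i in range(count)}
        info[f"GPU 0-{count - 1}"] = ", ".join(sorted(names))
    # MUSA_HOME appears in the report above; reading it from the environment is an assumption.
    info["MUSA_HOME"] = os.environ.get("MUSA_HOME", "Not found")
    return info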


@gemini-code-assist bot left a comment

Code Review

This pull request introduces support for Moore Threads GPUs (MUSA) by adding environment checking capabilities and updating dependency configurations. The changes are well-structured and follow the existing pattern for platform support.

My review includes suggestions to improve security by avoiding shell=True in subprocess calls, enhance code clarity by removing unused variables and redundant checks, and fix a potential UnboundLocalError in the environment check script. I've also suggested an improvement to the new documentation to make it clearer for users.
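
For context on the shell=True point, a safer subprocess pattern passes the command as an argument list so nothing is shell-interpreted. The sketch below is illustrative only; the tool name "mcc" comes from the report above, but the --version flag and the exact commands check_env.py runs are assumptions:

# Hedged sketch of invoking an external tool without shell=True.
import subprocess


def run_tool(args: list[str]) -> str:
    """Run an external command from an argument list and return its stdout, or a placeholder."""
    try:
        result = subprocess.run(args, capture_output=True, text=True, check=True, timeout=10)
        return result.stdout.strip()
    except (OSError, subprocess.SubprocessError):
        return "Not found"


# Example (hypothetical flag): query the MUSA compiler version.
print(run_tool(["mcc", "--version"]))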

Review comment threads:
  • python/sglang/check_env.py (4 threads, 1 marked outdated)
  • docs/platforms/mthreads_gpu.md (outdated)
@ispobock (Collaborator) commented:

/tag-and-rerun-ci

@yeahdongcn (Collaborator, Author) commented:

@ispobock Thanks for reviewing this! I noticed there are 7 failing cases. After checking the logs, the failures are mainly due to OOM, timeouts, and connection issues, which don’t appear to be related to this PR.

@yeahdongcn (Collaborator, Author) commented:

Rebased onto upstream/main to resolve conflicts.

Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
@yeahdongcn (Collaborator, Author) commented:

Rebased onto upstream/main.

Kangyan-Zhou merged commit a77729a into sgl-project:main on Jan 23, 2026
105 of 109 checks passed
Johnsonms pushed a commit to Johnsonms/sglang that referenced this pull request on Feb 14, 2026
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>

Labels

dependencies (Pull requests that update a dependency file), documentation (Improvements or additions to documentation), mthreads, run-ci


3 participants