[NPU] Fix issue and support GLM-4.5V by zhsurpass · Pull Request #22961 · sgl-project/sglang

zhsurpass · 2026-04-16T09:36:33Z

Motivation

Fix the ZhipuAI/GLM-4.5V service startup failure on NPU and support the model.
Issue link: Ascend/sglang#343

Modifications

When calling the split_qkv_rmsnorm_rope function, pass the correct arguments based on the use_qk_norm parameter. The split_qkv_rmsnorm_rope kernel already supports NORMS=False mode internally.

Accuracy Tests

Accuracy on MMMU dataset:

- Accuracy: 0.2802
- Invalid: 0.000
- Latency: 89.380 s
- Output throughput: 33.565 token/s

Speed Tests and Profiling

Checklist

Format your code according to the Format code with pre-commit.
Add unit tests according to the Run and add unit tests.
Update documentation according to Write documentations.
Provide accuracy and speed benchmark results according to Test the accuracy and Benchmark the speed.
Follow the SGLang code style guidance.

Review and Merge Process

Ping Merge Oncalls to start the process. See the PR Merge Process.
Get approvals from CODEOWNERS and other reviewers.
Trigger CI tests with comments or contact authorized users to do so.
- Common commands include /tag-and-rerun-ci, /tag-run-ci-label, /rerun-failed-ci
After green CI and required approvals, ask Merge Oncalls or people with Write permission to merge the PR.

gemini-code-assist

Code Review

This pull request updates the forward_prepare method in glm4_moe.py to include self.use_qk_norm in the conditional logic for QKV splitting and removes a redundant blank line. A review comment suggests refactoring the complex conditional logic into a descriptive boolean variable to improve code readability and maintainability.

gemini-code-assist · 2026-04-16T09:43:28Z

        if (
            not _is_npu
            or forward_batch.forward_mode.is_extend_or_draft_extend_or_mixed()
+            or not self.use_qk_norm
        ):


The conditional logic is becoming complex. For better readability and maintainability, consider refactoring this to explicitly check for the conditions required for the NPU-specific path. This makes the intent clearer.

Suggested change

-        if (
-            not _is_npu
-            or forward_batch.forward_mode.is_extend_or_draft_extend_or_mixed()
-            or not self.use_qk_norm
-        ):
+        use_npu_decode_path = (
+            _is_npu
+            and not forward_batch.forward_mode.is_extend_or_draft_extend_or_mixed()
+            and self.use_qk_norm
+        )
+        if not use_npu_decode_path:
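
The suggested form is the De Morgan negation of the original condition, so the two should select the same branch for every input. A minimal sketch that checks this exhaustively is shown here. It uses plain booleans as stand-ins and does not call any SGLang code: `is_npu` and `is_extend_like` are hypothetical names for `_is_npu` and the result of `forward_batch.forward_mode.is_extend_or_draft_extend_or_mixed()`, and the function names are illustrative only.

```python
from itertools import product


def skips_npu_path_original(is_npu: bool, is_extend_like: bool, use_qk_norm: bool) -> bool:
    """Condition as written in the PR diff; True means the NPU decode path is skipped."""
    return (not is_npu) or is_extend_like or (not use_qk_norm)


def skips_npu_path_refactored(is_npu: bool, is_extend_like: bool, use_qk_norm: bool) -> bool:
    """Condition as written in the suggested change."""
    use_npu_decode_path = is_npu and (not is_extend_like) and use_qk_norm
    return not use_npu_decode_path


def conditions_agree() -> bool:
    """Compare both forms on all 2**3 combinations of the three flags."""
    return all(
        skips_npu_path_original(*flags) == skips_npu_path_refactored(*flags)
        for flags in product([False, True], repeat=3)
    )


def npu_decode_cases() -> list[tuple[bool, bool, bool]]:
    """List the flag combinations (is_npu, is_extend_like, use_qk_norm) that take the NPU decode path."""
    return [
        flags
        for flags in product([False, True], repeat=3)
        if not skips_npu_path_refactored(*flags)
    ]


if __name__ == "__main__":
    print("forms agree:", conditions_agree())
    print("NPU decode path taken for:", npu_decode_cases())
```

Under these assumptions, the NPU decode path is taken only when running on NPU, outside the extend-like forward modes, and with `use_qk_norm` enabled. Every other combination, including `use_qk_norm=False`, goes through the other branch.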

sglang-npu-bot · 2026-04-16T12:44:51Z

/tag-and-rerun-ci

zhsurpass added 2 commits April 16, 2026 10:15

support GLM-4.5V

6c59aa0

support GLM-4.5V

8a44d94

gemini-code-assist Bot reviewed Apr 16, 2026

zhsurpass and others added 2 commits April 16, 2026 18:10

support GLM-4.5V

f9242fc

Merge branch 'main' into glm-4.5v

9deb545

Hexq0210 approved these changes Apr 16, 2026

sglang-npu-bot approved these changes Apr 16, 2026

zhsurpass and others added 3 commits April 16, 2026 20:18

format code

5fcedb7

Merge branch 'glm-4.5v' of https://github.com/zhsurpass/sglang into glm-4.5v

9ee8180

Merge branch 'main' into glm-4.5v

b1b94dc

sglang-npu-bot approved these changes Apr 16, 2026

github-actions Bot added the run-ci label Apr 16, 2026

sglang-npu-bot and others added 5 commits April 18, 2026 14:24

Merge branch 'main' into glm-4.5v

5d76928

Merge branch 'main' into glm-4.5v

375e6f7

Merge branch 'main' into glm-4.5v

3b21954

Merge branch 'main' into glm-4.5v

f36113d

Merge branch 'main' into glm-4.5v

a0a5220

iforgetmyname self-assigned this Apr 23, 2026

iforgetmyname added 2 commits April 25, 2026 22:17

Merge branch 'main' into glm-4.5v

dcc40a4

Merge branch 'main' into glm-4.5v

0bc301d

iforgetmyname merged commit 9ffc0cc into sgl-project:main Apr 28, 2026
390 of 439 checks passed

zhsurpass mentioned this pull request Apr 29, 2026

[Bug] [Func] [SIT] The ZhipuAI/GLM-4.5V model service failed to start Ascend/sglang#343

Open

5 tasks

zhsurpass changed the title ~~[NPU] Support GLM-4.5V~~ [NPU] Fix issue and support GLM-4.5V Apr 29, 2026

vguduruTT pushed a commit to vguduruTT/sglang that referenced this pull request May 2, 2026

[NPU] Support GLM-4.5V (sgl-project#22961)

eb7d049

iforgetmyname merged 14 commits into sgl-project:main from zhsurpass:glm-4.5v

4 participants
