[NPU]bugfix: fix for dsv3.2 and dsvl2 #17007

Merged
iforgetmyname merged 7 commits into sgl-project:main from JiaruiChang5268:eagle-dp-attn-ds on Jan 23, 2026

@JiaruiChang5268 (Contributor) commented Jan 13, 2026

Motivation

1. There are some bugs in rotary_embedding for DS-VL2.
2. DSV3.2 is not compatible with the scenario where m.alt_stream is None.

Modifications

1. Fix the DS-VL2 rotary_embedding bug and add a branch to the RotaryEmbedding class.
2. Make DSV3.2 compatible with the scenario where m.alt_stream is None.
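
The second modification can be pictured as a guard around the side-stream path. The following is a minimal sketch of that idea, not the code from this PR: `RecordingStream` and `run_branches` are hypothetical stand-ins that only log what a real device stream would be asked to do, so the example runs without `torch` or an NPU.

```python
from contextlib import contextmanager
from typing import Callable, List, Optional, Tuple


class RecordingStream:
    """Hypothetical stand-in for a device stream; it only logs requests."""

    def __init__(self, name: str, log: List[str]) -> None:
        self.name = name
        self.log = log

    def wait_stream(self, other: "RecordingStream") -> None:
        # On a real device, later work on `self` would wait for `other`.
        self.log.append(f"{self.name} waits for {other.name}")

    @contextmanager
    def activate(self):
        # Stand-in for a `with <device>.stream(s):` context.
        self.log.append(f"enter {self.name}")
        try:
            yield self
        finally:
            self.log.append(f"exit {self.name}")


def run_branches(
    main: RecordingStream,
    alt: Optional[RecordingStream],
    q_branch: Callable[[], str],
    k_branch: Callable[[], str],
) -> Tuple[str, str]:
    """Overlap q_branch on `alt` when it exists; otherwise run both in order."""
    if alt is not None:
        alt.wait_stream(main)   # fork: alt must see inputs produced on main
        with alt.activate():
            q = q_branch()
        k = k_branch()
        main.wait_stream(alt)   # join: main must not consume q too early
    else:
        # No side stream configured: plain sequential execution, no sync calls.
        q = q_branch()
        k = k_branch()
    return q, k
```

Calling `run_branches(main, None, ...)` leaves the log empty, which is the point of the guard: no stream API is touched when there is no alternate stream.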

Accuracy Tests

Both models (dsvl2 and dsv3.2) are covered by CI.

Benchmarking and Profiling

Checklist

Review Process

  1. Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
  4. After green CI and required approvals, ask Merge Oncalls to merge.

@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@github-actions github-actions Bot added the quant LLM Quantization label Jan 16, 2026
@ping1jing2 ping1jing2 self-assigned this Jan 16, 2026
@ping1jing2 ping1jing2 marked this pull request as draft January 16, 2026 07:50
@ping1jing2
Collaborator

please update the description

@xiaobaicxy
Contributor

please update the tests

mla_event = torch.npu.Event()
mla_event.record()
with torch.npu.stream(m.alt_stream):
    torch.npu.current_stream().wait_event(mla_event)

mla_event looks meaningless here: the main stream records the event and then alt_stream waits on it?

Contributor

add comment

k_nope = m.kv_a_layernorm(k_nope).unsqueeze(1)
torch.npu.current_stream().wait_event(q_event)
k_nope = m.kv_a_layernorm(k_nope)
if q_event is not None:

if m.alt_stream is not None:
    current_stream.wait_stream(m.alt_stream)

don't need the event / record

Contributor

add comment
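
The review suggestion to join with `wait_stream` instead of an explicit event can be illustrated with a toy model. This is a sketch under a stated assumption: that `wait_stream` covers all work submitted to the other stream up to the call, as documented for PyTorch's CUDA streams. `ToyStream` and `ToyEvent` are hypothetical classes that only track dependency markers; they are not the `torch.npu` API.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ToyStream:
    """Hypothetical stream model: a count of submitted ops plus the
    dependencies that later ops on this stream must respect."""
    name: str
    submitted: int = 0
    deps: List[Tuple[str, int]] = field(default_factory=list)

    def launch(self) -> None:
        # Submit one unit of work to this stream.
        self.submitted += 1

    def wait_event(self, event: "ToyEvent") -> None:
        # Later work here waits for everything the event captured.
        self.deps.append(event.marker)

    def wait_stream(self, other: "ToyStream") -> None:
        # Later work here waits for everything submitted to `other` so far.
        self.deps.append((other.name, other.submitted))


@dataclass
class ToyEvent:
    marker: Tuple[str, int] = ("", 0)

    def record(self, stream: ToyStream) -> None:
        # Capture how much work the stream had received at record time.
        self.marker = (stream.name, stream.submitted)


def join_with_event() -> List[Tuple[str, int]]:
    main, alt = ToyStream("main"), ToyStream("alt")
    alt.launch()                 # q-side work on the side stream
    q_event = ToyEvent()
    q_event.record(alt)
    main.wait_event(q_event)
    return main.deps


def join_with_wait_stream() -> List[Tuple[str, int]]:
    main, alt = ToyStream("main"), ToyStream("alt")
    alt.launch()
    main.wait_stream(alt)
    return main.deps
```

In this model both joins leave the main stream with the same single dependency on the side stream's work, which is why the shorter `wait_stream` form, guarded by the `m.alt_stream is not None` check, can stand in for the record-then-wait pair when the event is recorded right before the join.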

@JiaruiChang5268 JiaruiChang5268 marked this pull request as ready for review January 21, 2026 09:34
@iforgetmyname iforgetmyname self-assigned this Jan 22, 2026
@iforgetmyname
Collaborator

/tag-and-rerun-ci

@iforgetmyname iforgetmyname merged commit c0b5a18 into sgl-project:main Jan 23, 2026
172 of 184 checks passed
Johnsonms pushed a commit to Johnsonms/sglang that referenced this pull request Feb 14, 2026
Co-authored-by: Hexq0210 <893781835@qq.com>
Co-authored-by: liupeng374 <782420244@qq.com>
Co-authored-by: cy <chenyang08056032@163.com>
Todobe pushed a commit to Todobe/sgl-sglang that referenced this pull request Mar 3, 2026
Co-authored-by: Hexq0210 <893781835@qq.com>
Co-authored-by: liupeng374 <782420244@qq.com>
Co-authored-by: cy <chenyang08056032@163.com>
8 participants