[AMD] fix amd ci dpskv32 #17432

Merged
HaiShaw merged 9 commits into sgl-project:main from yctseng0211:fix_dpsk_0120
Jan 22, 2026

Collaborator

@yctseng0211 yctseng0211 commented Jan 20, 2026

Motivation

Fix the runtime error from PR-17205:
https://github.com/sgl-project/sglang/actions/runs/21157007917/job/60858903195?pr=17205#step:6:17583

  File "/sglang-checkout/python/sglang/srt/layers/attention/nsa/nsa_indexer.py", line 1008, in forward_cuda
    weights = self._get_logits_head_gate(x_for_gate, q_scale)
  File "/sglang-checkout/python/sglang/srt/layers/attention/nsa/nsa_indexer.py", line 230, in _get_logits_head_gate
    weights, _ = self.weights_proj(x)
  File "/opt/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
  File "/sglang-checkout/python/sglang/srt/layers/linear.py", line 260, in forward
    output = self.quant_method.apply(self, x, bias)
  File "/sglang-checkout/python/sglang/srt/layers/quantization/unquant.py", line 143, in apply
    return F.linear(x, layer.weight, bias)
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: float != c10::BFloat16
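
The traceback above shows `F.linear` receiving a float32 input (`mat1`) while the layer weight (`mat2`) is bfloat16. The sketch below reproduces that class of error in isolation and shows one common way to avoid it, by casting the activation to the weight's dtype before the projection. It is only an illustration: `linear_matching_weight_dtype` is a hypothetical helper, the tensor shapes are made up, and it is not necessarily the change this PR makes in `nsa_indexer.py`.

```python
import torch
import torch.nn.functional as F


def linear_matching_weight_dtype(x, weight, bias=None):
    """Hypothetical helper: cast the activation to the weight's dtype, then project."""
    if x.dtype != weight.dtype:
        x = x.to(weight.dtype)
    return F.linear(x, weight, bias)


# Made-up shapes; only the dtypes mirror the traceback.
weight = torch.randn(4, 8).to(torch.bfloat16)  # bf16 weight
x = torch.randn(2, 8, dtype=torch.float32)     # fp32 activation

# Calling F.linear directly with mixed dtypes raises a RuntimeError.
try:
    F.linear(x, weight)
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True

# Casting first makes both operands bf16, so the projection succeeds.
out = linear_matching_weight_dtype(x, weight)
```

Casting the activation down to bfloat16 loses precision relative to float32; whether that is acceptable for the gate weights, or whether the weight should be upcast instead, depends on the model path and is not something this sketch decides.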

Modifications

Accuracy Tests

image

Benchmarking and Profiling

Checklist

Review Process

  1. Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
  4. After green CI and required approvals, ask Merge Oncalls to merge.

…istration times to 5400 for DeepSeek V3.2 tests.
@michaelzhang-ai
Collaborator

https://github.com/sgl-project/sglang/actions/runs/21209864077?pr=17432 passed, and the PR is ready to merge. It fixes the current dpv32 issue and should largely improve queue time on mi35x. @HaiShaw cc: @yctseng0211

Collaborator

@hubertlu-tw hubertlu-tw left a comment

LGTM
Thanks for the fix.

Comment thread python/sglang/srt/layers/attention/nsa/nsa_indexer.py Outdated
Comment thread python/sglang/srt/layers/attention/nsa/nsa_indexer.py Outdated
@HaiShaw HaiShaw merged commit 17807ca into sgl-project:main Jan 22, 2026
33 of 74 checks passed
Collaborator

HaiShaw commented Jan 22, 2026

This only changes the AMD path.

Johnsonms pushed a commit to Johnsonms/sglang that referenced this pull request Feb 14, 2026
Co-authored-by: michaelzhang-ai <michaelzhang.ai@users.noreply.github.com>
4 participants