
Merged sglang-main qkv z b a fused func and quant config #232

Merged
qichu-yun merged 1 commit into zejunchen-zejun:Qwen3.5_v0.5.9 from IzacharyI:Qwen3.5_v0.5.9
Mar 31, 2026



IzacharyI commented Mar 31, 2026

Motivation

Merge the qkv z b a fused function and quant config from sglang-main. Add a PTPC quant config and a fused BF16 4-GEMM path.
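
For readers unfamiliar with the term, here is a rough sketch of the scale layout, assuming PTPC means per-token activation scales and per-channel weight scales. The helper names are hypothetical and are not taken from this PR. The actual FP8 cast and rounding are omitted, so this only shows where each scale is applied.

```python
# Hypothetical helpers for illustration; not the PR's code.
# 448.0 is the largest finite value of the OCP E4M3 (e4m3fn) format;
# other FP8 variants use a different maximum.
FP8_E4M3_MAX = 448.0
EPS = 1e-12  # avoids division by zero for all-zero rows or columns


def per_token_scales(x):
    """One scale per activation row (token). x: [tokens][in_dim]."""
    return [max(max(abs(v) for v in row), EPS) / FP8_E4M3_MAX for row in x]


def per_channel_scales(w):
    """One scale per weight output channel. w: [in_dim][out_dim]."""
    return [max(max(abs(v) for v in col), EPS) / FP8_E4M3_MAX for col in zip(*w)]


def ptpc_matmul(x, w):
    """Scale, multiply, then rescale each output by (token scale * channel scale)."""
    sx, sw = per_token_scales(x), per_channel_scales(w)
    # A real kernel would cast these scaled values to FP8 here.
    xq = [[v / s for v in row] for row, s in zip(x, sx)]
    wq = [[v / s for v, s in zip(row, sw)] for row in w]
    acc = [[sum(a * b for a, b in zip(row, col)) for col in zip(*wq)] for row in xq]
    return [[acc[i][j] * sx[i] * sw[j] for j in range(len(sw))] for i in range(len(sx))]


x = [[1.0, -2.0], [0.5, 4.0]]
w = [[1.0, 3.0], [-2.0, 0.5]]
y = ptpc_matmul(x, w)
```

Because rounding is left out, `y` matches the unquantized product up to floating-point error. With a real FP8 cast the two would differ slightly, which is what the accuracy tests in this PR are meant to check.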

Modifications

Merged the qkv z b a fused function and quant config from sglang-main.
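
As a minimal sketch of the general idea behind fusing the four projections (qkv, z, b, a) into one GEMM, assuming they all read the same input: concatenate the weights along the output dimension, run one matmul, and split the result. The shapes, values, and function names below are hypothetical and do not come from the PR's implementation.

```python
# Hypothetical names and toy sizes; plain Python lists stand in for tensors.
def matmul(x, w):
    """x: [tokens][in_dim], w: [in_dim][out_dim] -> [tokens][out_dim]."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


def fuse_weights(weights):
    """Concatenate projection weights along the output dimension."""
    in_dim = len(weights[0])
    return [sum((w[k] for w in weights), []) for k in range(in_dim)]


def split_output(y, sizes):
    """Slice the fused output back into one block per projection."""
    outs, start = [], 0
    for size in sizes:
        outs.append([row[start:start + size] for row in y])
        start += size
    return outs


x = [[1, 2], [3, 4]]              # 2 tokens, in_dim = 2
w_qkv = [[1, 0, 1], [0, 1, 1]]    # out_dim = 3
w_z = [[2], [0]]                  # out_dim = 1
w_b = [[0], [1]]                  # out_dim = 1
w_a = [[1], [1]]                  # out_dim = 1
projections = [w_qkv, w_z, w_b, w_a]

# One matmul over the concatenated weight, then split by output width.
fused = fuse_weights(projections)
fused_outs = split_output(matmul(x, fused), [len(w[0]) for w in projections])

# Reference: four separate matmuls.
separate_outs = [matmul(x, w) for w in projections]
```

Both paths produce the same four outputs. The fused path issues one larger matmul instead of four small ones, which is the usual motivation for this kind of fusion.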

Accuracy Tests

PTPC FP8: [image]
BF16: [image]

Benchmarking and Profiling

Checklist

qichu-yun merged commit 2d5c465 into zejunchen-zejun:Qwen3.5_v0.5.9 Mar 31, 2026
1 check passed
qichu-yun pushed a commit that referenced this pull request Apr 3, 2026
- Cherry-pick PR sgl-project#21019: Fuse GDN split/reshape/cat ops with FP8/BF16 quant support
- Add BF16 qkv z b a fusion and PTPC quant config
qichu-yun added a commit that referenced this pull request Apr 3, 2026
qichu-yun added a commit that referenced this pull request Apr 3, 2026
qichu-yun pushed a commit that referenced this pull request Apr 3, 2026
- Cherry-pick PR sgl-project#21019 load weight func
- Add BF16 qkv z b a fusion and PTPC quant config
