[Docs] Add quantization docs by Edenzzzz · Pull Request #3410 · sgl-project/sglang

Edenzzzz · 2025-02-08T23:26:28Z

Motivation

Re-opens #3253 with reviews addressed.

Modifications

Checklist

- Format your code according to the Code Formatting with Pre-Commit.
- Add unit tests as outlined in the Running Unit Tests.
- Update documentation / docstrings / example tutorials as needed, according to Writing Documentation.
- Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to Benchmark and Profiling and Accuracy Results.
For reviewers: If you haven't made any contributions to this PR and are only assisting with merging the main branch, please remove yourself as a co-author when merging the PR.

Edenzzzz · 2025-02-08T23:26:54Z

cc @zhaochenyang20

Edenzzzz · 2025-02-09T01:17:20Z

+    --port 30000 --host 0.0.0.0
+```
+
+Our team is working on supporting more online quantization methods. We will soon support methods including but not limited to `["awq", "gptq", "marlin", "gptq_marlin", "awq_marlin", "bitsandbytes", "gguf"]`


I think this means online quantization? Loading offline awq weights is already supported.
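
To make the distinction in the comment above concrete: offline quantization loads weights that were already quantized and saved ahead of time (for example, an AWQ checkpoint), while online quantization converts full-precision weights as the model is loaded. Below is a minimal, library-free sketch of that online step using symmetric per-tensor int8 rounding. The function names and sample values are hypothetical, chosen only for illustration, and this is not SGLang's implementation.

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization (illustrative helper).

    Returns the integer codes and the single scale shared by the tensor.
    """
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude to 127; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate floating-point values from the int8 codes."""
    return [c * scale for c in codes]


# "Online" here means this conversion runs when the weights are loaded,
# rather than being read from a pre-quantized checkpoint.
weights = [0.5, -1.0, 0.25, 0.0]  # hypothetical full-precision weights
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
```

With round-to-nearest, each restored value differs from its original by at most half of `scale`, which is the accuracy cost that online quantization trades for lower memory use.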

zhaochenyang20 · 2025-02-09T07:26:07Z

Thanks. I will give credit to you, james and fan.

zhaochenyang20

We should move it to reference and add it in index.rst.

Edenzzzz · 2025-02-09T18:11:15Z

@zhaochenyang20 Added

zhyncs · 2025-02-09T18:22:13Z

FYI: in the upcoming release, we will default to using sgl-kernel's W8A8 Int8 and FP8 instead of vLLM's W8A8. We have achieved the best performance across sm80, sm89, and sm90.

zhaochenyang20 · 2025-02-09T18:29:41Z

Great. Wait, we need to change this a bit

Co-authored-by: yinfan98 <1106310035@qq.com>

FlamingoPg and others added 18 commits February 1, 2025 22:17

Create quantization.md

29b580f

Create quantization.ipynb

0fc00f8

Update quantization.ipynb

24ee864

Update quantization.ipynb

89c390b

Update quantization.ipynb

1412ec9

Update quantization.ipynb

9777806

Update quantization.ipynb

020ef7a

Update quantization.ipynb

b2a713a

Update quantization.ipynb

2a8c33b

Update quantization.ipynb

1e771e7

Update quantization.ipynb

e55d75e

Update quantization.ipynb

3cc21e4

Update quantization.ipynb

1245321

Update quantization.ipynb

1fe89de

Update quantization.ipynb

e4f3253

Merge branch 'main' into quantize-docs

7bd7d2a

Cleanup docs

53e02d5

Merge branch 'main' into quantization_docs

304f589

Edenzzzz commented Feb 9, 2025

Merge branch 'main' into quantization_docs

7c63f7a

Edenzzzz added 2 commits February 9, 2025 07:51

Merge branch 'main' into quantization_docs

c5b19bb

Merge branch 'main' into quantization_docs

35a94b9

zhaochenyang20 reviewed Feb 9, 2025

add to index.rst:

cae3b27

Merge branch 'main' into quantization_docs

d26365b

zhyncs reviewed Feb 9, 2025

Comment thread docs/references/quantization.md Outdated

zhyncs reviewed Feb 9, 2025

Comment thread docs/references/quantization.md Outdated

zhyncs added 2 commits February 10, 2025 02:14

upd

26b51bb

upd

708bdc9

zhyncs reviewed Feb 9, 2025

Comment thread docs/references/quantization.md Outdated

Comment thread docs/references/quantization.md Outdated

zhyncs added 2 commits February 10, 2025 02:15

upd

d315399

upd

24167bd

zhyncs approved these changes Feb 9, 2025

zhyncs merged commit 0af1d23 into sgl-project:main Feb 9, 2025

Edenzzzz deleted the quantization_docs branch February 9, 2025 18:19

This was referenced Feb 21, 2025

[Docs] add quantization docs #3253

Closed

[Docs] add quantization docs #2572

Closed

timethink pushed a commit to timethink/sglang that referenced this pull request Mar 9, 2025

[Docs] Add quantization docs (sgl-project#3410)

659d3c9

Co-authored-by: yinfan98 <1106310035@qq.com>

b8zhong mentioned this pull request May 24, 2025

[Feature] Add Docs For Quantization #2531

Closed

0826joyce pushed a commit to 0826joyce/sglang-perf-opt that referenced this pull request May 19, 2026

[Docs] Add quantization docs (sgl-project#3410)

2cf4fa4

Co-authored-by: yinfan98 <1106310035@qq.com>


zhyncs merged 27 commits into sgl-project:main from Edenzzzz:quantization_docs


4 participants
