Conversation
Summary of Changes

Hello @Fridge003, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the documentation for the DeepSeek-V3.2 model family, providing clearer guidance on its various versions, their respective tool parsing configurations, and detailed instructions for launching them with SGLang. It also expands the accuracy testing section with new results and setup procedures for different DeepSeek-V3.2 variants.
Code Review
This pull request updates the documentation for DeepSeek-V3.2, including its variants. The changes are generally good and add valuable information. I've identified a few areas for improvement, including an incorrect model path in a command, some typos and grammatical errors, a broken markdown link, and some confusing or inconsistent formatting. My review includes specific suggestions to address these points and enhance the clarity and accuracy of the documentation.
To launch `DeepSeek-V3.2-Exp` with function calling and reasoning parser:

```bash
python3 -m sglang.launch_server \
    --model-path deepseek-ai/DeepSeek-V3.2 \
```
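The review summary flags the model path in this command as incorrect for a `DeepSeek-V3.2-Exp` launch. A corrected invocation might look like the sketch below; the `--tp` value and the parser flag values are assumptions about the deployment and SGLang's CLI, not taken from the PR:

```shell
# Sketch of a corrected launch command (flag values are assumptions):
python3 -m sglang.launch_server \
    --model-path deepseek-ai/DeepSeek-V3.2-Exp \
    --tp 8 \
    --tool-call-parser deepseekv31 \
    --reasoning-parser deepseek-r1 \
    --chat-template ./examples/chat_template/tool_chat_template_deepseekv31.jinja
```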
| `deepseekv31` | DeepSeek-V3.1 and DeepSeek-V3.2-Exp (e.g. `deepseek-ai/DeepSeek-V3.1`, `deepseek-ai/DeepSeek-V3.2-Exp`) | Recommend adding `--chat-template ./examples/chat_template/tool_chat_template_deepseekv31.jinja` (Or ..deepseekv32.jinja for DeepSeek-V3.2) to launch command. |
| `deepseekv32` | DeepSeek-V3.2 (`deepseek-ai/DeepSeek-V3.2`) | |
With the addition of the deepseekv32 parser, the note for deepseekv31 has become confusing. It's better to simplify the note for deepseekv31 to only refer to its corresponding chat template and add a similar note for the new deepseekv32 parser.
Suggested change:

```diff
-| `deepseekv31` | DeepSeek-V3.1 and DeepSeek-V3.2-Exp (e.g. `deepseek-ai/DeepSeek-V3.1`, `deepseek-ai/DeepSeek-V3.2-Exp`) | Recommend adding `--chat-template ./examples/chat_template/tool_chat_template_deepseekv31.jinja` (Or ..deepseekv32.jinja for DeepSeek-V3.2) to launch command. |
-| `deepseekv32` | DeepSeek-V3.2 (`deepseek-ai/DeepSeek-V3.2`) | |
+| `deepseekv31` | DeepSeek-V3.1 and DeepSeek-V3.2-Exp (e.g. `deepseek-ai/DeepSeek-V3.1`, `deepseek-ai/DeepSeek-V3.2-Exp`) | Recommend adding `--chat-template ./examples/chat_template/tool_chat_template_deepseekv31.jinja` to launch command. |
+| `deepseekv32` | DeepSeek-V3.2 (`deepseek-ai/DeepSeek-V3.2`) | Recommend adding `--chat-template ./examples/chat_template/tool_chat_template_deepseekv32.jinja` to launch command. |
```
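With rows like these, a DeepSeek-V3.2 launch would pass the v3.2 template explicitly. A sketch, assuming SGLang's `--tool-call-parser` flag accepts the `deepseekv32` value described in the table:

```shell
# Sketch: launch DeepSeek-V3.2 with the v3.2 tool-call parser
# and its recommended chat template (paths from the table above)
python3 -m sglang.launch_server \
    --model-path deepseek-ai/DeepSeek-V3.2 \
    --tool-call-parser deepseekv32 \
    --chat-template ./examples/chat_template/tool_chat_template_deepseekv32.jinja
```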
```diff
 # DeepSeek V3.2 Usage

-[DeepSeek-V3.2-Exp](https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Exp) equips DeepSeek-V3.1-Terminus with DeepSeek Sparse Attention (DSA) through continued training. With DSA, a fine-grained sparse attention mechanism powered by a lightning indexer, DeepSeek-V3.2 achieves efficiency improvements in long-context scenarios.
+DeepSeek-V3.2 model families equips DeepSeek-V3.1-Terminus with DeepSeek Sparse Attention (DSA) through continued training. With DSA, a fine-grained sparse attention mechanism powered by a lightning indexer, DeepSeek-V3.2 achieves efficiency improvements in long-context scenarios.
```
There is a grammatical error here. Since "model families" is plural, the verb should be "equip", not "equips".
```diff
-DeepSeek-V3.2 model families equips DeepSeek-V3.1-Terminus with DeepSeek Sparse Attention (DSA) through continued training. With DSA, a fine-grained sparse attention mechanism powered by a lightning indexer, DeepSeek-V3.2 achieves efficiency improvements in long-context scenarios.
+DeepSeek-V3.2 model families equip DeepSeek-V3.1-Terminus with DeepSeek Sparse Attention (DSA) through continued training. With DSA, a fine-grained sparse attention mechanism powered by a lightning indexer, DeepSeek-V3.2 achieves efficiency improvements in long-context scenarios.
```
```
pip install git+https://github.com/NVIDIA/NeMo-Skills.git --ignore-installed blinker
```

Nemo Skill can't enable thinking method from client side, so we need some hardcoding before launching server:
There's a typo here. For consistency with the library name, "Nemo Skill" should be "NeMo-Skills".
```diff
-Nemo Skill can't enable thinking method from client side, so we need some hardcoding before launching server:
+NeMo-Skills can't enable thinking method from client side, so we need some hardcoding before launching server:
```
Run the following script to evaluate AIME 2025:

**For `DeepSeek-V3.2` and `DeepSeek-V3.2-Speciale`**:

Hardcode the thinking mode to be `thinking` in (`_apply_jinja_template`)[https://github.com/sgl-project/sglang/blob/7c38eca1e4a704bf09fe6b52ea040a41d3cfc55d/python/sglang/srt/entrypoints/openai/serving_chat.py#L286`], then launch the server as usual:
The Markdown link syntax is incorrect, which breaks the link. It should be [text](url). Also, it's a good practice to link to the main branch instead of a specific commit hash to prevent the link from becoming outdated.
```diff
-Hardcode the thinking mode to be `thinking` in (`_apply_jinja_template`)[https://github.com/sgl-project/sglang/blob/7c38eca1e4a704bf09fe6b52ea040a41d3cfc55d/python/sglang/srt/entrypoints/openai/serving_chat.py#L286`], then launch the server as usual:
+Hardcode the thinking mode to be `thinking` in [`_apply_jinja_template`](https://github.com/sgl-project/sglang/blob/main/python/sglang/srt/entrypoints/openai/serving_chat.py#L286), then launch the server as usual:
```
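What that hardcode amounts to can be sketched in isolation. The snippet below is a hypothetical simplification, not SGLang's actual `_apply_jinja_template` body; the kwarg name `thinking_mode` is an assumption mirroring the wording of the doc:

```python
def apply_jinja_template(messages, chat_template_kwargs=None):
    """Hypothetical sketch of forcing thinking mode on the server
    side, since NeMo-Skills cannot request it from the client."""
    kwargs = dict(chat_template_kwargs or {})
    # Hardcoded override (assumed kwarg name): always render the
    # chat template in thinking mode, regardless of the request.
    kwargs["thinking_mode"] = "thinking"
    return {"messages": messages, **kwargs}
```

Note that such an override forces thinking mode for every request, so it should be reverted once the evaluation run is finished.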
Test results:

DeepSeek-V3.2-Exp:
Motivation
Following #14249
Modifications
Accuracy Tests
Benchmarking and Profiling
Checklist