
feat: DeepSeek new v3.2 encoding #14249

Merged
Fridge003 merged 9 commits into sgl-project:main from Eva20150932-atlascloud:v32_encoding
Dec 2, 2025

Conversation

@Eva20150932-atlascloud
Contributor

Motivation

#14227
DeepSeek officially released a new encoding function to replace the chat template, and I made a workable version (though it is hard-coded and breaks other models).

If you still use the old chat template with the formal v3.2 release, tool calling works badly, so we need the new encoding to run the new v3.2 model.

Modifications

Accuracy Tests

Start a server with python3 -m sglang.launch_server --model-path deepseek-ai/DeepSeek-V3.2 --trust-remote-code --tp-size 8 --host 0.0.0.0 --tool-call-parser deepseekv32 --enable-metrics --max-queued-requests 3 --max-running-requests 64 --cuda-graph-max-bs 64 --reasoning-parser deepseek-v3; my tool-calling tests passed.
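For reference, a minimal smoke test in the same spirit, assuming the server above is reachable at http://localhost:30000/v1 (SGLang's default port); the tool definition and prompt here are illustrative, not the actual test:

# Minimal tool-calling smoke test against the server launched above.
# Assumptions: localhost:30000 is the default SGLang port; the tool and
# prompt below are illustrative, not the PR author's actual test.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="none")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

resp = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3.2",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)

# With the new v3.2 encoding, the model should answer with a tool call
# rather than plain text.
print(resp.choices[0].message.tool_calls)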

Benchmarking and Profiling

Checklist

@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@Fridge003
Collaborator

@Eva20150932-atlascloud Can we stay compatible with the former template of DeepSeek-V3.2-Exp?

@Eva20150932-atlascloud
Contributor Author

I've tried the old v3.2 chat template, but the model doesn't pass my tool-call tests.

@Fridge003
Collaborator

I've tried the old v3.2 chat template, but the model doesn't pass my tool-call tests.

I mean, can we put the different chat templates in separate files and apply them to the different models (V3.2 / V3.2-Exp)?

@Johnsonms
Contributor

Verified it works

@Eva20150932-atlascloud
Contributor Author

@Fridge003 Possible, though it requires setting XML attributes, and I'm not experienced in building Jinja templates.

…=ChoiceDeltaToolCallFunction(arguments={}, name=None), type=function)] when streaming
…Added detection logic for using DPSK V3.2 encoding based on tokenizer configuration and architecture. Updated tests to validate encoding path and functionality. Adapted encoding_dsv32.py from Hugging Face repository.

Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
@JustinTong0323 JustinTong0323 changed the title "hard code hacking for DeepSeek new v3.2 encoding" to "feat: DeepSeek new v3.2 encoding" Dec 2, 2025
@Fridge003
Collaborator

/tag-and-rerun-ci

@github-actions github-actions Bot added the run-ci label Dec 2, 2025
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
@JustinTong0323
Collaborator

JustinTong0323 commented Dec 2, 2025

I believe this PR is ready. cc @Fridge003

Comment on lines +86 to +96

        self.use_dpsk_v32_encoding = self._use_dpsk_v32_encoding()

    def _use_dpsk_v32_encoding(self) -> bool:
        has_chat_template = (
            self.tokenizer_manager.tokenizer is not None
            and self.tokenizer_manager.tokenizer.chat_template is not None
        )
        architectures = self.tokenizer_manager.server_args.get_hf_config().architectures
        is_dpsk_v32 = "DeepseekV3" in architectures[0] if architectures else False
        return not has_chat_template and is_dpsk_v32

Contributor

@jimmy-evo jimmy-evo Dec 2, 2025

self.use_dpsk_v32_encoding = self.tokenizer_manager.server_args.tool_call_parser == "deepseekv32"

Just don't determine this from "architectures"; determine it from "tool_call_parser" instead.

Collaborator

@JustinTong0323 JustinTong0323 Dec 2, 2025

We should not; the tool_call_parser is not required in some cases, but this code path still matters.

Contributor

We could just add an environment variable, SGLANG_USE_DPSKV32_ENCODING=True; then there's no need to worry about how to determine this.
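A sketch of what that gate could look like (hypothetical; SGLANG_USE_DPSKV32_ENCODING is the commenter's proposed variable, not an existing SGLang flag):

import os

# Hypothetical escape hatch as proposed above, not an existing SGLang flag:
# force the v3.2 encoding on or off via an environment variable, and fall
# back to automatic detection when the variable is unset.
env_flag = os.environ.get("SGLANG_USE_DPSKV32_ENCODING")
if env_flag is not None:
    self.use_dpsk_v32_encoding = env_flag.lower() in ("1", "true")
else:
    self.use_dpsk_v32_encoding = self._use_dpsk_v32_encoding()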

Contributor

@JustinTong0323 I think this custom encoding is a temporary approach. It is effectively a kind of def apply_chat_template.

Collaborator

Not quite sure; do you mean we should not enable it by default? But this code is adapted from DeepSeek's HF repo, so I think it should be enabled by default.

Contributor

Not quite sure; do you mean we should not enable it by default? But this code is adapted from DeepSeek's HF repo, so I think it should be enabled by default.

Do you remember, back when huggingface transformers ~= 4.2x (2023/2024), open-source models usually provided a tokenizer.py with def apply_chat_template?

This encoding_dsv32.py is that def apply_chat_template.
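Schematically, the pattern being referenced (purely illustrative; the class name and prompt format here are hypothetical):

# Illustrative only: the old-style pattern where a model repo ships custom
# tokenizer code that renders the prompt in Python instead of via a Jinja
# chat_template. The class name and prompt format are hypothetical.
from transformers import PreTrainedTokenizerFast

class CustomChatTokenizer(PreTrainedTokenizerFast):
    def apply_chat_template(self, conversation, tokenize=True, **kwargs):
        # Build the prompt string in code rather than from a template.
        prompt = "".join(
            f"<|{turn['role']}|>{turn['content']}" for turn in conversation
        )
        return self.encode(prompt) if tokenize else prompt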

JustinTong0323 and others added 2 commits December 2, 2025 08:22
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>

# Check if invoke_content is empty or whitespace only
# If so, skip this tool call entirely (it's likely incomplete or malformed)
if not invoke_content.strip():

This will ignore parameterless functions, e.g.:

<|DSML|invoke name="get_current_time">

</|DSML|invoke>
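One possible fix, sketched here (hypothetical, not the merged code): emit the call with empty arguments instead of dropping the block.

# Hypothetical adjustment, not the merged code: treat a blank invoke body
# as a parameterless call rather than skipping the tool call entirely.
if not invoke_content.strip():
    arguments = "{}"  # e.g. get_current_time takes no parameters
else:
    arguments = parse_invoke_arguments(invoke_content)  # hypothetical helper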

Collaborator

Will fix that later.

@soaringk
Contributor

soaringk commented Dec 2, 2025

There is a parse_tool_calls function in encoding_dsv32.py. I reckon we should use that one to parse function calls?

@Fridge003 Fridge003 merged commit 7c38eca into sgl-project:main Dec 2, 2025
174 of 185 checks passed
harvenstar pushed a commit to harvenstar/sglang that referenced this pull request Dec 4, 2025
Co-authored-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
yingluosanqian pushed a commit to yingluosanqian/sglang that referenced this pull request Dec 4, 2025
Co-authored-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
tonyluj pushed a commit to openanolis/sglang that referenced this pull request Dec 5, 2025
Co-authored-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
@Muqi1029
Contributor

Muqi1029 commented Dec 8, 2025

Hi, may I ask why you use a while loop in parse_streaming_increment? @Eva20150932-atlascloud

@Eva20150932-atlascloud
Contributor Author

Do you mean we only need to prepare to parse one invoke block, since the model generates only one token per forward pass?

PR #11652, which adds MTP support for v3.2, makes generating more than one invoke block in a single step possible (though with very low probability). And in any case, I think the while loop is harmless, as it breaks once the invoke regex no longer matches. @Muqi1029
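For context, a minimal sketch of the pattern under discussion (the regex and helper are illustrative, not the PR's code): the while loop drains every complete invoke block from the buffer and exits as soon as the regex stops matching.

import re

# Illustrative sketch: with MTP, one decode step can append more than one
# complete invoke block to the buffer, so we drain every match instead of
# assuming at most one block per increment.
INVOKE_RE = re.compile(r"<\|DSML\|invoke.*?</\|DSML\|invoke>", re.DOTALL)

def drain_invokes(buffer: str):
    calls = []
    while True:
        match = INVOKE_RE.search(buffer)
        if match is None:  # exits as soon as no complete block remains
            break
        calls.append(match.group(0))
        buffer = buffer[match.end():]
    return calls, buffer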

yuchengz816-bot pushed a commit to yuchengz816-bot/sglang that referenced this pull request Dec 8, 2025
Co-authored-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
@Muqi1029
Contributor

Muqi1029 commented Dec 8, 2025

But even if MTP generates more than one token at once, the logic without the while loop can still handle that on the next increment, since the text is still in the buffer, right?

@Eva20150932-atlascloud
Contributor Author

That logic sounds good, and it makes me rethink things.

Could there be a case where the response doesn't have a 'next time'? For instance, what if the MTP forward pass generates the EOS token?

@Muqi1029
Contributor

Muqi1029 commented Dec 9, 2025

@Eva20150932-atlascloud Thanks for answering! I think you may be right; the while loop should be kept!

But I've run into another question: why do you use these markers here?

# Check if buffer contains any DSML markers or ends with potential tag prefix
# This handles partial/streaming DSML content
dsml_markers = ["|DSML|", "<|", "</|"]
potentially_dsml = any(marker in current_text for marker in dsml_markers)
# Also check if text ends with start of a tag (to handle "<" arriving separately)
dsml_prefixes = ["<", "<|", "</", "</|"]
ends_with_prefix = any(
    current_text.rstrip().endswith(prefix) for prefix in dsml_prefixes
)
if not has_tool_call and not potentially_dsml and not ends_with_prefix:
    self._buffer = ""
    for e_token in [self.eot_token, self.invoke_end_token]:
        if e_token in new_text:
            new_text = new_text.replace(e_token, "")
    return StreamingParseResult(normal_text=new_text)

I think the model outputs at the token level; you can use the following script to see the tokens:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-V3.2")

special_tokens = [
    "<|DSML|function_calls>",
    "</|DSML|function_calls>",
    "<|DSML|invoke",
    "</|DSML|invoke",
]

for token_str in special_tokens:
    print("\n\n")
    print(f" Processing {token_str} ".center(80, "-"))
    ids = tokenizer.encode(token_str, add_special_tokens=False)
    for token_id in ids:
        piece = tokenizer.decode(token_id)
        print(f"'{piece}' : {token_id}")

The output is as follows:


---------------------- Processing <|DSML|function_calls> -----------------------
'<' : 30
'|DSML|' : 128793
'function' : 8701
'_c' : 4941
'alls' : 12548
'>' : 32



---------------------- Processing </|DSML|function_calls> ----------------------
'</' : 1718
'|DSML|' : 128793
'function' : 8701
'_c' : 4941
'alls' : 12548
'>' : 32



--------------------------- Processing <|DSML|invoke ---------------------------
'<' : 30
'|DSML|' : 128793
'inv' : 40148
'oke' : 5406



-------------------------- Processing </|DSML|invoke ---------------------------
'</' : 1718
'|DSML|' : 128793
'inv' : 40148
'oke' : 5406

So <| will never be generated as a single standalone token, right?

@jxz542189

@Eva20150932-atlascloud Can you ensure that the function calls are output in the expected streaming manner? #14711

@Muqi1029
Contributor

Muqi1029 commented Dec 9, 2025

@Eva20150932-atlascloud Can you ensure that the function calls are output in the expected streaming manner? #14711

@jxz542189
It indeed doesn't support streaming output. I'm working on this and plan to submit a PR tonight.

whybeyoung added a commit to whybeyoung/sglang that referenced this pull request Jan 5, 2026
@whybeyoung
Collaborator

When using smg and gRPC mode, I think it should do something similar to this PR. @slin1237
