[Feature] Support Deepseek-VL2 #2798
Conversation
@@ -0,0 +1,127 @@
from typing import List, Optional, Tuple, Union
rename the file to deepseek_vl2?
self.layers = modules
def forward(self, x):
I have not yet implemented the forward part of the DeepseekV2ForCausalLM. I will finish all the implementations and add the unit test this weekend.
@ccw1996 Do you need our help?
Has support for Deepseek-VL2 been implemented?
if config.projector_type == "downsample_mlp_gelu":
    mlp_depth = config.depth
    mlp_ratio = config.mlp_ratio
    modules = [nn.Linear(config.input_dim * config.downsample_ratio * config.downsample_ratio, config.n_embed * mlp_ratio)]
    for _ in range(1, mlp_depth - 1):
        modules.append(nn.GELU())
        modules.append(nn.Linear(config.n_embed * mlp_ratio, config.n_embed * mlp_ratio))
    modules.append(nn.GELU())
    modules.append(nn.Linear(config.n_embed * mlp_ratio, config.n_embed))
    modules = nn.Sequential(*modules)
@ccw1996 I'm happy to take the rest of the work to parallelize the remaining functions. Could you give me access to your branch?
@ccw1996 Apologies for the delay. Would you like me to help with the rest of it?
@ccw1996 I see. I think you can copy those layers from timm into python/sglang/srt/models/deepseekvl2.py, and then replace the layers with sgl classes. I'm interested in helping if you can give me access.
@yizhang2077 @ispobock Looks like we'll have to copy lots of code from timm. For now it's mostly just the linear layers with variable depth left to parallelize; will finish soon.
Sure, can you mark the problematic part?
if config.projector_type == "downsample_mlp_gelu":
    mlp_depth = config.depth
    mlp_ratio = config.mlp_ratio
    modules = [
        nn.Linear(
            config.input_dim
            * config.downsample_ratio
            * config.downsample_ratio,
            config.n_embed * mlp_ratio,
        )
    ]
    for _ in range(1, mlp_depth - 1):
        modules.append(nn.GELU())
        modules.append(
            nn.Linear(config.n_embed * mlp_ratio, config.n_embed * mlp_ratio)
        )
    modules.append(nn.GELU())
    modules.append(nn.Linear(config.n_embed * mlp_ratio, config.n_embed))
    modules = nn.Sequential(*modules)
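For reference, this projector stacks `depth` Linear layers with GELUs in between, where the first layer absorbs the `downsample_ratio**2` spatial merge. A dependency-free sketch of the resulting layer-size schedule (the concrete config values below are made up for illustration, not taken from the model):

```python
def projector_dims(input_dim, n_embed, depth, mlp_ratio, downsample_ratio):
    """Return the (in_features, out_features) pair of each Linear in the
    downsample_mlp_gelu projector, mirroring the construction loop above."""
    hidden = n_embed * mlp_ratio
    # first Linear absorbs the downsample_ratio**2 spatial merge
    dims = [(input_dim * downsample_ratio * downsample_ratio, hidden)]
    # depth - 2 hidden Linears, each preceded by a GELU
    for _ in range(1, depth - 1):
        dims.append((hidden, hidden))
    # final GELU + projection down to n_embed
    dims.append((hidden, n_embed))
    return dims

# hypothetical config values, for illustration only
dims = projector_dims(input_dim=1152, n_embed=2048, depth=2,
                      mlp_ratio=1, downsample_ratio=2)
```

Each layer's `out_features` matches the next layer's `in_features`, which is the invariant a tensor-parallel rewrite has to preserve shard-by-shard.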
Need to parallelize this part with Column- and Row-parallel linear
@yizhang2077 Actually with GELU we'll have to gather output for each TP linear. Should we use replicated linear instead?
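The underlying issue: an elementwise activation does not commute with the reduction a row-parallel linear defers, since gelu(a + b) != gelu(a) + gelu(b), so each rank's output must be gathered (or the layer replicated) before the GELU. A minimal numeric sketch of that non-commutativity, using the tanh approximation of GELU:

```python
import math

def gelu(x):
    # tanh approximation of GELU (same form torch uses with approximate="tanh")
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

# A row-parallel linear leaves each TP rank holding a *partial sum* of the
# output; the true output only exists after an all-reduce across ranks.
partial_rank0, partial_rank1 = 1.5, -2.0

correct = gelu(partial_rank0 + partial_rank1)      # reduce first, then activate
wrong = gelu(partial_rank0) + gelu(partial_rank1)  # activate partial sums: incorrect

# The two disagree, which is why a gather/all-reduce is needed before each GELU.
assert abs(correct - wrong) > 1e-3
```

Column-parallel linears split along output features instead, so a per-shard GELU is fine there; it is the row-parallel (or back-to-back sharded) case that forces the gather the comment is asking about.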
Two problems: first, the radix cache makes the input wrong; I will try to fix it. Second, the output does not seem to use the image embeddings. Can you help me debug it?
Let me try tomorrow.
logger.info(
    "Automatically turn off --chunked-prefill-size and disable radix cache for deepseek-vl2."
)
server_args.chunked_prefill_size = -1
server_args.disable_radix_cache = True
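The pattern in the diff above is a model-specific override of the server arguments at startup. A self-contained sketch of that pattern (the `ServerArgs` dataclass and `adjust_for_model` helper here are simplified stand-ins, not SGLang's real API):

```python
import logging
from dataclasses import dataclass

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

@dataclass
class ServerArgs:
    # simplified stand-in for sglang's ServerArgs, illustration only
    model_path: str
    chunked_prefill_size: int = 8192
    disable_radix_cache: bool = False

def adjust_for_model(server_args: ServerArgs) -> None:
    # Hypothetical helper mirroring the override in the diff above:
    # deepseek-vl2 feeds precomputed image embeddings to the language model,
    # so prefix reuse from the radix cache would replay wrong embeddings.
    if "deepseek-vl2" in server_args.model_path.lower():
        logger.info(
            "Automatically turn off --chunked-prefill-size "
            "and disable radix cache for deepseek-vl2."
        )
        server_args.chunked_prefill_size = -1
        server_args.disable_radix_cache = True

args = ServerArgs(model_path="deepseek-ai/deepseek-vl2-small")
adjust_for_model(args)
```

Non-matching model paths are left untouched, so the override only costs the models that need it.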
The language part still supports radix cache.
The language part relies on the input embeddings. If the radix cache is used, the input embeddings are wrong. I will try to debug it.
I see, I think you're right. Llava and qwen_vl also don't use radix attention.
Has this been done? @ccw1996 @yizhang2077 Basically, do not paste the code, but rather import it when needed.
Sorry, I missed this comment.
I initially thought it was unnecessary to add a dependency just for one model, since you'd also have to update CI 😂 but it's fine if you guys agree.
I think it's better to import it.
@zhaochenyang20 @yizhang2077 Done. Do I need to update the branch?
@yizhang2077 I cannot update this sheet.
You can paste the result here, and then I'll update it.
SGLang is 0.442, and HF is not supported.
@yizhang2077 @ccw1996 What shall we do next? I have the access and I can merge after yi's approval.
@zhaochenyang20 @yizhang2077 I resolved the merge conflict with gemma3. Now waiting for approval.
Sure!
@yizhang2077 @mickqian Could you help to give a quick overview?
@ccw1996 Fix lint, please.
LGTM
@ccw1996 Sure. I will run the CI for you; do not rebase any more, leave it to us.
@ccw1996 Please add pip install timm in
I've done it.
Motivation
Add the Deepseek-VL2 model to SGLang, as requested in #2653.
Modifications
Checklist