[Tracking] Model support
The goal is to expand the model zoo 🎊 by implementing support for every architecture listed below. Anyone is welcome to pick up one of the linked issues or implement any model on the list.
If you need help implementing a new model, see https://docs.sglang.ai/supported_models/support_new_models.html
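Entries in this issue are keyed by architecture class name, which is what a Hugging Face checkpoint declares in the `architectures` field of its `config.json`. As a minimal sketch for checking whether a checkpoint's architecture is tracked here (the helper name, the sample config string, and the `tracked` subset are hypothetical and only for illustration; this is not part of SGLang):

```python
import json


def declared_architectures(config_text: str) -> list[str]:
    """Return the architecture class names declared in a config.json string."""
    config = json.loads(config_text)
    # "architectures" may be absent or null in some configs; treat both as empty.
    return list(config.get("architectures") or [])


# Hypothetical excerpt of a checkpoint's config.json, for illustration only.
example_config = '{"architectures": ["OPTForCausalLM"], "model_type": "opt"}'

# Hypothetical subset of the names tracked in this issue.
tracked = {"OPTForCausalLM", "BloomForCausalLM", "GPT2LMHeadModel"}

for name in declared_architectures(example_config):
    status = "tracked" if name in tracked else "not tracked"
    print(f"{name}: {status}")
```

If the name is not tracked, it may already be supported, or it may be worth opening a new issue and linking it here.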
Text-only Language Models (Generative)
- OPTForCausalLM (facebook/opt-125m): OPTForCasualLM Support (facebook/opt Series) #7440
- AquilaForCausalLM (Aquila, Aquila2)
- ArcticForCausalLM (Arctic): Followup to fix the arctic model #5768
- BambaForCausalLM (Bamba)
- BartForConditionalGeneration (BART)
- BloomForCausalLM (BLOOM, BLOOMZ)
- Cohere2ForCausalLM: [Feature] Support Cohere Command-A (Cohere2ForCausalLM arch) #4570
- DeciLMForCausalLM (DeciLM)
- FalconForCausalLM (Falcon)
- FalconH1ForCausalLM (Falcon-H1): [Feature] Model FalconH1 #6517
- FalconMambaForCausalLM (FalconMamba)
- Dots1ForCausalLM (dots.llm1): [New model]: support dots1 model #6471
- GPT2LMHeadModel (GPT-2)
- GPTBigCodeForCausalLM (StarCoder, SantaCoder)
- GPTJForCausalLM (GPT-J)
- GPTNeoXForCausalLM (GPT-NeoX, Pythia)
- GraniteForCausalLM (Granite 3.0, 3.1)
- GraniteMoeForCausalLM (Granite 3.0 MoE)
- GraniteMoeHybridForCausalLM (Granite 4.0 MoE Hybrid)
- GraniteMoeSharedForCausalLM (Granite MoE Shared)
- GritLM (GritLM)
- InternLMForCausalLM (InternLM v1)
- JAISLMHeadModel (Jais)
- JambaForCausalLM (Jamba): [Feature] Jamba 1.5 Support PLS #1190
- MambaForCausalLM (Mamba)
- Mamba2ForCausalLM (Mamba2)
- MiniCPMForCausalLM (MiniCPM v1): model: minicpm-4 main model #6900
- MiniMaxM1ForCausalLM (MiniMax-Text): [Feature] support MiniMax #2898
- MiniMaxText01ForCausalLM (MiniMax-Text-01)
- MPTForCausalLM (MPT)
- NemotronForCausalLM (Nemotron-3): model: support nvidia/Llama-3_3-Nemotron-Super-49B-v1 #9067
- NemotronHForCausalLM (Nemotron-H)
- OLMoForCausalLM (OLMo v1)
- OLMo2ForCausalLM (OLMo2)
- OPTForCausalLM (OPT)
- OrionForCausalLM (Orion)
- PersimmonForCausalLM (Persimmon)
- PhiForCausalLM (Phi-1.5, Phi-2): Support for Phi-1.5 & Phi-2 models #7862 @ppraneth
- Phi3SmallForCausalLM (Phi-3-Small)
- PhiMoEForCausalLM (Phi-3.5-MoE): Feat: Support Phi-3.5-MoE in SGLang #7907 @byjiang1996
- Plamo2ForCausalLM (PLaMo2)
- SolarForCausalLM (Solar Pro)
- Starcoder2ForCausalLM (Starcoder2)
- TeleChat2ForCausalLM (TeleChat2)
- TeleFLMForCausalLM (TeleFLM)
- Zamba2ForCausalLM (Zamba2)

Embedding Models
- GteModel
- GteNewModel
- ModernBertModel
- NomicBertModel
- RobertaModel
- JambaForSequenceClassification
- BertForSequenceClassification
- Qwen3ForSequenceClassification: support Qwen3ForSequenceClassification #7314
- RobertaForSequenceClassification
- XLMRobertaForSequenceClassification

Multimodal Models
- Glm4vForConditionalGeneration (THUDM/GLM-4.1V-9B-Thinking)
- AriaForConditionalGeneration (Aria)
- AyaVisionForConditionalGeneration (Aya Vision): [Model] Cohere Aya Vision #6304
- Blip2ForConditionalGeneration (BLIP-2): [Feature] Can you support the VLA series models? For example, openVLA. #4414
- ChameleonForConditionalGeneration (Chameleon)
- Florence2ForConditionalGeneration (Florence-2)
- FuyuForCausalLM (Fuyu)
- GLM4VForCausalLM PP support: [Bug] GLM-4-32B-0414 pp supports #7257
- GraniteSpeechForConditionalGeneration (Granite Speech)
- H2OVLChatModel (H2OVL)
- Idefics3ForConditionalGeneration (Idefics3)
- LlavaNextVideoForConditionalGeneration (LLaVA-NeXT-Video): [Bug] granite-vision-3.2-2b failing on sglang with "LlavaNextForConditionalGeneration not supported" #4062
- MiniMaxVL01ForConditionalGeneration (MiniMax-VL)
- MolmoForCausalLM (Molmo)
- NVLM_D_Model (NVLM-D 1.0)
- Ovis (Ovis1.6, Ovis2): [Feature] Ovis2 surport #5018
- PaliGemmaForConditionalGeneration (PaliGemma)
- Phi3VForCausalLM (Phi-3-Vision): [Feature] Do we have any plan for supporting Phi3V? #1108
- PixtralForConditionalGeneration (Pixtral)
- Qwen2AudioForConditionalGeneration (Qwen2-Audio)
- Qwen2_5OmniThinkerForConditionalGeneration (Qwen2.5-Omni): model: qwen2.5 omni (thinker only) #4969
- SkyworkR1VChatModel (Skywork-R1V): [Feature] Skywork-R1V support #4692
- SmolVLMForConditionalGeneration (SmolVLM2)
- TarsierForConditionalGeneration (Tarsier)
- Tarsier2ForConditionalGeneration (Tarsier2)

Related Issues & PRs