Checklist
Describe the bug
I tried to start sglang to serve the quantized model DeepSeek-R1-Distill-Qwen-32B-Int4-W4A16 using the docker image lmsysorg/sglang:v0.4.5-cu124. Startup failed with the log message "NameError: name 'VLLM_AVAILABLE' is not defined". After reading the code, I found that compressed_tensors.py never declares VLLM_AVAILABLE, while compressed_tensors_moe.py defines it with this snippet:
    try:
        import vllm
        VLLM_AVAILABLE = True
    except ImportError:
        VLLM_AVAILABLE = False
This error did not occur when using sglang:v0.4.4-cu124.
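As a possible workaround, the same guard could be declared at module level in compressed_tensors.py. The following is only a minimal sketch, assuming the NameError comes solely from the missing flag; the helper name `is_importable` is hypothetical and not part of sglang, and the actual upstream fix may look different.

```python
import importlib


def is_importable(module_name: str) -> bool:
    """Return True if the module can be imported, False on ImportError."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so a missing
        # package lands here as well.
        return False
    return True


# Hypothetical module-level flag, mirroring the one that
# compressed_tensors_moe.py already defines.
VLLM_AVAILABLE = is_importable("vllm")
```

With the flag defined this way, any later reference to VLLM_AVAILABLE in the module resolves to a boolean instead of raising NameError, whether or not vllm is installed.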
Reproduction
model
https://modelscope.cn/models/okwinds/DeepSeek-R1-Distill-Qwen-32B-Int4-W4A16
docker image
lmsysorg/sglang:v0.4.5-cu124
Environment
Ubuntu 22.04
CUDA 12.4 with driver version 530
GPU A800