Checklist
Describe the bug
The error occurs because SGLang passes slots=True to the @dataclass decorator in python/sglang/srt/lora/lora_registry.py at line 22:
@dataclass(frozen=True, slots=True)
class LoRARef:
However, the slots parameter for @dataclass was only introduced in Python 3.10, so importing this module fails on Python 3.9:
(sglang-py39) [nadavt@hgn37 sglang]$ python -m sglang.bench_one_batch --model-path meta-llama/Meta-Llama-3.1-8B-Instruct --batch 32 --input-len 256 --output-len 32
Traceback (most recent call last):
File "/home/projects/dharel/nadavt/.conda/envs/sglang-py39/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/projects/dharel/nadavt/.conda/envs/sglang-py39/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/bench_one_batch.py", line 59, in <module>
from sglang.srt.configs.model_config import ModelConfig
File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/configs/model_config.py", line 31, in <module>
from sglang.srt.layers.quantization import QUANTIZATION_METHODS
File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/layers/quantization/__init__.py", line 44, in <module>
from sglang.srt.layers.quantization.awq import AWQConfig, AWQMarlinConfig
File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/layers/quantization/awq.py", line 18, in <module>
from sglang.srt.layers.quantization.marlin_utils import (
File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/layers/quantization/marlin_utils.py", line 23, in <module>
from sglang.srt.layers.quantization.utils import pack_cols, unpack_cols
File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/layers/quantization/utils.py", line 13, in <module>
from sglang.srt.layers.quantization.fp8_kernel import scaled_fp8_quant
File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/layers/quantization/fp8_kernel.py", line 26, in <module>
from sglang.srt.layers.quantization import deep_gemm_wrapper
File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/layers/quantization/deep_gemm_wrapper/__init__.py", line 1, in <module>
from .entrypoint import *
File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/layers/quantization/deep_gemm_wrapper/entrypoint.py", line 7, in <module>
from sglang.srt.layers.quantization.deep_gemm_wrapper import compile_utils
File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/layers/quantization/deep_gemm_wrapper/compile_utils.py", line 14, in <module>
from sglang.srt.server_args import ServerArgs
File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/server_args.py", line 26, in <module>
from sglang.srt.lora.lora_registry import LoRARef
File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/lora/lora_registry.py", line 22, in <module>
@dataclass(frozen=True, slots=True)
TypeError: dataclass() got an unexpected keyword argument 'slots'
and similarly:
(sglang-py39) [nadavt@hgn37 sglang]$ python3 -m sglang.bench_offline_throughput --model-path meta-llama/Meta-Llama-3.1-8B-Instruct --num-prompts 10
Traceback (most recent call last):
File "/home/projects/dharel/nadavt/.conda/envs/sglang-py39/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/projects/dharel/nadavt/.conda/envs/sglang-py39/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/bench_offline_throughput.py", line 34, in <module>
from sglang.srt.entrypoints.engine import Engine
File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/entrypoints/engine.py", line 41, in <module>
from sglang.srt.managers.data_parallel_controller import (
File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/managers/data_parallel_controller.py", line 28, in <module>
from sglang.srt.managers.io_struct import (
File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/managers/io_struct.py", line 25, in <module>
from sglang.srt.lora.lora_registry import LoRARef
File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/lora/lora_registry.py", line 22, in <module>
@dataclass(frozen=True, slots=True)
TypeError: dataclass() got an unexpected keyword argument 'slots'
Suggested fix: update the minimum Python requirement to 3.10+ in pyproject.toml if slots=True is essential.
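If slots=True turns out not to be essential, one possible alternative is to pass it only on interpreters that accept it. The snippet below is a minimal sketch of that idea, not the actual SGLang code: the class name LoRARefSketch, its fields name and path, and the _DATACLASS_KWARGS helper are all hypothetical and exist only for illustration.

```python
import sys
from dataclasses import dataclass

# Always freeze the dataclass; request __slots__ only where the
# interpreter supports it (the `slots` parameter exists from Python 3.10).
_DATACLASS_KWARGS = {"frozen": True}
if sys.version_info >= (3, 10):
    _DATACLASS_KWARGS["slots"] = True


@dataclass(**_DATACLASS_KWARGS)
class LoRARefSketch:
    # Hypothetical fields; the real LoRARef definition may differ.
    name: str
    path: str


ref = LoRARefSketch(name="adapter", path="/tmp/adapter")
print(ref)
```

On Python 3.9 this falls back to a regular frozen dataclass, so the memory savings from slots are lost there but the import no longer fails.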
Reproduction
Run either of the following commands on Python 3.9 (both copied from https://docs.sglang.ai/references/benchmark_and_profiling.html#benchmark):
python -m sglang.bench_one_batch --model-path meta-llama/Meta-Llama-3.1-8B-Instruct --batch 32 --input-len 256 --output-len 32
python3 -m sglang.bench_offline_throughput --model-path meta-llama/Meta-Llama-3.1-8B-Instruct --num-prompts 10
Environment
python3 -m sglang.check_env prints:
Python: 3.9.23 | packaged by conda-forge | (main, Jun 4 2025, 17:57:12) [GCC 13.3.0]
CUDA available: True
GPU 0: NVIDIA A40
GPU 0 Compute Capability: 8.6
CUDA_HOME: None
PyTorch: 2.7.1+cu126
sglang: 0.4.9.post4
sgl_kernel: 0.2.7
flashinfer_python: 0.2.9rc1
triton: 3.3.1
transformers: 4.53.2
torchao: 0.9.0
numpy: 2.0.2
aiohttp: 3.12.14
fastapi: 0.116.1
hf_transfer: 0.1.9
huggingface_hub: 0.34.1
interegular: 0.3.3
modelscope: 1.28.1
orjson: 3.11.1
outlines: 0.1.11
packaging: 25.0
psutil: 7.0.0
pydantic: 2.11.7
python-multipart: 0.0.20
pyzmq: 27.0.0
uvicorn: 0.35.0
uvloop: 0.21.0
vllm: Module Not Found
xgrammar: 0.1.21
openai: 1.97.1
tiktoken: 0.9.0
anthropic: 0.59.0
litellm: 1.74.8
decord: 0.6.0
NVIDIA Topology:
        GPU0    NIC0    CPU Affinity    NUMA Affinity   GPU NUMA ID
GPU0     X      SYS     24-47           1               N/A
NIC0    SYS      X
Legend:
X = Self
SYS = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
PHB = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
PXB = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
PIX = Connection traversing at most a single PCIe bridge
NV# = Connection traversing a bonded set of # NVLinks
NIC Legend:
NIC0: mlx5_0
ulimit soft: 66000