[Bug] Python 3.9 compatibility broken due to dataclass(slots=True) usage #8389

@keyboardAnt

Description

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. The bug has not been fixed in the latest version.
  • 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
  • 4. If the issue you raised is not a bug but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose. Otherwise, it will be closed.
  • 5. Please use English, otherwise it will be closed.

Describe the bug

SGLang fails to import on Python 3.9 because the @dataclass decorator in python/sglang/srt/lora/lora_registry.py (line 22) is called with slots=True:

@dataclass(frozen=True, slots=True)
class LoRARef:

However, the slots parameter of @dataclass was only introduced in Python 3.10, so on Python 3.9 the decorator raises a TypeError at import time:

(sglang-py39) [nadavt@hgn37 sglang]$ python -m sglang.bench_one_batch --model-path meta-llama/Meta-Llama-3.1-8B-Instruct --batch 32 --input-len 256 --output-len 32
Traceback (most recent call last):
  File "/home/projects/dharel/nadavt/.conda/envs/sglang-py39/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/projects/dharel/nadavt/.conda/envs/sglang-py39/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/bench_one_batch.py", line 59, in <module>
    from sglang.srt.configs.model_config import ModelConfig
  File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/configs/model_config.py", line 31, in <module>
    from sglang.srt.layers.quantization import QUANTIZATION_METHODS
  File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/layers/quantization/__init__.py", line 44, in <module>
    from sglang.srt.layers.quantization.awq import AWQConfig, AWQMarlinConfig
  File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/layers/quantization/awq.py", line 18, in <module>
    from sglang.srt.layers.quantization.marlin_utils import (
  File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/layers/quantization/marlin_utils.py", line 23, in <module>
    from sglang.srt.layers.quantization.utils import pack_cols, unpack_cols
  File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/layers/quantization/utils.py", line 13, in <module>
    from sglang.srt.layers.quantization.fp8_kernel import scaled_fp8_quant
  File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/layers/quantization/fp8_kernel.py", line 26, in <module>
    from sglang.srt.layers.quantization import deep_gemm_wrapper
  File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/layers/quantization/deep_gemm_wrapper/__init__.py", line 1, in <module>
    from .entrypoint import *
  File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/layers/quantization/deep_gemm_wrapper/entrypoint.py", line 7, in <module>
    from sglang.srt.layers.quantization.deep_gemm_wrapper import compile_utils
  File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/layers/quantization/deep_gemm_wrapper/compile_utils.py", line 14, in <module>
    from sglang.srt.server_args import ServerArgs
  File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/server_args.py", line 26, in <module>
    from sglang.srt.lora.lora_registry import LoRARef
  File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/lora/lora_registry.py", line 22, in <module>
    @dataclass(frozen=True, slots=True)
TypeError: dataclass() got an unexpected keyword argument 'slots'

The same error occurs with bench_offline_throughput:

(sglang-py39) [nadavt@hgn37 sglang]$ python3 -m sglang.bench_offline_throughput --model-path meta-llama/Meta-Llama-3.1-8B-Instruct --num-prompts 10
Traceback (most recent call last):
  File "/home/projects/dharel/nadavt/.conda/envs/sglang-py39/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/projects/dharel/nadavt/.conda/envs/sglang-py39/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/bench_offline_throughput.py", line 34, in <module>
    from sglang.srt.entrypoints.engine import Engine
  File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/entrypoints/engine.py", line 41, in <module>
    from sglang.srt.managers.data_parallel_controller import (
  File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/managers/data_parallel_controller.py", line 28, in <module>
    from sglang.srt.managers.io_struct import (
  File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/managers/io_struct.py", line 25, in <module>
    from sglang.srt.lora.lora_registry import LoRARef
  File "/home/projects/dharel/nadavt/repos/sglang/python/sglang/srt/lora/lora_registry.py", line 22, in <module>
    @dataclass(frozen=True, slots=True)
TypeError: dataclass() got an unexpected keyword argument 'slots'

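Suggested fix: if slots=True is essential, update the minimum Python requirement to 3.10+ in pyproject.toml. Otherwise, drop slots=True (or apply it only on Python 3.10+) so that Python 3.9 keeps working.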
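
If Python 3.9 support is meant to be kept, one possible workaround is to pass slots only on interpreters that accept it. The snippet below is a minimal sketch, not the project's actual code: the class name LoRARefSketch, its two fields, and the _DATACLASS_KWARGS helper are hypothetical and used only for illustration.

```python
import sys
from dataclasses import dataclass

# dataclass() only accepts `slots` on Python 3.10+, so build the kwarg conditionally.
# On Python 3.9 the dict is empty and the class is created without __slots__.
_DATACLASS_KWARGS = {"slots": True} if sys.version_info >= (3, 10) else {}


@dataclass(frozen=True, **_DATACLASS_KWARGS)
class LoRARefSketch:
    # Hypothetical fields for illustration; not the real LoRARef definition.
    lora_name: str
    lora_path: str


ref = LoRARefSketch(lora_name="demo", lora_path="/tmp/demo")
```

With this approach the class stays frozen on every version and only gains __slots__ where the interpreter supports it, at the cost of slightly different memory behavior between 3.9 and 3.10+.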

Reproduction

On Python 3.9, run either of the following commands (copied from https://docs.sglang.ai/references/benchmark_and_profiling.html#benchmark):

python -m sglang.bench_one_batch --model-path meta-llama/Meta-Llama-3.1-8B-Instruct --batch 32 --input-len 256 --output-len 32

python3 -m sglang.bench_offline_throughput --model-path meta-llama/Meta-Llama-3.1-8B-Instruct --num-prompts 10

Environment

python3 -m sglang.check_env prints:

Python: 3.9.23 | packaged by conda-forge | (main, Jun  4 2025, 17:57:12) [GCC 13.3.0]
CUDA available: True
GPU 0: NVIDIA A40
GPU 0 Compute Capability: 8.6
CUDA_HOME: None
PyTorch: 2.7.1+cu126
sglang: 0.4.9.post4
sgl_kernel: 0.2.7
flashinfer_python: 0.2.9rc1
triton: 3.3.1
transformers: 4.53.2
torchao: 0.9.0
numpy: 2.0.2
aiohttp: 3.12.14
fastapi: 0.116.1
hf_transfer: 0.1.9
huggingface_hub: 0.34.1
interegular: 0.3.3
modelscope: 1.28.1
orjson: 3.11.1
outlines: 0.1.11
packaging: 25.0
psutil: 7.0.0
pydantic: 2.11.7
python-multipart: 0.0.20
pyzmq: 27.0.0
uvicorn: 0.35.0
uvloop: 0.21.0
vllm: Module Not Found
xgrammar: 0.1.21
openai: 1.97.1
tiktoken: 0.9.0
anthropic: 0.59.0
litellm: 1.74.8
decord: 0.6.0
NVIDIA Topology: 
        GPU0    NIC0    CPU Affinity    NUMA Affinity   GPU NUMA ID
GPU0     X      SYS     24-47   1               N/A
NIC0    SYS      X 

Legend:

  X    = Self
  SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
  NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
  PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
  PXB  = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
  PIX  = Connection traversing at most a single PCIe bridge
  NV#  = Connection traversing a bonded set of # NVLinks

NIC Legend:

  NIC0: mlx5_0


ulimit soft: 66000
