Checklist
Describe the bug
I'm trying to run the LoRA benchmark benchmark/lora/lora_bench.py, but it fails with ImportError: cannot import name 'AIOHTTP_TIMEOUT' from 'sglang.bench_serving'. I checked sglang.bench_serving and the variable AIOHTTP_TIMEOUT does not exist there.
Has this variable been deprecated or removed in the latest version?
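To make the check above reproducible, here is a minimal sketch of how one might test whether a module still exposes a given name. The helpers `resolve_name` and `has_name` are hypothetical names written for this report, not part of sglang; they rely only on the standard-library `importlib`.

```python
import importlib


def resolve_name(module_name, attr, default=None):
    """Return getattr(module, attr) if the module imports and has attr, else default."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is missing or fails to import.
        return default
    # The module imported; fall back to default if the attribute is gone.
    return getattr(module, attr, default)


def has_name(module_name, attr):
    """True only if module_name can be imported and defines attr."""
    sentinel = object()
    return resolve_name(module_name, attr, sentinel) is not sentinel
```

On an install that shows this error, `has_name("sglang.bench_serving", "AIOHTTP_TIMEOUT")` would be expected to return False. As a local stopgap one could pass a fallback such as an `aiohttp.ClientTimeout` object as `default`, but the appropriate timeout value is an assumption that a maintainer would need to confirm.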
Reproduction
Run the benchmark script benchmark/lora/lora_bench.py.
Environment
Python: 3.10.19 (main, Oct 21 2025, 16:43:05) [GCC 11.2.0]
CUDA available: True
GPU 0,1,2,3: NVIDIA GeForce RTX 4090
GPU 0,1,2,3 Compute Capability: 8.9
CUDA_HOME: /usr/local/cuda-12.8
NVCC: Cuda compilation tools, release 12.8, V12.8.61
CUDA Driver Version: 580.95.05
PyTorch: 2.8.0+cu128
sglang: 0.5.5
sgl_kernel: 0.3.16.post5
flashinfer_python: 0.5.0
flashinfer_cubin: 0.5.0
flashinfer_jit_cache: Module Not Found
triton: 3.4.0
transformers: 4.57.1
torchao: 0.9.0
numpy: 2.2.6
aiohttp: 3.13.2
fastapi: 0.120.3
hf_transfer: 0.1.9
huggingface_hub: 0.36.0
interegular: 0.3.3
modelscope: 1.31.0
orjson: 3.11.4
outlines: 0.1.11
packaging: 25.0
psutil: 7.1.2
pydantic: 2.12.3
python-multipart: 0.0.20
pyzmq: 27.1.0
uvicorn: 0.38.0
uvloop: 0.21.0
vllm: Module Not Found
xgrammar: 0.1.25
openai: 2.6.1
tiktoken: 0.12.0
anthropic: 0.72.0
litellm: Module Not Found
decord2: 2.0.0
NVIDIA Topology:
        GPU0    GPU1    GPU2    GPU3    CPU Affinity    NUMA Affinity   GPU NUMA ID
GPU0    X       NODE    NODE    NODE    0-71            0               N/A
GPU1    NODE    X       NODE    NODE    0-71            0               N/A
GPU2    NODE    NODE    X       NODE    0-71            0               N/A
GPU3    NODE    NODE    NODE    X       0-71            0               N/A
Legend:
X = Self
SYS = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
PHB = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
PXB = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
PIX = Connection traversing at most a single PCIe bridge
NV# = Connection traversing a bonded set of # NVLinks
ulimit soft: 1048576