[Bug] Error when using qwen2.5-vl to run token_in_token_out_vlm_engine.py #5164

@xuyifan-0731

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. The bug has not been fixed in the latest version.
  • 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
  • 4. If the issue you raised is not a bug but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed.
  • 5. Please use English, otherwise it will be closed.

Describe the bug

I'm trying to call the Qwen2.5-VL engine using the token_in method. I directly ran the script examples/runtime/token_in_token_out/token_in_token_out_vlm_engine.py (with only the checkpoint path modified), but encountered the following error:

Traceback (most recent call last):
  File "/workspace/xuyifan/verl/sglang_test2.py", line 78, in <module>
    token_in_out_example(server_args)
  File "/workspace/xuyifan/verl/sglang_test2.py", line 54, in token_in_out_example
    output = backend.generate(
  File "/workspace/xuyifan/sglang/python/sglang/srt/entrypoints/engine.py", line 183, in generate
    ret = loop.run_until_complete(generator.__anext__())
  File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
  File "/workspace/xuyifan/sglang/python/sglang/srt/managers/tokenizer_manager.py", line 363, in generate_request
    tokenized_obj = await self._tokenize_one_request(obj)
  File "/workspace/xuyifan/sglang/python/sglang/srt/managers/tokenizer_manager.py", line 401, in _tokenize_one_request
    image_inputs: Dict = await self.mm_processor.process_mm_data_async(
  File "/workspace/xuyifan/sglang/python/sglang/srt/managers/multimodal_processors/qwen_vl.py", line 122, in process_mm_data_async
    resize_tasks = [resize_image_async(image) for image in base_output.images]
TypeError: 'NoneType' object is not iterable

After some debugging, I found that the issue likely stems from the following code in sglang/srt/managers/multimodal_processors/base_processor.py at line 171:

for index, text_part in enumerate(text_parts):
    print(f"================index: {index}, text_part: {text_part}")
    try:
        if text_part == multimodal_tokens.image_token:
            # load as image

In this case, text_parts is a list of length 1, and its content is:

<|vision_start|><|image_pad|><|image_pad|>...<|image_pad|><|vision_end|>What is in this picture?

As a result, the function fails to extract any images, and base_output.images ends up being None, which causes the TypeError.
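The length-1 `text_parts` can be reproduced outside sglang with a small sketch. It assumes the processor splits the prompt on a single compact placeholder string; both `IMAGE_TOKEN` and `split_on_image_token` are hypothetical stand-ins, not the actual sglang code. Once the placeholder has already been expanded into repeated `<|image_pad|>` tokens, the compact pattern no longer matches, so nothing is split off and no image is loaded.

```python
import re

# Assumed compact placeholder the processor looks for (not taken from the sglang source).
IMAGE_TOKEN = "<|vision_start|><|image_pad|><|vision_end|>"


def split_on_image_token(text: str, image_token: str = IMAGE_TOKEN) -> list[str]:
    """Hypothetical stand-in for the splitting step: cut the prompt at each placeholder."""
    # The capturing group makes re.split keep the placeholder as its own part.
    pattern = "(" + re.escape(image_token) + ")"
    return [part for part in re.split(pattern, text) if part]


question = "What is in this picture?"

# Prompt with the compact placeholder: the placeholder becomes a separate part.
compact_prompt = IMAGE_TOKEN + question
compact_parts = split_on_image_token(compact_prompt)

# Prompt whose placeholder was already expanded (4 pads here, purely illustrative):
# the compact pattern does not occur, so the whole prompt stays in one part.
expanded_prompt = "<|vision_start|>" + "<|image_pad|>" * 4 + "<|vision_end|>" + question
expanded_parts = split_on_image_token(expanded_prompt)

print(len(compact_parts), len(expanded_parts))
```

If this assumption holds, no element of `expanded_parts` can ever equal the image token, which would match the behavior observed in the loop above.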

Is there a recommended solution or workaround for this issue?
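As a stopgap while debugging, a guard like the following could replace the opaque `TypeError` with a clearer message. This is only a sketch: `require_images` is a hypothetical helper, not part of sglang, and it does not make the image reach the model. It only reports the failed extraction earlier and more readably.

```python
from typing import List, Optional, Sequence


def require_images(images: Optional[Sequence[object]]) -> List[object]:
    """Hypothetical guard: fail with a descriptive error when no images were extracted."""
    if images is None:
        raise ValueError(
            "No images were extracted from the prompt; check whether the image "
            "placeholder in the input matches what the processor expects."
        )
    return list(images)


# Usage: iterate over the checked list instead of the raw, possibly-None attribute.
for image in require_images(["img-0", "img-1"]):
    print(image)
```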

Reproduction

Run examples/runtime/token_in_token_out/token_in_token_out_vlm_engine.py with qwen2.5-vl-3b-instruct (only the checkpoint path modified).

Environment

Python: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
CUDA available: True
GPU 0,1,2,3,4,5,6,7: NVIDIA H800
GPU 0,1,2,3,4,5,6,7 Compute Capability: 9.0
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 12.3, V12.3.107
CUDA Driver Version: 535.54.03
PyTorch: 2.6.0+cu124
sglang: 0.4.4.post3
sgl_kernel: 0.0.5.post3
flashinfer: Module Not Found
triton: 3.2.0
transformers: 4.49.0.dev0
torchao: 0.9.0
numpy: 1.26.4
aiohttp: 3.9.1
fastapi: 0.115.5
hf_transfer: 0.1.9
huggingface_hub: 0.26.2
interegular: 0.3.3
modelscope: 1.23.1
orjson: 3.10.11
outlines: 0.1.11
packaging: 23.2
psutil: 5.9.4
pydantic: 2.10.5
multipart: Module Not Found
zmq: Module Not Found
uvicorn: 0.22.0
uvloop: 0.21.0
vllm: 0.7.2
xgrammar: 0.1.14
openai: 1.59.6
tiktoken: 0.7.0
anthropic: 0.49.0
litellm: 1.57.4
decord: 0.6.0
NVIDIA Topology:
GPU0 GPU1 GPU2 GPU3 GPU4 GPU5 GPU6 GPU7 NIC0 NIC1 NIC2 NIC3 NIC4 NIC5 NIC6 NIC7 CPU Affinity NUMA Affinity GPU NUMA ID
GPU0 X NV8 NV8 NV8 NV8 NV8 NV8 NV8 PIX NODE NODE NODE SYS SYS SYS SYS 0-85 0 N/A
GPU1 NV8 X NV8 NV8 NV8 NV8 NV8 NV8 NODE PIX NODE NODE SYS SYS SYS SYS 0-85 0 N/A
GPU2 NV8 NV8 X NV8 NV8 NV8 NV8 NV8 NODE NODE PIX NODE SYS SYS SYS SYS 0-85 0 N/A
GPU3 NV8 NV8 NV8 X NV8 NV8 NV8 NV8 NODE NODE NODE PIX SYS SYS SYS SYS 0-85 0 N/A
GPU4 NV8 NV8 NV8 NV8 X NV8 NV8 NV8 SYS SYS SYS SYS PIX NODE NODE NODE 86-171 1 N/A
GPU5 NV8 NV8 NV8 NV8 NV8 X NV8 NV8 SYS SYS SYS SYS NODE PIX NODE NODE 86-171 1 N/A
GPU6 NV8 NV8 NV8 NV8 NV8 NV8 X NV8 SYS SYS SYS SYS NODE NODE PIX NODE 86-171 1 N/A
GPU7 NV8 NV8 NV8 NV8 NV8 NV8 NV8 X SYS SYS SYS SYS NODE NODE NODE PIX 86-171 1 N/A
NIC0 PIX NODE NODE NODE SYS SYS SYS SYS X NODE NODE NODE SYS SYS SYS SYS
NIC1 NODE PIX NODE NODE SYS SYS SYS SYS NODE X NODE NODE SYS SYS SYS SYS
NIC2 NODE NODE PIX NODE SYS SYS SYS SYS NODE NODE X NODE SYS SYS SYS SYS
NIC3 NODE NODE NODE PIX SYS SYS SYS SYS NODE NODE NODE X SYS SYS SYS SYS
NIC4 SYS SYS SYS SYS PIX NODE NODE NODE SYS SYS SYS SYS X NODE NODE NODE
NIC5 SYS SYS SYS SYS NODE PIX NODE NODE SYS SYS SYS SYS NODE X NODE NODE
NIC6 SYS SYS SYS SYS NODE NODE PIX NODE SYS SYS SYS SYS NODE NODE X NODE
NIC7 SYS SYS SYS SYS NODE NODE NODE PIX SYS SYS SYS SYS NODE NODE NODE X

Legend:

X = Self
SYS = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
PHB = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
PXB = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
PIX = Connection traversing at most a single PCIe bridge
NV# = Connection traversing a bonded set of # NVLinks

NIC Legend:

NIC0: mlx5_bond_0
NIC1: mlx5_bond_1
NIC2: mlx5_bond_2
NIC3: mlx5_bond_3
NIC4: mlx5_bond_4
NIC5: mlx5_bond_5
NIC6: mlx5_bond_6
NIC7: mlx5_bond_7

Hypervisor vendor: KVM
ulimit soft: 1048576
