[Bug] Qwen3-VL-235B-A22B-Thinking did not parse the image data #10906

### Checklist

- [x] 1. I have searched related issues but cannot get the expected help.
- [x] 2. The bug has not been fixed in the latest version.
- [x] 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
- [x] 4. If the issue you raised is not a bug but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed.
- [x] 5. Please use English, otherwise it will be closed.

### Describe the bug

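Qwen3-VL-235B-A22B-Thinking does not parse the image data: when an image is sent as a base64 data URL through the OpenAI-compatible chat completions API, the model responds as if no image had been provided.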

### Reproduction

```Python
import base64

import requests
from openai import OpenAI

# Set the API key and base URL of the OpenAI-compatible API server.
openai_api_key = ""
openai_api_base = ""

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base + "/v1",
)

models = client.models.list()
model = models.data[0].id
print(model)


def encode_base64_content_from_url(content_url: str) -> str:
    """Encode a content retrieved from a remote url to base64 format."""

    with requests.get(content_url) as response:
        response.raise_for_status()
        result = base64.b64encode(response.content).decode("utf-8")

    return result


def main():

    image_url = "https://qianwen-res.oss-cn-beijing.aliyuncs.com/QVQ/demo.png"

    stream = True

    image_base64 = encode_base64_content_from_url(image_url)

    chat_completion_from_base64 = client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "使用中文回答，图中方框处应该是数字多少?",
                    },
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/jpeg;base64,{image_base64}"},
                    }
                ],
            }
        ],
        model=model,
        max_completion_tokens=2048,
        stream=stream,
    )

    if stream:
        for chunk in chat_completion_from_base64:
            if hasattr(chunk.choices[0].delta, "reasoning_content"):
                reasoning_content = chunk.choices[0].delta.reasoning_content
                if reasoning_content:
                    print(reasoning_content, end="", flush=True)
            if hasattr(chunk.choices[0].delta, "content"):
                content = chunk.choices[0].delta.content
                if content:
                    print(content, end="", flush=True)
    else:
        result = chat_completion_from_base64.choices[0].message.content
        print(result)


if __name__ == "__main__":
    main()

```
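One detail in the script above may be worth ruling out: the data URL is labeled `image/jpeg`, while the source URL points to a `.png` file. Whether this mismatch has any effect on the server is not known from this report. The sketch below, with a hypothetical helper name `build_image_data_url` that is not part of the original script, derives the MIME type from the file extension so the label matches the payload.

```python
import mimetypes


def build_image_data_url(image_base64: str, source_url: str) -> str:
    """Build a data URL whose MIME type is guessed from the source URL.

    Hypothetical helper for this reproduction; not part of any library.
    """
    # Guess the MIME type from the file extension in the URL path.
    mime_type, _ = mimetypes.guess_type(source_url)
    # Fall back to a generic type when the extension is missing or not an image.
    if mime_type is None or not mime_type.startswith("image/"):
        mime_type = "application/octet-stream"
    return f"data:{mime_type};base64,{image_base64}"
```

In the reproduction, the `"url"` value could then be built with `build_image_data_url(image_base64, image_url)` instead of the hard-coded `data:image/jpeg;base64,` prefix. If the output is unchanged, the MIME label can be excluded as a cause.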

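Output (translated from Chinese):

```
Qwen3-VL-235B-A22B-Thinking
Hmm, the user is asking which number should go in the box in the image, but the problem is that no image was provided. That is a bit tricky, because without seeing the image I cannot tell what the actual question is.

The user may be stuck on a math or logic problem and eager to find the answer. They may not have realized that an image needs to be provided, or they may think one was included earlier in the conversation. In that case they may be a little anxious, since they are stuck on the problem and cannot move forward.

Since the user specifically asked for an answer in Chinese, they are probably a Chinese speaker, possibly a student or someone preparing for an exam. The problem could be a number sequence, geometry, or logical reasoning question, and these often require looking at a figure.

I need to tell them clearly that the image, the key piece of information, is missing, but in a gentle tone so they do not get more anxious. I can give a few examples of common question types as hints, so that if they can describe the type of problem I may still be able to help analyze it.

Also, I should remind them to pay attention to details when describing it: how the numbers are arranged, where the box is, the pattern of the surrounding numbers, and so on. But the first priority is to get them to provide the image or a detailed description; otherwise there is really nothing to go on.

Finally, add an emoji to soften the tone; the user is probably frustrated right now.
</think>

Since you did not provide a specific image, I cannot directly determine the number in the box. **Please provide the image or describe the problem in detail** (for example: how the numbers are arranged, the position of the box, the known surrounding numbers), and I will analyze and solve it for you.

---

### Common question types for reference (to be judged against the actual problem):
1. **Number sequence pattern**
   - Example: `2, 4, 8, □, 32` → the box should be **16** (geometric sequence, common ratio ×2).
2. **Figure logic**
   - Example: sum of the numbers at the triangle's vertices = the number at the bottom → the relationship needs to be computed.
3. **3×3 grid / matrix**
   - Example: the sums/products of each row/column are equal → the operation rule needs to be verified.
4. **Base conversion**
   - Example: binary to decimal (such as `101` = 5).

---

### Please provide the following information:
1. **Question type** (sequence, figure, 3×3 grid, etc.);
2. **Known numbers and their positions** (for example: first row: 3, □, 7);
3. **Whether there is a hint about the operation rule** (such as "each row sums to 15").

Once you add this, I will **give the answer and the reasoning right away**! 🔍
(For example: if described as "3×3 grid, first row: 2, □, 4; second row: 6, 8, 10; each row has an equal sum", then the box = 12)
```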

### Environment

Latest version
