Checklist
Describe the bug
🧾 Description:
When deploying `qwen2.5-vl-72b-awq` using SGLang, image inputs (via `image_url`) are not correctly handled. The same prompt works as expected in vLLM, where the model successfully describes the image.
Reproduction
✅ Reproduction Steps:
✅ SGLang Launch Command:
```shell
python -m sglang.launch_server \
  --model-path qwen-vl-72b \
  --port 30000 \
  --trust-remote-code \
  --host 0.0.0.0 \
  --mem-fraction-static 0.8 \
  --tp 4 \
  --tool-call-parser qwen25
```
✅ OpenAI-Compatible API Call (cURL):
```shell
curl -X POST "http://0.0.0.0:30000/v1/chat/completions" \
  -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-vl",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "describe this picture"
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"
            }
          }
        ]
      }
    ],
    "top_p": 0.8
  }'
```
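For reference, the same request body can be built and sent from Python. This is a sketch mirroring the cURL call above; the endpoint assumes the local SGLang server started with the launch command shown earlier, and the `requests` call is commented out since it needs a running server:

```python
import json

# Same request body as the cURL call above.
payload = {
    "model": "qwen2.5-vl",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "describe this picture"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"
                    },
                },
            ],
        }
    ],
    "top_p": 0.8,
}

body = json.dumps(payload)

# To actually issue the request (requires the server to be up):
# import requests
# resp = requests.post(
#     "http://0.0.0.0:30000/v1/chat/completions",
#     headers={"Content-Type": "application/json"},
#     data=body,
# )
# print(resp.json()["choices"][0]["message"]["content"])
```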
🧾 SGLang Response:
```json
{
  "id": "803d3c01743b4429b61c0a83d60eda5b",
  "object": "chat.completion",
  "created": 1742528000,
  "model": "qwen2.5-vl",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "I'm sorry, but I cannot see any picture attached to your message. Could you please provide more information or upload the picture again? I'll do my best to describe it for you."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 22,
    "completion_tokens": 41,
    "total_tokens": 63
  }
}
```
✅ Comparison with vLLM:
With the exact same model and cURL request, a vLLM deployment successfully describes the image. This confirms the issue lies not with the prompt or the model, but with how SGLang handles `image_url` content parts in the message payload.
📌 Expected Behavior:
SGLang should support OpenAI-compatible image inputs by correctly parsing `messages[].content[].image_url.url` and feeding the image into the model's visual encoder.
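For illustration, the expected parsing can be sketched as below. `extract_image_urls` is a hypothetical helper, not SGLang's actual code path; it only shows what "correctly parsing" the OpenAI-style message payload means:

```python
def extract_image_urls(messages):
    """Collect every image_url from OpenAI-style chat messages.

    `content` may be a plain string (text-only messages) or a list of
    typed parts; only the list form can carry image_url entries.
    """
    urls = []
    for msg in messages:
        content = msg.get("content")
        if not isinstance(content, list):
            continue  # plain-string content has no image parts
        for part in content:
            if part.get("type") == "image_url":
                urls.append(part["image_url"]["url"])
    return urls


messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "describe this picture"},
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"
                },
            },
        ],
    }
]

print(extract_image_urls(messages))
# → ['https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg']
```

A server that parses the payload this way should find one image in the request above; the response in this report suggests the image part is dropped instead.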
Environment
🧪 Environment:
- Model: `qwen2.5-vl-72b-awq`
- Deployment: SGLang 0.4.4.post1
- API Protocol: OpenAI-compatible Chat Completions API
- vLLM Behavior: ✅ Working as expected