[Feature] Support minicpmv v2.6 by mickqian · Pull Request #2785 · sgl-project/sglang

mickqian · 2025-01-08T06:59:43Z

Motivation

Addressing #2461

Modifications

Add new model MiniCPMV and corresponding processor MiniCPMVImageProcessor
Register new chat and conversation templates

Checklist

Format your code according to the Contributor Guide.
Add unit tests as outlined in the Contributor Guide.
Update documentation as needed, including docstrings or example tutorials.

mickqian · 2025-01-09T04:41:21Z

Questions

From my understanding of the current sglang implementation, it does not support two kinds of attention at the same time, so the attention module in Resampler can't be replaced with RadixAttention. Is that correct?

zhaochenyang20 · 2025-01-09T04:54:51Z

Questions

From my understanding of the current sglang implementation, it does not support two kinds of attention at the same time, so the attention modules in Resampler and Idefics2VisionAttention can't be replaced with RadixAttention. Is that correct?

The current implementation of Idefics2VisionAttention in this PR relies on vllm.attention.MultiheadAttention, which was introduced in vllm==0.6.5. If that is not acceptable, I will rewrite the attention module.

@Ying1123 @merrymercy

zhaochenyang20 · 2025-01-10T06:20:51Z

@mickqian Looks nice! Is there anything left to do?

mickqian · 2025-01-10T06:24:30Z

@mickqian Looks nice! Is there anything left to do?

The number of frames sampled from a video is hard-coded for now; it should be calculated instead. I added a TODO for it.
A CI test failed with this error:

[2025-01-10 03:24:51 TP0] Multimodal prompt is too long after expanding multimodal tokens. After expanding len(req.origin_input_ids_unpadded)=29 => 308 >= 294.

This comes from a check on the length of input_ids. The token limit, as far as I understand, is calculated from the GPU:

if len(req.origin_input_ids) >= self.max_req_input_len:
    logger.error(
        "Multimodal prompt is too long after expanding multimodal tokens. "
        f"After expanding {len(req.origin_input_ids_unpadded)=} => {len(req.origin_input_ids)} >= {self.max_req_input_len}. "
    )

Some mp4 files fail to decode in decode_video_base64.
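
The length check quoted above can be reproduced in isolation. The following is a minimal sketch, not sglang's implementation: `expand_placeholders`, `IMAGE_PLACEHOLDER_ID`, and the figure of 280 tokens per placeholder are hypothetical, chosen only so that the totals match the log line (29 ids before expansion, 308 after, limit 294).

```python
IMAGE_PLACEHOLDER_ID = -1  # hypothetical id marking one multimodal placeholder


def expand_placeholders(input_ids, tokens_per_placeholder):
    """Replace each placeholder id with a run of pad ids of the given length."""
    expanded = []
    for token_id in input_ids:
        if token_id == IMAGE_PLACEHOLDER_ID:
            expanded.extend([IMAGE_PLACEHOLDER_ID] * tokens_per_placeholder)
        else:
            expanded.append(token_id)
    return expanded


def is_too_long(input_ids, max_req_input_len):
    """Mirror the quoted check: reject when the length reaches the limit."""
    return len(input_ids) >= max_req_input_len


# 28 text tokens plus one placeholder: 29 ids before expansion.
unpadded = list(range(28)) + [IMAGE_PLACEHOLDER_ID]
# Assumed expansion factor; the real value depends on the image processor.
expanded = expand_placeholders(unpadded, tokens_per_placeholder=280)

print(len(unpadded), "=>", len(expanded))
print(is_too_long(expanded, max_req_input_len=294))
```

With these illustrative numbers, a prompt that is short before expansion still exceeds the limit afterwards, which is the situation the CI log describes.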

zhaochenyang20 · 2025-01-11T23:34:37Z

@hnyls2002 Could you help review this? Thanks!

zhaochenyang20 · 2025-01-17T15:04:12Z

@mickqian Thanks. We will update the API key as soon as possible. Before that, could you fix the conflicts in qwen2_vl.py?

yizhang2077 · 2025-01-18T05:50:26Z

LGTM cc @merrymercy

merrymercy

Can you add some unit tests like this to compare the logits against the HF implementation? https://github.com/sgl-project/sglang/pull/2365/files#diff-6e52783df34170e0b3d9aadbd7c338d9c16f0303c18846afc1d298c32d4a4eb2R1
Can you help update the docs here for VLM?

sglang/docs/references/supported_models.md

Line 50 in 83452db

## How to Support a New Model
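
The logits comparison requested in the review above could take roughly the following shape. This is a hedged sketch in plain Python, not the test from the linked PR: `max_abs_diff`, `logits_match`, the sample values, and the `atol=1e-2` tolerance are all assumptions for illustration. A real test would obtain the two logit vectors from the HF model and from sglang for the same prompt.

```python
def max_abs_diff(reference, candidate):
    """Largest element-wise absolute difference between two logit vectors."""
    if len(reference) != len(candidate):
        raise ValueError("logit vectors must have the same length")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def logits_match(reference, candidate, atol=1e-2):
    """True when every logit agrees within the (assumed) absolute tolerance."""
    return max_abs_diff(reference, candidate) <= atol


# Made-up values standing in for the two implementations' outputs.
hf_logits = [2.0, -1.0, 0.5]
srt_logits = [2.004, -0.998, 0.497]

print(logits_match(hf_logits, srt_logits))
```

Comparing with a tolerance rather than exact equality allows for small numerical differences between kernels and dtypes; the appropriate threshold depends on the model and precision used.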

merrymercy · 2025-01-19T03:01:12Z

+from vllm.distributed import divide, get_tensor_model_parallel_world_size
+from vllm.model_executor.layers.resampler import get_2d_sincos_pos_embed
+from vllm.model_executor.layers.sampler import SamplerOutput, get_sampler
+from vllm.model_executor.models.module_mapping import MultiModelKeys
+from vllm.model_executor.sampling_metadata import SamplingMetadata


Do not import anything from vllm.

SamplingMetadata, SamplerOutput are not used.

Use sglang.srt.distributed

Copy over small utility functions.
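
As an illustration of copying over a small utility, a local replacement for `divide` might look like the sketch below. It assumes the helper performs exact integer division after checking divisibility; the use of `ValueError` instead of an assertion and the example head counts are choices made here, not taken from the PR.

```python
def ensure_divisibility(numerator, denominator):
    """Raise if numerator is not an exact multiple of denominator."""
    if numerator % denominator != 0:
        raise ValueError(f"{numerator} is not divisible by {denominator}")


def divide(numerator, denominator):
    """Return numerator // denominator, requiring an exact division."""
    ensure_divisibility(numerator, denominator)
    return numerator // denominator


# Example: split 32 attention heads across a tensor-parallel world size of 4.
num_heads_per_rank = divide(32, 4)
print(num_heads_per_rank)
```

A helper this small carries no dependency of its own, so keeping a local copy removes the import without changing behavior.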

zhaochenyang20 · 2025-01-19T04:29:50Z

@mickqian Also, please remove the Chinese comments in the PR. Thanks so much. Could we avoid importing any vllm dependency and rewrite those parts ourselves instead?

zhaochenyang20 · 2025-01-19T04:30:34Z

Under the models files, we prefer not to import anything from vllm. We will remove them all later.

mickqian · 2025-01-19T07:52:36Z

Trying to address these in #2977

Co-authored-by: Chayenne <zhaochen20@outlook.com> Co-authored-by: yizhang2077 <1109276519@qq.com>

mickqian changed the title ~~[WIP] [Feature] Support minicpmv v2.6~~ [Feature] Support minicpmv v2.6 Jan 9, 2025

mickqian marked this pull request as ready for review January 9, 2025 04:42

mickqian requested review from ByronHsu, Ying1123, hnyls2002, ispobock, merrymercy and zhyncs as code owners January 9, 2025 04:42

mickqian force-pushed the minicpmv branch 14 times, most recently from 42f09c0 to c2abb42 Compare January 10, 2025 03:01

mickqian force-pushed the minicpmv branch 3 times, most recently from aef5497 to fcf2ddd Compare January 11, 2025 13:50

mickqian force-pushed the minicpmv branch 9 times, most recently from 424c100 to 001d017 Compare January 17, 2025 10:08

mickqian force-pushed the minicpmv branch 3 times, most recently from 36d213e to a382656 Compare January 17, 2025 16:39

[Feature] Support minicpmv v2.6

d8501d4

mickqian force-pushed the minicpmv branch from a382656 to d8501d4 Compare January 17, 2025 16:58

zhaochenyang20 and others added 2 commits January 17, 2025 16:21

Merge branch 'main' into minicpmv

2d491ae

Merge branch 'main' into minicpmv

9b09ac5

yizhang2077 self-requested a review January 18, 2025 06:04

yizhang2077 approved these changes Jan 18, 2025

Merge branch 'main' into minicpmv

a662445

zhaochenyang20 merged commit 3d93f84 into sgl-project:main Jan 18, 2025

merrymercy reviewed Jan 19, 2025

mickqian mentioned this pull request Jan 19, 2025

[Fix] Address remaining issues of supporting MiniCPMV #2977

Merged

3 tasks

zhaochenyang20 mentioned this pull request Jan 24, 2025

[Feature] Support InterVL #3092

Closed

2 tasks

timethink pushed a commit to timethink/sglang that referenced this pull request Mar 9, 2025

[Feature] Support minicpmv v2.6 (sgl-project#2785)

3dc1e01

Co-authored-by: Chayenne <zhaochen20@outlook.com> Co-authored-by: yizhang2077 <1109276519@qq.com>

vhain mentioned this pull request Mar 27, 2025

deps: lazy import optional dependencies gguf and torchvision #4826

Merged

6 tasks

4 participants
