[model] support MiniCPM-V 4.0#8747

Merged
JustinTong0323 merged 19 commits into sgl-project:main from tc-mb:support-MiniCPM-V4.0 on Sep 2, 2025

@tc-mb
Contributor

@tc-mb tc-mb commented Aug 4, 2025

Motivation

Modifications

Accuracy Test

Benchmark & Profiling

Checklist

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Summary of Changes

Hello @tc-mb, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request introduces support for the MiniCPM-V 4.0 model, expanding the framework's capabilities to include this new multimodal architecture. It integrates language and vision components and defines how modules are handled for optimization techniques like LoRA and quantization.

Highlights

  • New Model Support: Added support for the MiniCPM-V 4.0 model by introducing a new MiniCPMV4_0 class, encapsulating the architecture and processing logic for this version.
  • LLM and Vision Module Integration: The MiniCPMV4_0 implementation integrates LlamaForCausalLM as its language model and utilizes Idefics2VisionTransformer with Resampler2_5 for vision processing.
  • Multimodal Input Handling: Implemented methods for handling multimodal inputs, such as get_vision_embedding, get_image_feature, and pad_input_ids within the MiniCPMV4_0 class.
  • Quantization and LoRA Mappings: Defined packed_modules_mapping, supported_lora_modules, and bitsandbytes_stacked_params_mapping within MiniCPMV4_0 to support LoRA fine-tuning and bitsandbytes quantization.
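
As a rough illustration of what a mapping such as packed_modules_mapping is typically used for, the sketch below resolves a per-projection checkpoint weight name to the fused module that holds it. The mapping values and the helper to_packed_name are assumptions made for this example and are not taken from the PR; the real values live in python/sglang/srt/models/minicpmv.py.

```python
# Illustrative values only: a fused ("packed") module name mapped to the
# separate projections that a checkpoint stores individually.
packed_modules_mapping = {
    "qkv_proj": ["q_proj", "k_proj", "v_proj"],
    "gate_up_proj": ["gate_proj", "up_proj"],
}


def to_packed_name(weight_name, mapping=packed_modules_mapping):
    """Hypothetical helper: return (packed_name, shard_index).

    If no component of the dotted name is a packed shard, the name is
    returned unchanged with a shard index of None.
    """
    parts = weight_name.split(".")
    for packed, shards in mapping.items():
        for idx, shard in enumerate(shards):
            if shard in parts:
                # Swap the shard component for the fused module name.
                parts[parts.index(shard)] = packed
                return ".".join(parts), idx
    return weight_name, None


print(to_packed_name("llm.model.layers.0.self_attn.k_proj.weight"))
print(to_packed_name("llm.model.layers.0.self_attn.o_proj.weight"))
```

The shard index tells a loader which slice of the fused weight a checkpoint tensor belongs to; names that are not packed pass through untouched.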

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request adds support for the MiniCPM-V 4.0 model. The implementation introduces a new class MiniCPMV4_0 which is similar to the existing MiniCPMV2_6. My review focuses on the maintainability and design of this new class. I've identified significant code duplication between the two version-specific classes and a violation of the Liskov Substitution Principle in the class hierarchy. I've provided detailed comments on these issues and suggested refactoring to improve the code quality.


return pattern.pad_input_tokens(input_ids, image_inputs)

class MiniCPMV4_0(MiniCPMBaseModel):

high

The new class MiniCPMV4_0 duplicates code from MiniCPMV2_6. The following are identical:

  • Class attributes: packed_modules_mapping, supported_lora_modules, bitsandbytes_stacked_params_mapping, embedding_modules, embedding_padding_modules.
  • Methods: init_vision_module, init_resampler, get_vision_embedding, get_image_feature, pad_input_ids.

This duplication reduces maintainability. Refactor common logic into a shared base class (e.g., MiniCPMBase) from which MiniCPMV2_6 and MiniCPMV4_0 inherit. Child classes would then implement version-specific logic, such as __init__ and init_llm.
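
A minimal sketch of the suggested refactor might look like the following. It uses the MiniCPMBase name proposed above; every method body, attribute value, and the config key image_size are placeholders invented for illustration, not code from minicpmv.py. The only detail taken from this PR is that MiniCPMV4_0 uses LlamaForCausalLM.

```python
class MiniCPMBase:
    """Hypothetical shared parent; bodies are placeholders, not sglang code."""

    # Shared class attributes are defined once here (values are illustrative).
    supported_lora_modules = ["qkv_proj", "o_proj", "gate_up_proj", "down_proj"]
    embedding_modules = {}

    def __init__(self, config):
        self.config = config
        self.llm = self.init_llm(config)            # version-specific hook
        self.vpm = self.init_vision_module(config)  # shared implementation

    def init_llm(self, config):
        raise NotImplementedError("each version supplies its own LLM")

    def init_vision_module(self, config):
        # Placeholder for the vision setup both versions would share.
        return f"vision[{config['image_size']}]"


class MiniCPMV2_6(MiniCPMBase):
    def init_llm(self, config):
        return "llm-for-v2.6"  # placeholder string, not a real class name


class MiniCPMV4_0(MiniCPMBase):
    def init_llm(self, config):
        return "LlamaForCausalLM"  # LLM named in the PR summary


model = MiniCPMV4_0({"image_size": 448})
print(model.llm, model.vpm)
```

With this shape, a fix to a shared method lands in one place, and each version class shrinks to the parts that actually differ.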

Comment thread python/sglang/srt/models/minicpmv.py
@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@JustinTong0323
Collaborator

Could you use this model to replace the old minicpmv test? It should be under test/srt/test_vision_xxxxx.

Collaborator

@JustinTong0323 JustinTong0323 left a comment

Thanks for your contribution! Could you resolve the comments first?

Comment thread python/sglang/srt/models/minicpmv.py Outdated
@JustinTong0323
Collaborator

JustinTong0323 commented Aug 6, 2025

Your version of sglang appears to be outdated. It is recommended that you attempt to merge the main branch first.

@tc-mb
Contributor Author

tc-mb commented Aug 6, 2025

> Your version of sglang appears to be outdated. It is recommended that you attempt to merge the main branch first.

I have addressed your three revision suggestions. ^_^

@JustinTong0323 JustinTong0323 merged commit 03dbf1a into sgl-project:main Sep 2, 2025
128 of 135 checks passed
@tc-mb tc-mb deleted the support-MiniCPM-V4.0 branch September 3, 2025 03:31
MahmoudAshraf97 pushed a commit to MahmoudAshraf97/sglang that referenced this pull request Sep 8, 2025
Signed-off-by: tc-mb <caitianchi@modelbest.cn>
Co-authored-by: Xinyuan Tong <115166877+JustinTong0323@users.noreply.github.com>
@ShangmingCai ShangmingCai mentioned this pull request Sep 23, 2025
4 tasks

2 participants