[LoRA][Gemma4] Support vision tower LoRA by linitra24 · Pull Request #42662 · vllm-project/vllm

linitra24 · 2026-05-14T16:37:41Z

This PR adds the remaining LoRA plumbing needed for Gemma4 multimodal LoRA support.

After #43798, Gemma4-MM vision linear layers are already converted through the Transformers backend path, so this PR no longer reimplements the Gemma4 vision tower. Instead, it focuses on the runtime LoRA mapping and token-counting pieces needed by Gemma4 image/video/audio inputs.

Main changes:

Add a multimodal LoRA token-count interface so models can report separate tower and connector token counts.
Update Gemma4-MM to report modality-specific LoRA token counts for image, video, and audio inputs.
Size multimodal LoRA wrappers using the largest tower/connector token budget across modalities.
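
The sizing rule in the last item can be sketched as follows. This is only an illustration: `MMTokenCounts` and `max_wrapper_budgets` are hypothetical names, not the interface this PR adds, and the token counts are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MMTokenCounts:
    """Hypothetical per-modality LoRA token counts (not the PR's actual type)."""

    tower: int      # tokens seen by the modality's encoder tower
    connector: int  # tokens seen by the connector into the language model


def max_wrapper_budgets(counts: dict[str, MMTokenCounts]) -> MMTokenCounts:
    """Return the largest tower and connector budgets across modalities.

    The two maxima are taken independently, so they may come from
    different modalities.
    """
    if not counts:
        raise ValueError("at least one modality is required")
    return MMTokenCounts(
        tower=max(c.tower for c in counts.values()),
        connector=max(c.connector for c in counts.values()),
    )


# Illustrative numbers only; real values depend on the model config and inputs.
example = {
    "image": MMTokenCounts(tower=2520, connector=280),
    "video": MMTokenCounts(tower=1890, connector=210),
    "audio": MMTokenCounts(tower=3000, connector=750),
}
budgets = max_wrapper_budgets(example)
```

Sizing each wrapper to the per-component maximum means one allocation can serve any modality, at the cost of some unused capacity for the smaller ones.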

Test Plan

Additional end-to-end tests for real Gemma4 vision LoRA adapters should also be added in a follow-up.

Test Result

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 80413b1c2b


chatgpt-codex-connector · 2026-05-14T16:41:28Z

+        padding_positions: torch.Tensor,
+    ) -> torch.Tensor:
+        pixel_values = 2 * (pixel_values - 0.5)
+        hidden_states = self.input_proj(pixel_values.to(self.input_proj.weight.dtype))


Avoid reading weight on quantized linear layers

When Gemma4 is loaded with a quantization method whose LinearMethod replaces weight (for example GGUF registers qweight/qweight_type instead of weight), image or video requests will fail here before the vision tower runs because self.input_proj.weight does not exist. Since this commit now passes quant_config into the vision tower, the patch embedder should not use the vLLM linear layer's weight attribute to choose the activation dtype.
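
One way to follow this suggestion is to resolve the activation dtype without touching `weight`. The sketch below is a stand-alone illustration, not the fix applied in this PR: `resolve_activation_dtype` is a hypothetical helper, the assumption that the layer records a `params_dtype` attribute should be checked against the actual linear layer class, and plain strings stand in for `torch.dtype` values so the example runs without PyTorch.

```python
from types import SimpleNamespace
from typing import Any


def resolve_activation_dtype(layer: Any, fallback: Any) -> Any:
    """Pick an activation dtype without assuming `layer.weight` exists.

    Order of preference (all assumptions for illustration):
      1. `layer.params_dtype`, if the layer records its compute dtype;
      2. `layer.weight.dtype`, if an unquantized weight is present;
      3. `fallback`, e.g. the model-level dtype.
    """
    params_dtype = getattr(layer, "params_dtype", None)
    if params_dtype is not None:
        return params_dtype
    weight = getattr(layer, "weight", None)
    if weight is not None:
        return weight.dtype
    return fallback


# Stand-in layers: a GGUF-like layer without `weight`, and a dense layer.
quantized = SimpleNamespace(qweight=object(), qweight_type=object())
dense = SimpleNamespace(weight=SimpleNamespace(dtype="bfloat16"))

quantized_dtype = resolve_activation_dtype(quantized, fallback="float16")
dense_dtype = resolve_activation_dtype(dense, fallback="float16")
```

Checking the recorded compute dtype first also avoids casting activations to a packed integer dtype when a quantized layer does expose a `weight` tensor.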


mergify · 2026-05-14T16:43:35Z

Hi @linitra24, the pre-commit checks have failed. Please run:

uv pip install "pre-commit>=4.5.1"
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?

mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:

# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10

gemini-code-assist

Code Review

This pull request replaces the vision tower in the Gemma 4 multimodal model with native vLLM modules, including custom implementations for patch embedding, pooling, and multidimensional rotary embeddings. This change enables better integration with vLLM features like LoRA. A compatibility issue was identified where the use of the "strict=True" argument in zip() would cause failures on Python 3.9, which is currently supported by vLLM.

gemini-code-assist · 2026-05-14T16:43:39Z

+            unsqueeze_dim=unsqueeze_dim,
+        )
+        for hidden_part, cos_part, sin_part in zip(
+            hidden_parts, cos_parts, sin_parts, strict=True


The strict=True argument in zip() was introduced in Python 3.10. Since vLLM supports Python 3.9, this will cause a TypeError on older Python versions. Please remove the strict=True argument.

Suggested change

-            hidden_parts, cos_parts, sin_parts, strict=True
+            hidden_parts, cos_parts, sin_parts
         )
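
Dropping `strict=True` also drops the length check that it performs. If that check matters, a small helper can keep it on interpreters older than 3.10. The sketch below is an illustration only: `zip_equal` is a hypothetical name, not part of this PR, and it assumes the inputs are sized sequences such as lists or tuples.

```python
from typing import Iterator, Sequence, Tuple, TypeVar

T = TypeVar("T")


def zip_equal(*seqs: Sequence[T]) -> Iterator[Tuple[T, ...]]:
    """Zip sized sequences, raising ValueError if their lengths differ.

    Unlike zip(..., strict=True), the check runs up front, before any
    element is yielded, and it requires every input to support len().
    """
    lengths = {len(s) for s in seqs}
    if len(lengths) > 1:
        raise ValueError(
            f"zip_equal() arguments have mismatched lengths: {sorted(lengths)}"
        )
    return zip(*seqs)


# Placeholder values standing in for the hidden/cos/sin chunks.
pairs = list(zip_equal(["h0", "h1"], ["c0", "c1"], ["s0", "s1"]))
```

Calling `zip_equal(hidden_parts, cos_parts, sin_parts)` in place of the `zip(...)` call would keep the mismatch detection without relying on the `strict` keyword.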

mergify · 2026-05-14T17:20:20Z

Documentation preview: https://vllm--42662.org.readthedocs.build/en/42662/

jeejeelee

To help this feature land faster, you could split the vision tower support into a separate PR.


mergify · 2026-06-03T19:13:54Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @linitra24.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork



linitra24 added 3 commits May 14, 2026 15:10

init

ba591f7

add get_num_mm_*_tokens

137600e

Merge remote-tracking branch 'origin' into gemma4-mm-lora

80413b1

claude Bot reviewed May 14, 2026


chatgpt-codex-connector Bot reviewed May 14, 2026


gemini-code-assist Bot reviewed May 14, 2026


fix pre-commit

554629d

linitra24 mentioned this pull request May 14, 2026

[Usage]: ValueError: mismatch of LoRA layer names for Gemma4 E2B trained with unsloth #41702

Open

1 task

mergify Bot added the documentation Improvements or additions to documentation label May 14, 2026

fix

2298a6f

linitra24 changed the title ~~Gemma4 mm lora~~ [LoRA][Gemma4] Support vision tower LoRA May 14, 2026

jeejeelee self-assigned this May 15, 2026

jeejeelee reviewed May 20, 2026


jeejeelee mentioned this pull request May 20, 2026

[Feature]: Port Gemma4 vision encoder to MMEncoderAttention with FlashAttention support #43178

Open

1 task

move

f3d6db9

linitra24 mentioned this pull request May 22, 2026

[Model] Refactor Gemma4 vision tower with vLLM-native modules #43440

Closed

4 tasks

linitra24 and others added 4 commits June 2, 2026 23:47

Merge branch 'main' into gemma4-mm-lora

a0acfb7

tmp

4317ada

Merge branch 'gemma4-mm-lora' of https://github.com/linitra24/vllm into gemma4-mm-lora

df0533f

Merge branch 'gemma4-lora-old' into gemma4-mm-lora

dcd0fc7

mergify Bot added the needs-rebase label Jun 3, 2026

linitra24 added 3 commits June 4, 2026 07:10

Merge branch 'main' into gemma4-mm-lora

fadaf18

Merge branch 'main' into gemma4-mm-lora

ab9c711

support vision & audio lora

98301ba

linitra24 requested a review from njhill as a code owner June 7, 2026 15:48

mergify Bot added v1 and removed needs-rebase labels Jun 7, 2026

linitra24 added 5 commits June 7, 2026 15:52

fix

7ab1a61

Signed-off-by: bk-201 <joy25810@foxmail.com>

fix

117a864

Signed-off-by: bk-201 <joy25810@foxmail.com>

Merge branch 'main' into gemma4-mm-lora

f73c7bd

add test

181de30

doc

25cb073

linitra24 requested a review from jeejeelee June 11, 2026 14:47

linitra24 wants to merge 18 commits into vllm-project:main from linitra24:gemma4-mm-lora
