[Bugfix] Fix Qwen3.5 Marlin TP failure for GDN in_proj_ba by AjAnubolu · Pull Request #36199 · vllm-project/vllm

AjAnubolu · 2026-03-06T02:39:12Z

Summary

Split the GDN in_proj_ba linear into separate in_proj_b and in_proj_a
so each column dimension meets Marlin's MIN_THREAD_N=64 constraint at TP>=4.
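
The arithmetic behind this constraint can be sketched as follows. This is a simplified model, not vLLM's actual shape check: it assumes the fused layer emits two values per head, that a column-parallel layer is sharded evenly across ranks, and that a per-rank width below 64 is rejected. The helper names and the head count of 64 are hypothetical.

```python
# Simplified model of the constraint described in the summary; not vLLM's real check.
MIN_THREAD_N = 64  # value quoted in the PR summary


def per_rank_out_dim(total_out_dim: int, tp_size: int) -> int:
    """Output columns each rank holds when the layer is sharded evenly."""
    if total_out_dim % tp_size != 0:
        raise ValueError("output dim must divide evenly across TP ranks")
    return total_out_dim // tp_size


def meets_min_thread_n(out_dim: int, min_thread_n: int = MIN_THREAD_N) -> bool:
    """True when the per-rank column count reaches the assumed minimum width."""
    return out_dim >= min_thread_n


num_heads = 64  # hypothetical head count, chosen only for illustration
total = 2 * num_heads  # assumed fused width: one b and one a value per head

for tp in (1, 2, 4, 8):
    local = per_rank_out_dim(total, tp)
    print(f"TP={tp}: per-rank out dim = {local}, ok = {meets_min_thread_n(local)}")
```

With these illustrative numbers the per-rank width drops below 64 once TP reaches 4, which matches the failure mode the summary describes.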

The in_proj_ba linear layer has output dim = 2 * num_kv_heads, which can be < GPTQ_MARLIN_MIN_THREAD_N (64) when sharded. Use disable_tp and quant_config=None for this layer, then manually slice b/a for the local TP rank.

Fixes vllm-project#35924

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: AjAnubolu <anuboluajay@gmail.com>

gemini-code-assist

Code Review

This pull request addresses a Tensor Parallelism failure for Qwen3.5 with Marlin quantization. The fix involves disabling tensor parallelism for the in_proj_ba layer in the Gated Delta Network, which is too small to be sharded correctly with Marlin's constraints. Instead, the layer is replicated, and its output is manually sliced per TP rank. The changes in qwen3_5.py and qwen3_next.py are consistent with this approach. However, I've found a critical issue in qwen3_next.py where a reshape operation is incorrect for sequence lengths greater than 1, which will likely cause a runtime error during prefill.
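
The replicate-then-slice idea described in this review can be sketched in plain Python. This is only an illustration under stated assumptions: it treats one token's replicated output as a flat row laid out as all b values followed by all a values, which may not match the column order the real layer uses, and `slice_for_rank` is a hypothetical helper, not a vLLM function.

```python
def slice_for_rank(full_row, num_heads, tp_rank, tp_size):
    """Return this rank's (b, a) slices from one token's replicated output row.

    Assumes the row is laid out as [b_0..b_{H-1}, a_0..a_{H-1}]; the real
    layer may order its columns differently.
    """
    if num_heads % tp_size != 0:
        raise ValueError("num_heads must divide evenly across TP ranks")
    heads_per_rank = num_heads // tp_size
    start = tp_rank * heads_per_rank
    end = start + heads_per_rank
    b_full, a_full = full_row[:num_heads], full_row[num_heads:]
    return b_full[start:end], a_full[start:end]


num_heads, tp_size = 8, 4  # hypothetical sizes
row = list(range(2 * num_heads))  # stand-in for one token's projection output

for rank in range(tp_size):
    b_local, a_local = slice_for_rank(row, num_heads, rank, tp_size)
    print(f"rank {rank}: b = {b_local}, a = {a_local}")
```

Every rank computes the same full row and keeps only its own heads, so the small layer never has to be sharded by the quantized kernel.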

gemini-code-assist · 2026-03-06T02:44:07Z

+        b = b.reshape(b.size(0), self.num_v_heads)
+        a = a.reshape(a.size(0), self.num_v_heads)


The reshape operation for b and a appears to be incorrect for sequence lengths (sq) greater than 1. The tensor b has a shape of (bs, sq, num_k_heads, num_v_heads // num_k_heads), but it's being reshaped to (bs, self.num_v_heads). This will raise a runtime error during prefill when sq > 1 because the number of elements won't match.

The reshape should probably flatten the batch and sequence dimensions (bs and sq) to get a tensor with shape (num_tokens, num_v_heads).

Suggested change

-        b = b.reshape(b.size(0), self.num_v_heads)
-        a = a.reshape(a.size(0), self.num_v_heads)
+        b = b.reshape(-1, self.num_v_heads)
+        a = a.reshape(-1, self.num_v_heads)
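
The element-count argument in this comment can be checked with shape arithmetic alone. The sketch below is a stand-alone model of how a reshape target is validated, including a single -1 placeholder. `resolve_reshape` is a hypothetical helper and the sizes are made up for illustration; it does not call PyTorch.

```python
from math import prod


def resolve_reshape(shape, target):
    """Return the concrete target shape, or raise ValueError on a size mismatch.

    Supports at most one -1 placeholder, which is inferred from the
    remaining dimensions.
    """
    total = prod(shape)
    if list(target).count(-1) > 1:
        raise ValueError("only one dimension can be inferred")
    if -1 in target:
        known = prod(d for d in target if d != -1)
        if known == 0 or total % known != 0:
            raise ValueError(f"cannot reshape {shape} into {tuple(target)}")
        return tuple(total // known if d == -1 else d for d in target)
    if prod(target) != total:
        raise ValueError(f"cannot reshape {shape} into {tuple(target)}")
    return tuple(target)


bs, sq, num_k_heads, num_v_heads = 2, 5, 4, 16  # hypothetical sizes
b_shape = (bs, sq, num_k_heads, num_v_heads // num_k_heads)

# Flattening bs and sq into one token axis always matches the element count.
print(resolve_reshape(b_shape, (-1, num_v_heads)))

# Keeping only bs as the leading dimension fails whenever sq > 1.
try:
    resolve_reshape(b_shape, (bs, num_v_heads))
except ValueError as exc:
    print("reshape rejected:", exc)
```

With sq equal to 1 both targets resolve to the same shape, which is why the original code would only fail during prefill.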


mergify · 2026-03-10T02:25:29Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @AjAnubolu.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify · 2026-04-23T06:09:13Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @AjAnubolu.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify · 2026-05-23T06:52:40Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @AjAnubolu.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

AjAnubolu requested a review from sighingnow as a code owner March 6, 2026 02:39

mergify Bot added qwen Related to Qwen models bug Something isn't working labels Mar 6, 2026

gemini-code-assist Bot reviewed Mar 6, 2026


Fix reshape for sq > 1 during prefill

203929a

Signed-off-by: AjAnubolu <anuboluajay@gmail.com>

sonusflow mentioned this pull request Mar 7, 2026

[Bugfix] Fix Qwen3.5 GatedDeltaNet in_proj_ba Marlin failure at TP>=2 #36329

Merged

8 tasks

mergify Bot added the needs-rebase label Mar 10, 2026

mergify Bot removed the needs-rebase label Apr 23, 2026

mergify Bot added the needs-rebase label Apr 23, 2026

mergify Bot removed the needs-rebase label May 23, 2026

mergify Bot added the needs-rebase label May 23, 2026

AjAnubolu wants to merge 2 commits into vllm-project:main from AjAnubolu:fix/qwen35-marlin-tp-35924