[Bugfix] [Model] Missing MRoPE function definition from `KeyeForConditionalGeneration` by tjtanaa · Pull Request #27895 · vllm-project/vllm

tjtanaa · 2025-10-31T19:22:03Z

Purpose

This is a bugfix for Kwai-Keye/Keye-VL-8B-Preview, made as part of completing the unit tests for the upcoming RFC on ViT Attention Reorganization. The fix will be used to validate the correctness of the RFC.

Ever since the responsibility for get_mrope_input_positions moved from the GPU runner into the model definition file, this model has been broken, because it is missing both get_mrope_input_positions and SupportsMRoPE.

I am not exactly sure what the correct get_mrope_input_positions should look like, so I am following KeyeVL1_5ForConditionalGeneration's get_mrope_input_positions implementation.
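For readers unfamiliar with MRoPE, here is a minimal, simplified sketch of how multimodal rotary position ids are commonly laid out in Qwen2-VL-style models: text tokens share one index across the temporal, height, and width axes, while vision tokens take their grid coordinates offset by the current start position. This is not the Keye implementation. The function name `mrope_positions` and the `segments` input format are hypothetical, and details such as spatial merging and video timing are deliberately left out.

```python
def mrope_positions(segments):
    """Build simplified 3-axis (t, h, w) position ids.

    `segments` is a hypothetical input format: a list of
    ("text", n_tokens) or ("image", (t, h, w)) entries.
    """
    t_pos, h_pos, w_pos = [], [], []
    start = 0  # next free position index
    for kind, spec in segments:
        if kind == "text":
            # Text tokens use the same index on all three axes.
            ids = list(range(start, start + spec))
            t_pos.extend(ids)
            h_pos.extend(ids)
            w_pos.extend(ids)
            start += spec
        else:
            t, h, w = spec
            # Vision tokens use their grid coordinates, offset by `start`.
            for ti in range(t):
                for hi in range(h):
                    for wi in range(w):
                        t_pos.append(start + ti)
                        h_pos.append(start + hi)
                        w_pos.append(start + wi)
            # The next segment starts after the largest index used.
            start += max(t, h, w)
    return t_pos, h_pos, w_pos


positions = mrope_positions([("text", 2), ("image", (1, 2, 2)), ("text", 1)])
```

In this toy layout the three axes agree on text tokens and diverge only inside the image grid, which is the property a model-side get_mrope_input_positions has to reproduce for its own token layout.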

I have also made a ROCm-specific bugfix, but the major issue is that the model is broken.

CC the model author of PR #20126, @Kwai-Keye.

Test Plan

Add a new unit test, tests/models/multimodal/generation/test_keye.py, and ensure it passes with sensible model output.

Run the ChartQA lm eval.

Test Result

--------------------------------------------------
<analysis>This question asks for the content of each image, which is straightforward and asks for a direct observation. Therefore, /no_think mode is more appropriate.</analysis>The first image depicts a street scene in what appears to be a Chinatown area. There is a prominent red stop sign in the foreground, and behind it, a traditional Chinese archway with red pillars and decorative elements. The archway has Chinese characters on it, and there are stone lion statues flanking the entrance. The area seems to be a commercial district with various shops and signs visible in the background. A black car is driving on the street, and there are some pedestrians and trees in the distance.

The second image shows a view of a tall tower, likely a landmark, partially obscured by branches of cherry blossom trees in full bloom. The cherry blossoms are pink and create a beautiful contrast against the clear blue sky. The tower is modern in design, with a circular observation deck near the top. The scene suggests a springtime setting, with the cherry blossoms indicating the blooming season.
--------------------------------------------------
PASSED

=================== 1 passed, 2 warnings in 76.27s (0:01:16) ===================

ChartQA lm eval score

For detailed information on this command, run:
  run.py eval_vllm --model_name Kwai-Keye/Keye-VL-8B-Preview --url http://0.0.0.0:7899 --output_dir ./chartqa --eval_name chartqa - --help
================================================================================
Metrics:
{
    "explicit_prompt_relaxed_correctness": 0.8672,
    "anywhere_in_answer_relaxed_correctness": 0.868
}
================================================================================

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
(Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>

gemini-code-assist

Code Review

This pull request correctly identifies and fixes a missing get_mrope_input_positions function for KeyeForConditionalGeneration by adding the function and the SupportsMRoPE interface. The changes also include a necessary ROCm-specific bugfix and refactoring of the attention backend selection, which improves code clarity. A new unit test is added to validate the fix. However, I've found a critical issue in the implementation of get_mrope_input_positions that could lead to a crash under certain conditions.

gemini-code-assist · 2025-10-31T19:23:44Z

+        if isinstance(video_grid_thw, list) and len(video_grid_thw) > 0:
+            video_grid_thw = video_grid_thw[0]


This conditional statement appears to incorrectly handle the case where video_grid_thw is a list[list[int]]. If video_grid_thw is a list of multiple video grids (e.g., [[t1, h1, w1], [t2, h2, w2]]), this line will slice it to just the first grid ([t1, h1, w1]). When this 1D list is passed to split_thw, it will be converted to a 1D tensor, causing an indexing error at grid_thw[:, 0] and crashing the execution. Since split_thw is already capable of handling a list[list[int]] by converting it to a 2D tensor, this slicing logic is both incorrect and unnecessary. Removing it will ensure correct behavior for all valid input types.
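The failure mode described above can be illustrated without any tensor library. The sketch below uses plain Python lists, and `temporal_sizes` is a hypothetical stand-in for a consumer that, like the `grid_thw[:, 0]` step, expects a 2D `[[t, h, w], ...]` grid. It is not the actual split_thw code.

```python
def temporal_sizes(grid_thw):
    """Hypothetical stand-in for code that reads column 0 of a 2D grid."""
    return [row[0] for row in grid_thw]


# Two video grids, each given as [t, h, w].
video_grid_thw = [[2, 4, 6], [3, 8, 8]]

# Passing the full 2D list works as expected.
full_result = temporal_sizes(video_grid_thw)

# Taking only the first element leaves a 1D [t, h, w] list,
# so indexing each "row" now hits a bare int and fails.
sliced = video_grid_thw[0]
try:
    temporal_sizes(sliced)
    sliced_failed = False
except TypeError:
    sliced_failed = True
```

With real tensors the error type and message would differ, but the shape mismatch is the same: the consumer expects rows and receives scalars.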

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

chatgpt-codex-connector · 2025-10-31T19:26:15Z

+    if current_platform.is_cuda():
+        from vllm.vllm_flash_attn.layers.rotary import apply_rotary_emb
+    elif current_platform.is_rocm():
+        from flash_attn.ops.triton.rotary import apply_rotary as apply_rotary_emb

    q_embed = apply_rotary_emb(q.float(), cos.float(), sin.float()).type_as(q)
    k_embed = apply_rotary_emb(k.float(), cos.float(), sin.float()).type_as(k)


Call ROCm rotary kernel with wrong signature

In apply_rotary_pos_emb_flashatt, the ROCm branch imports flash_attn.ops.triton.rotary.apply_rotary, which expects both q and k tensors and returns the rotated pair. The current implementation calls this kernel twice with only (tensor, cos, sin), just like the CUDA wrapper. On ROCm this will raise a TypeError for the missing argument and prevent rotary embeddings from being applied. The ROCm path should invoke apply_rotary(q, k, cos, sin) once and unpack the returned tensors, mirroring the existing usage in layers/rotary_embedding/common.py.

bugfix keye

b57961d

Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>

tjtanaa requested review from DarkLight1337 and ywang96 as code owners October 31, 2025 19:22

mergify bot added the multi-modality label ("Related to multi-modality (#4194)") Oct 31, 2025

gemini-code-assist bot reviewed Oct 31, 2025

chatgpt-codex-connector bot reviewed Oct 31, 2025

update assertion

9dc4e23

Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>

DarkLight1337 approved these changes Nov 1, 2025

DarkLight1337 added the ready label ("ONLY add when PR is ready to merge/full CI is needed") Nov 1, 2025

DarkLight1337 merged commit e2347db into vllm-project:main Nov 1, 2025
55 checks passed

tjtanaa deleted the bugfix-keye branch November 1, 2025 05:45

Kay-Tian mentioned this pull request Nov 1, 2025

vLLM PR #27895: core file change alert (Kay-Tian/vllm#73)

Closed

ZhengHongming888 pushed a commit to ZhengHongming888/vllm that referenced this pull request Nov 8, 2025

[Bugfix] [Model] Missing MRoPE function definition from `KeyeForConditionalGeneration` (vllm-project#27895)

07b0172

Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>

rtourgeman pushed a commit to rtourgeman/vllm that referenced this pull request Nov 10, 2025

[Bugfix] [Model] Missing MRoPE function definition from `KeyeForConditionalGeneration` (vllm-project#27895)

1119304

Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>

devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025

[Bugfix] [Model] Missing MRoPE function definition from `KeyeForConditionalGeneration` (vllm-project#27895)

d064f30

Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>

DarkLight1337 merged 2 commits into vllm-project:main from EmbeddedLLM:bugfix-keye
