Skip to content

[Bugfix] Fix Qwen3-VL timestamp mismatch when using num_frames without fps#36136

Merged
vllm-bot merged 1 commit into
vllm-project:mainfrom
weiguangli-io:codex/vllm-35909-fix-num-frames-timestamps
Mar 11, 2026
Merged

[Bugfix] Fix Qwen3-VL timestamp mismatch when using num_frames without fps#36136
vllm-bot merged 1 commit into
vllm-project:mainfrom
weiguangli-io:codex/vllm-35909-fix-num-frames-timestamps

Conversation

@weiguangli-io

Copy link
Copy Markdown
Contributor

Summary

Fixes #35909.

When num_frames is provided via mm_processor_kwargs without fps, _get_video_second_idx was falling back to the default fps (2) to compute the number of frames for timestamp calculation. However, HF's Qwen3VLVideoProcessor.sample_frames treats num_frames and fps as mutually exclusive — when num_frames is provided, it uses it directly. This caused a mismatch between the computed timestamps length and the actual video_grid_thw frame count, triggering:

AssertionError: The timestamps length(10) should be equal video length (16).

Changes:

  • Add sampled_num_frames parameter to _get_video_second_idx in Qwen3VLProcessingInfo
  • When num_frames is explicitly provided, use it directly instead of computing from fps, mirroring HF's behavior
  • Pass num_frames from mm_processor_kwargs through to _get_video_second_idx in _call_hf_processor

This fix also applies to Qwen3.5-VL which inherits Qwen3VLProcessingInfo.

Test plan

  • Verify with the reproduction script from the issue: pass mm_processor_kwargs={"num_frames": 32, "do_sample_frames": True} to llm.chat() with a video input on Qwen3-VL
  • Verify that existing fps-based sampling still works correctly (no num_frames provided)
  • Verify that passing both num_frames and fps raises an error in HF processor (existing behavior, unchanged)

@mergify mergify Bot added qwen Related to Qwen models bug Something isn't working labels Mar 5, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses a bug in Qwen3-VL and Qwen3.5-VL related to a timestamp mismatch when num_frames is specified without fps. The changes ensure that if num_frames is provided, it is used directly for timestamp calculations, which aligns with the behavior of the underlying Hugging Face processor and resolves the AssertionError. The implementation is clean and directly fixes the issue by passing num_frames through to the timestamp generation logic. The changes look correct and I have no further feedback.

@weiguangli-io weiguangli-io force-pushed the codex/vllm-35909-fix-num-frames-timestamps branch from 5add9ee to 896d89b Compare March 5, 2026 13:23
@DarkLight1337 DarkLight1337 requested a review from Isotr0py March 5, 2026 16:40

@Isotr0py Isotr0py left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, can you add a processor test under tests/models/multimodal/processing to avoid regression?

@mergify mergify Bot added the multi-modality Related to multi-modality (#4194) label Mar 6, 2026
@Isotr0py Isotr0py enabled auto-merge (squash) March 6, 2026 04:33
@github-actions github-actions Bot added the ready ONLY add when PR is ready to merge/full CI is needed label Mar 6, 2026
@DarkLight1337

Copy link
Copy Markdown
Member

auto-merge was automatically disabled March 6, 2026 09:59

Head branch was pushed to by a user without write access

@weiguangli-io weiguangli-io force-pushed the codex/vllm-35909-fix-num-frames-timestamps branch 3 times, most recently from 58b6d2f to 14d7a3d Compare March 7, 2026 04:11
@weiguangli-io

Copy link
Copy Markdown
Contributor Author

Hi @DarkLight1337, I checked the failing CI — the multi-modal-processor-test-cpu failure appears to be related to my changes. I'll investigate the Buildkite log and push a fix. The AMD failures look like unrelated flakes. Thanks for the heads up!

@mergify

mergify Bot commented Mar 8, 2026

Copy link
Copy Markdown
Contributor

Hi @OiPunk, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?
mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

@weiguangli-io weiguangli-io force-pushed the codex/vllm-35909-fix-num-frames-timestamps branch from 00b007f to b7ba987 Compare March 8, 2026 11:43
@weiguangli-io

Copy link
Copy Markdown
Contributor Author

CI Failure Analysis

Failing check: language-models-tests-extra-standard-1 (Buildkite build #55133)

Verdict: Unrelated flaky test — no code changes needed.

Evidence:

  1. Not related to PR changes — This test runs pytest -v -s models/language -m 'core_model and slow_test', testing language models. This PR only modifies vllm/model_executor/models/qwen3_vl.py (a multimodal model) and adds a test in tests/models/multimodal/processing/test_qwen3_vl.py. The test is triggered because source_file_dependencies includes vllm/model_executor/models/, but the actual test content has no relation to Qwen3-VL.
  2. Known flaky test — Issue [CI Failure]: mi325_8: Language Models Tests (Extra Standard) %N #29458 documents previous CI failures for the same Language Models Tests (Extra Standard) %N suite.
  3. Main branch also failing — The latest nightly full CI build on main (build #55120) also has 2 failures, confirming pre-existing instability.
  4. Relevant tests pass — The multi-modal-processor (GPU) test, which includes processor tests for Qwen3-VL, has already passed. The multi-modal-processor-test-cpu is still running.

@weiguangli-io

Copy link
Copy Markdown
Contributor Author

@DarkLight1337 I've investigated the failing CI. The language-models-tests-extra-standard-1 failure is unrelated to my changes (detailed analysis in my earlier comment). The relevant multi-modal-processor tests have all passed. Could you please merge this or trigger a re-run? Thank you!

…t fps

When `num_frames` is provided via `mm_processor_kwargs` without `fps`,
`_get_video_second_idx` was falling back to the default fps (2) to
compute the number of frames for timestamp calculation. However, the HF
processor uses `num_frames` directly (since `num_frames` and `fps` are
mutually exclusive in `Qwen3VLVideoProcessor.sample_frames`). This
caused a mismatch between the computed timestamps length and the actual
`video_grid_thw` frame count, triggering an assertion error.

Pass `num_frames` through to `_get_video_second_idx` and use it directly
when provided, mirroring the HF processor's behavior. Also explicitly
set `fps=None` when `num_frames` is specified to prevent HF's
`BaseVideoProcessor.preprocess()` from filling in the class default
(`fps=2`) via `setdefault()`.

Also adds processor regression tests for Qwen3-VL num_frames timestamp
handling, with video metadata aligned to ndarray bounds.

Fixes vllm-project#35909

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: OiPunk <codingpunk@gmail.com>
@weiguangli-io weiguangli-io force-pushed the codex/vllm-35909-fix-num-frames-timestamps branch from b7ba987 to df3f36d Compare March 10, 2026 12:49
@vllm-bot vllm-bot merged commit 7247596 into vllm-project:main Mar 11, 2026
59 of 63 checks passed
wendyliu235 pushed a commit to wendyliu235/vllm-public that referenced this pull request Mar 18, 2026
…t fps (vllm-project#36136)

Signed-off-by: OiPunk <codingpunk@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
khairulkabir1661 pushed a commit to khairulkabir1661/vllm that referenced this pull request Mar 27, 2026
…t fps (vllm-project#36136)

Signed-off-by: OiPunk <codingpunk@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
mtparet pushed a commit to blackfuel-ai/vllm that referenced this pull request Apr 9, 2026
…t fps (vllm-project#36136)

Signed-off-by: OiPunk <codingpunk@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
mystous pushed a commit to mystous/vllm_hybrid that referenced this pull request May 10, 2026
…t fps (vllm-project#36136)

Signed-off-by: OiPunk <codingpunk@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
my-other-github-account pushed a commit to my-other-github-account/vllm that referenced this pull request May 15, 2026
…t fps (vllm-project#36136)

Signed-off-by: OiPunk <codingpunk@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
my-other-github-account pushed a commit to my-other-github-account/vllm that referenced this pull request May 15, 2026
…t fps (vllm-project#36136)

Signed-off-by: OiPunk <codingpunk@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
0826joyce pushed a commit to 0826joyce/vllm-serving-optimization that referenced this pull request May 19, 2026
…t fps (vllm-project#36136)

Signed-off-by: OiPunk <codingpunk@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
mvanhorn pushed a commit to mvanhorn/vllm that referenced this pull request Jun 4, 2026
…t fps (vllm-project#36136)

Signed-off-by: OiPunk <codingpunk@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working multi-modality Related to multi-modality (#4194) qwen Related to Qwen models ready ONLY add when PR is ready to merge/full CI is needed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Error when using Qwen3-VL/Qwen3.5 with video input and num_frames

4 participants