[Model Runner v2] Oracle for model runner v2 - qwen3 dense model by default [1/N] by yewentao256 · Pull Request #39337 · vllm-project/vllm

yewentao256 · 2026-04-08T19:35:06Z

Purpose

Oracle for model runner v2 - dense model by default

The environment variable (VLLM_USE_V2_MODEL_RUNNER) now behaves as follows:

Not set: use the oracle
Set to 1: force v2
Set to 0: force v1
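
The three cases above can be sketched in a few lines of Python. This is only an illustration, not the code from the PR: `read_tristate` and `pick_runner` are hypothetical helper names, and treating any value other than "1" or "0" as unset is an assumption made for this sketch.

```python
import os

ENV_NAME = "VLLM_USE_V2_MODEL_RUNNER"


def read_tristate(environ=None):
    """Return True for "1", False for "0", and None otherwise (assumed)."""
    environ = os.environ if environ is None else environ
    raw = environ.get(ENV_NAME)
    if raw == "1":
        return True
    if raw == "0":
        return False
    return None  # unset (or unrecognized): defer to the oracle


def pick_runner(environ, oracle_says_v2):
    """Hypothetical helper: an explicit setting wins, else the oracle decides."""
    forced = read_tristate(environ)
    use_v2 = oracle_says_v2 if forced is None else forced
    return "v2" if use_v2 else "v1"
```

For example, `pick_runner({"VLLM_USE_V2_MODEL_RUNNER": "0"}, oracle_says_v2=True)` selects v1 even though the oracle would have chosen v2.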

We are testing "Qwen/Qwen3-0.6B" and "facebook/opt-125m" because they cover most of the current v1 unit tests.

Should land after #39353

Test

Covered by unit tests

Signed-off-by: yewentao256 <zhyanwentao@126.com>

gemini-code-assist

Code Review

This pull request refactors the V2 model runner configuration by centralizing its logic within the VllmConfig class. It introduces a whitelist for models that should default to the V2 runner and updates the environment variable handling to support a tri-state (None, True, False) configuration. Consequently, direct references to the environment variable across the codebase have been replaced with the new vllm_config.use_v2_model_runner property. I have no further feedback to provide.
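
A minimal sketch of the pattern the review describes: a config object exposes a `use_v2_model_runner` property that honors an explicit override and otherwise falls back to a whitelist. Everything except the property name is hypothetical here. `SketchConfig`, `env_override`, and the whitelist contents are illustrative stand-ins, and the real VllmConfig logic may check more than the architecture name.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical whitelist for illustration; the actual list in vLLM may differ.
V2_DEFAULT_ARCHITECTURES = frozenset({"Qwen3ForCausalLM"})


@dataclass(frozen=True)
class SketchConfig:
    architecture: str
    # None means the environment variable was not set.
    env_override: Optional[bool] = None

    @property
    def use_v2_model_runner(self) -> bool:
        # An explicit True/False always wins over the whitelist.
        if self.env_override is not None:
            return self.env_override
        return self.architecture in V2_DEFAULT_ARCHITECTURES
```

Centralizing the decision in one property means callers never read the environment variable directly, which is the refactor the review summarizes.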

mergify · 2026-04-08T20:42:49Z

Hi @yewentao256, the pre-commit checks have failed. Please run:

uv pip install "pre-commit>=4.5.1"
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?

mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:

# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10

njhill

Thanks @yewentao256!

Resolve conflicts in vllm/envs.py by taking the pydantic-refactor side wholesale: the two conflict regions are the legacy TYPE_CHECKING annotations block and the legacy environment_variables dict, both already superseded by the BaseSettings models on this branch.

Port the only semantic main-side envs.py change since the previous in-branch merge of main (PR vllm-project#39337, ae4f59f): VLLM_USE_V2_MODEL_RUNNER becomes tri-state. CompilationSettings.use_v2_model_runner is now bool | None defaulting to None, with a before-validator that maps "1" -> True, "0" -> False, anything else -> None, mirroring maybe_convert_bool from main. Consumers in vllm/config/vllm.py already handle the None case.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

NickLucche · 2026-05-15T15:49:23Z

In hindsight we might have wanted a #ready-run-all-tests label here

njhill · 2026-05-15T16:31:35Z

> In hindsight we might have wanted a #ready-run-all-tests label here

I did set it and the tests had passed, but we should have added it again after the final changes before merging. I'm not sure why it didn't catch the P/D issue though.

…efault [1/N] (vllm-project#39337) Signed-off-by: yewentao256 <zhyanwentao@126.com> Signed-off-by: Nick Hill <nickhill123@gmail.com> Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com> Co-authored-by: Nick Hill <nickhill123@gmail.com>

Resolves conflicts in vllm/envs.py introduced by the pydantic-settings refactor. The legacy `if TYPE_CHECKING:` declaration block and `environment_variables: dict[str, Callable]` runtime dict were dropped wholesale; both are already superseded by the pydantic BaseSettings model tree on this branch.

Main-side commits touching vllm/envs.py since the merge base (256dbca..origin/main) and how each was ported:

- ae4f59f (vllm-project#39337): VLLM_USE_V2_MODEL_RUNNER widened from `bool` (default False) to `bool | None` (default None). Already present on the branch as `use_v2_model_runner` on CompilationSettings with a `_parse_use_v2_model_runner` field_validator. Tri-state: unset means "use config default".
- 8a56da3 (vllm-project#42304): adds VLLM_USE_BREAKABLE_CUDAGRAPH. Ported as `use_breakable_cudagraph: bool = False` on CompilationSettings.
- 36e74c9 (vllm-project#42689): adds four KV-connector env vars. Ported on ConnectorSettings as:
  - mooncake_store_tier_log: bool = False
  - mooncake_disk_staging_usable_ratio: float = 0.9
  - preferred_segment: str | None (alias=MOONCAKE_PREFERRED_SEGMENT)
  - requester_local_hostname: str | None (alias=MOONCAKE_REQUESTER_LOCAL_HOSTNAME)

  The last two use `alias=` because they lack the VLLM_ prefix.

Verification:

- grep -n "<<<<<<< |>>>>>>> |=======" vllm/envs.py returns zero hits.
- pre-commit run --files vllm/envs.py passes (ruff, mypy, SPDX, the schema validator that enforces every field has a default and a docstring, etc.).
- Manual override test confirmed pydantic parses both VLLM_-prefixed and unprefixed env vars correctly via the registry.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
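
The naming rule in this commit message (a VLLM_ prefix by default, an explicit alias for variables that lack it) can be illustrated with a small standard-library sketch. It does not use pydantic and is not the branch's implementation: `env_var_name` and `lookup` are hypothetical helpers, and deriving the default name by upper-casing the field name is an assumption consistent with the `use_breakable_cudagraph` / VLLM_USE_BREAKABLE_CUDAGRAPH pair mentioned above.

```python
from typing import Mapping, Optional

PREFIX = "VLLM_"


def env_var_name(field_name: str, alias: Optional[str] = None) -> str:
    """Return the alias if given, else the prefixed upper-case field name."""
    return alias if alias is not None else PREFIX + field_name.upper()


def lookup(environ: Mapping[str, str], field_name: str,
           alias: Optional[str] = None, default: Optional[str] = None):
    """Read a raw string setting from a mapping such as os.environ."""
    return environ.get(env_var_name(field_name, alias), default)
```

With this rule, `env_var_name("preferred_segment", alias="MOONCAKE_PREFERRED_SEGMENT")` resolves to the unprefixed name, while fields without an alias pick up the VLLM_ prefix.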

…efault [1/N] (vllm-project#39337) Signed-off-by: yewentao256 <zhyanwentao@126.com> Signed-off-by: Nick Hill <nickhill123@gmail.com> Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com> Co-authored-by: Nick Hill <nickhill123@gmail.com> Signed-off-by: Liuweixiong0118 <lwx34158427@gmail.com>

…efault [1/N] (vllm-project#39337) Signed-off-by: yewentao256 <zhyanwentao@126.com> Signed-off-by: Nick Hill <nickhill123@gmail.com> Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com> Co-authored-by: Nick Hill <nickhill123@gmail.com> Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>

oracle for model runner v2

38b4eee

Signed-off-by: yewentao256 <zhyanwentao@126.com>

yewentao256 requested review from ApostaC, ProExpertProg, WoosukKwon, alexm-redhat, heheda12345, hmellor, houseroad, mgoin, njhill, orozery, pavanimajety, robertgshaw2-redhat, tlrmchlsmth, vadiklyutiy, youkaichao and ywang96 as code owners April 8, 2026 19:35

yewentao256 added the ready and qwen labels Apr 8, 2026

mergify Bot added the nvidia and v1 labels Apr 8, 2026

github-project-automation Bot added this to NVIDIA Apr 8, 2026

gemini-code-assist Bot reviewed Apr 8, 2026

discard opt 125m

28c06c9

Signed-off-by: yewentao256 <zhyanwentao@126.com>

njhill reviewed Apr 9, 2026

Comment thread vllm/config/vllm.py

yewentao256 added 2 commits April 9, 2026 14:52

patch ngram

cd4f824

Signed-off-by: yewentao256 <zhyanwentao@126.com>

Merge branch 'main' into wentao-oracle-model-runner-v2

1b0202a

yewentao256 requested review from aarnphm and russellb as code owners April 9, 2026 14:52

yewentao256 added 2 commits May 14, 2026 12:52

update

83c6e90

Signed-off-by: yewentao256 <zhyanwentao@126.com>

Merge branch 'main' into wentao-oracle-model-runner-v2

0579be8

Signed-off-by: yewentao256 <zhyanwentao@126.com>

njhill approved these changes May 14, 2026

github-project-automation Bot moved this to Ready in NVIDIA May 14, 2026

njhill enabled auto-merge (squash) May 14, 2026 16:37

njhill merged commit ae4f59f into main May 14, 2026
94 checks passed

njhill deleted the wentao-oracle-model-runner-v2 branch May 14, 2026 17:02

github-project-automation Bot moved this from Ready to Done in NVIDIA May 14, 2026

github-project-automation Bot moved this to Done in Structured Output May 14, 2026

yewentao256 mentioned this pull request May 14, 2026

[Model Runner V2] Migration from v1 to v2, with more Llama and Mistral dense models [2/N] #42665

Closed

vllm-agent mentioned this pull request May 15, 2026

Revert "[Model Runner v2] Oracle for model runner v2 - qwen3 dense model by default [1/N]" (#39337) #42698

Closed

mgoin mentioned this pull request May 15, 2026

[Model Runner v2] Support update_config #42783

Merged

chfeng-cs mentioned this pull request May 16, 2026

[Bug][CI] NIXL + FlashInfer fails with Qwen3 MRV2 and --block-size 128 #42846

Closed

1 task

NickLucche mentioned this pull request May 18, 2026

[MRv2] Default to MRv1 when a connector is present #42955

Merged

shanjiaz mentioned this pull request May 18, 2026

Add parallel drafting to v2 model runner unsupported features #43010

Merged

4 tasks

haosdent mentioned this pull request May 19, 2026

[CI] Disable V2 model runner for LoRA configs #43084

Closed

njhill added the v2 label May 20, 2026

njhill merged 55 commits into main from wentao-oracle-model-runner-v2

4 participants
