
[Model Runner v2] Oracle for model runner v2 - qwen3 dense model by default [1/N] #39337

Merged: njhill merged 55 commits into main from wentao-oracle-model-runner-v2 on May 14, 2026

@yewentao256 yewentao256 commented Apr 8, 2026

Purpose

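Add an oracle for model runner v2 that enables it by default for dense models, starting with Qwen3.

The VLLM_USE_V2_MODEL_RUNNER environment variable now behaves as follows:

  • Not set: the oracle decides
  • Set to 1: force v2
  • Set to 0: force v1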
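
A minimal sketch of the tri-state behaviour listed above. The helper names `parse_tri_state` and `describe_choice` are hypothetical and are not taken from the vLLM code base; treating unrecognized values as "not set" follows the mapping described in a later commit message in this thread and is otherwise an assumption.

```python
from __future__ import annotations


def parse_tri_state(raw: str | None) -> bool | None:
    """Map the raw env-var string to True, False, or None (not set)."""
    if raw == "1":
        return True
    if raw == "0":
        return False
    # Unset (or, by assumption, any unrecognized value): defer to the oracle.
    return None


def describe_choice(env: dict[str, str]) -> str:
    """Return which of the three documented behaviours applies."""
    flag = parse_tri_state(env.get("VLLM_USE_V2_MODEL_RUNNER"))
    if flag is None:
        return "oracle decides"
    return "force v2" if flag else "force v1"
```

Passing a plain dict (rather than reading `os.environ` directly) keeps the sketch easy to exercise in a unit test.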

We test with "Qwen/Qwen3-0.6B" and "facebook/opt-125m" because together they cover most of the current v1 unit tests.

Should land after #39353

Test

Covered by unit tests.

Signed-off-by: yewentao256 <zhyanwentao@126.com>

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request refactors the V2 model runner configuration by centralizing its logic within the VllmConfig class. It introduces a whitelist for models that should default to the V2 runner and updates the environment variable handling to support a tri-state (None, True, False) configuration. Consequently, direct references to the environment variable across the codebase have been replaced with the new vllm_config.use_v2_model_runner property. I have no further feedback to provide.
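
The resolution order described in this review could look roughly like the sketch below. Only the property name `use_v2_model_runner` comes from the review text; the class `ConfigSketch`, the field names, and the contents of the allowlist are hypothetical stand-ins, and the real logic in `VllmConfig` may check more than the architecture.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical allowlist of architectures that default to the v2 runner.
V2_DEFAULT_ARCHITECTURES = frozenset({"Qwen3ForCausalLM"})


@dataclass
class ConfigSketch:
    architecture: str
    # Tri-state override: None means the environment variable was not set.
    env_override: bool | None = None

    @property
    def use_v2_model_runner(self) -> bool:
        # An explicit override always wins over the oracle.
        if self.env_override is not None:
            return self.env_override
        # Otherwise the oracle decides from the allowlist.
        return self.architecture in V2_DEFAULT_ARCHITECTURES
```

Exposing the decision as a single property gives callers one place to ask, instead of each reading the environment variable themselves.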


mergify Bot commented Apr 8, 2026

Hi @yewentao256, the pre-commit checks have failed. Please run:

uv pip install "pre-commit>=4.5.1"
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?
mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10

Comment thread vllm/config/vllm.py

@njhill njhill left a comment

Thanks @yewentao256!

@github-project-automation github-project-automation Bot moved this to Ready in NVIDIA May 14, 2026
@njhill njhill enabled auto-merge (squash) May 14, 2026 16:37
@njhill njhill merged commit ae4f59f into main May 14, 2026
94 checks passed
@njhill njhill deleted the wentao-oracle-model-runner-v2 branch May 14, 2026 17:02
@github-project-automation github-project-automation Bot moved this from Ready to Done in NVIDIA May 14, 2026
vrdn-23 added a commit to vrdn-23/vllm that referenced this pull request May 15, 2026
Resolve conflicts in vllm/envs.py by taking the pydantic-refactor side
wholesale: the two conflict regions are the legacy TYPE_CHECKING
annotations block and the legacy environment_variables dict, both
already superseded by the BaseSettings models on this branch.

Port the only semantic main-side envs.py change since the previous
in-branch merge of main (PR vllm-project#39337, ae4f59f): VLLM_USE_V2_MODEL_RUNNER
becomes tri-state. CompilationSettings.use_v2_model_runner is now
bool | None defaulting to None, with a before-validator that maps
"1" -> True, "0" -> False, anything else -> None, mirroring
maybe_convert_bool from main. Consumers in vllm/config/vllm.py already
handle the None case.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
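
A rough sketch of the before-validator described in the commit message above, assuming pydantic v2. The class name `CompilationSettingsSketch` is hypothetical, and a plain `BaseModel` stands in for the branch's BaseSettings models; the field and validator names follow the ones quoted in this thread.

```python
from typing import Any, Optional

from pydantic import BaseModel, field_validator


class CompilationSettingsSketch(BaseModel):
    # Tri-state: None means "not set", so the config default applies.
    use_v2_model_runner: Optional[bool] = None

    @field_validator("use_v2_model_runner", mode="before")
    @classmethod
    def _parse_use_v2_model_runner(cls, value: Any) -> Optional[bool]:
        # Pass through values that are already booleans.
        if isinstance(value, bool):
            return value
        if value == "1":
            return True
        if value == "0":
            return False
        # Anything else is treated as unset.
        return None
```

Running the validator in "before" mode matters here: it sees the raw string before pydantic's own bool coercion, which would otherwise accept values such as "true" or reject an empty string.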
@NickLucche

In hindsight we might have wanted a #ready-run-all-tests label here

njhill commented May 15, 2026

> In hindsight we might have wanted a #ready-run-all-tests label here

I did set it and the tests had passed, but we should have added it again after the final changes before merging. I'm not sure why it didn't catch the P/D issue though.

omerpaz95 pushed a commit to omerpaz95/vllm that referenced this pull request May 18, 2026
…efault [1/N] (vllm-project#39337)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
vrdn-23 added a commit to vrdn-23/vllm that referenced this pull request May 18, 2026
Resolves conflicts in vllm/envs.py introduced by the pydantic-settings
refactor. The legacy `if TYPE_CHECKING:` declaration block and
`environment_variables: dict[str, Callable]` runtime dict were dropped
wholesale — both are already superseded by the pydantic BaseSettings
model tree on this branch.

Main-side commits touching vllm/envs.py since the merge base
(256dbca..origin/main) and how each was ported:

- ae4f59f (vllm-project#39337) — VLLM_USE_V2_MODEL_RUNNER widened from `bool`
  (default False) to `bool | None` (default None). Already present on
  the branch as `use_v2_model_runner` on CompilationSettings with a
  `_parse_use_v2_model_runner` field_validator. Tri-state: unset means
  "use config default".

- 8a56da3 (vllm-project#42304) — adds VLLM_USE_BREAKABLE_CUDAGRAPH. Ported as
  `use_breakable_cudagraph: bool = False` on CompilationSettings.

- 36e74c9 (vllm-project#42689) — adds four KV-connector env vars. Ported on
  ConnectorSettings as:
    - mooncake_store_tier_log: bool = False
    - mooncake_disk_staging_usable_ratio: float = 0.9
    - preferred_segment: str | None (alias=MOONCAKE_PREFERRED_SEGMENT)
    - requester_local_hostname: str | None
      (alias=MOONCAKE_REQUESTER_LOCAL_HOSTNAME)
  The last two use `alias=` because they lack the VLLM_ prefix.

Verification:
- grep -n "<<<<<<< |>>>>>>> |=======" vllm/envs.py returns zero hits.
- pre-commit run --files vllm/envs.py passes (ruff, mypy, SPDX, the
  schema validator that enforces every field has a default and a
  docstring, etc.).
- Manual override test confirmed pydantic parses both VLLM_-prefixed
  and unprefixed env vars correctly via the registry.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
mfylcek pushed a commit to mfylcek/vllm that referenced this pull request May 19, 2026
…efault [1/N] (vllm-project#39337)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
jhu960213 pushed a commit to jhu960213/vllm that referenced this pull request May 20, 2026
…efault [1/N] (vllm-project#39337)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
@njhill njhill added the v2 label May 20, 2026
h1t35h pushed a commit to h1t35h/vllm that referenced this pull request May 21, 2026
…efault [1/N] (vllm-project#39337)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
Liuweixiong0118 pushed a commit to Liuweixiong0118/vllm that referenced this pull request Jun 1, 2026
…efault [1/N] (vllm-project#39337)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
Signed-off-by: Liuweixiong0118 <lwx34158427@gmail.com>
mvanhorn pushed a commit to mvanhorn/vllm that referenced this pull request Jun 4, 2026
…efault [1/N] (vllm-project#39337)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
andakai pushed a commit to andakai/vllm that referenced this pull request Jun 4, 2026
…efault [1/N] (vllm-project#39337)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>
knight0528 pushed a commit to knight0528/vllm that referenced this pull request Jun 8, 2026
…efault [1/N] (vllm-project#39337)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: Nick Hill <nickhill123@gmail.com>

Labels

kv-connector, nvidia, qwen, ready, structured-output, v1, v2

Projects

Status: Done

4 participants