[Model Runner v2] Oracle for model runner v2 - qwen3 dense model by default [1/N]#39337
Conversation
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Code Review
This pull request refactors the V2 model runner configuration by centralizing its logic within the VllmConfig class. It introduces a whitelist for models that should default to the V2 runner and updates the environment variable handling to support a tri-state (None, True, False) configuration. Consequently, direct references to the environment variable across the codebase have been replaced with the new vllm_config.use_v2_model_runner property. I have no further feedback to provide.
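The resolution order the review describes (an explicit environment setting wins, otherwise a model whitelist decides) can be sketched roughly as follows. This is only an illustration, not the code from the PR: the function name `resolve_use_v2_model_runner`, the constant `V2_DEFAULT_ARCHITECTURES`, and the whitelist contents are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical whitelist for illustration only; the PR's actual list may differ.
V2_DEFAULT_ARCHITECTURES = frozenset({"Qwen3ForCausalLM"})


def resolve_use_v2_model_runner(env_value: Optional[bool], architecture: str) -> bool:
    """Resolve a tri-state setting into a final on/off decision.

    env_value is None when the environment variable is unset, otherwise the
    explicit True/False requested by the user.
    """
    # An explicit user choice always takes precedence over the default.
    if env_value is not None:
        return env_value
    # Unset: fall back to the per-model default from the whitelist.
    return architecture in V2_DEFAULT_ARCHITECTURES
```

With this shape, callers read one resolved value instead of consulting the environment variable directly, which matches the centralization the review summarizes.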
Hi @yewentao256, the pre-commit checks have failed. Please run:
uv pip install pre-commit>=4.5.1
pre-commit install
pre-commit run --all-files
Then, commit the changes and push to your branch.
Resolve conflicts in vllm/envs.py by taking the pydantic-refactor side wholesale: the two conflict regions are the legacy TYPE_CHECKING annotations block and the legacy environment_variables dict, both already superseded by the BaseSettings models on this branch.

Port the only semantic main-side envs.py change since the previous in-branch merge of main (PR vllm-project#39337, ae4f59f): VLLM_USE_V2_MODEL_RUNNER becomes tri-state. CompilationSettings.use_v2_model_runner is now bool | None defaulting to None, with a before-validator that maps "1" -> True, "0" -> False, anything else -> None, mirroring maybe_convert_bool from main. Consumers in vllm/config/vllm.py already handle the None case.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
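As a rough illustration of the mapping described in this commit message ("1" -> True, "0" -> False, anything else -> None), a plain-Python sketch might look like the following. The helper names `parse_tri_state` and `read_use_v2_model_runner` are hypothetical; this is neither the branch's validator nor `maybe_convert_bool` itself.

```python
import os
from typing import Mapping, Optional


def parse_tri_state(raw: Optional[str]) -> Optional[bool]:
    """Map "1" to True, "0" to False, and everything else (including unset) to None."""
    if raw == "1":
        return True
    if raw == "0":
        return False
    return None


def read_use_v2_model_runner(environ: Mapping[str, str] = os.environ) -> Optional[bool]:
    """Read VLLM_USE_V2_MODEL_RUNNER as a tri-state value (hypothetical helper)."""
    return parse_tri_state(environ.get("VLLM_USE_V2_MODEL_RUNNER"))
```

Passing the environment as a mapping keeps the sketch easy to exercise without mutating the real process environment.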
In hindsight we might have wanted a #ready-run-all-tests label here
I did set it and the tests had passed, but we should have added it again after the final changes before merging. I'm not sure why it didn't catch the P/D issue though.
…efault [1/N] (vllm-project#39337) Signed-off-by: yewentao256 <zhyanwentao@126.com> Signed-off-by: Nick Hill <nickhill123@gmail.com> Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com> Co-authored-by: Nick Hill <nickhill123@gmail.com>
Resolves conflicts in vllm/envs.py introduced by the pydantic-settings refactor. The legacy `if TYPE_CHECKING:` declaration block and `environment_variables: dict[str, Callable]` runtime dict were dropped wholesale; both are already superseded by the pydantic BaseSettings model tree on this branch.

Main-side commits touching vllm/envs.py since the merge base (256dbca..origin/main) and how each was ported:
- ae4f59f (vllm-project#39337): VLLM_USE_V2_MODEL_RUNNER widened from `bool` (default False) to `bool | None` (default None). Already present on the branch as `use_v2_model_runner` on CompilationSettings with a `_parse_use_v2_model_runner` field_validator. Tri-state: unset means "use config default".
- 8a56da3 (vllm-project#42304): adds VLLM_USE_BREAKABLE_CUDAGRAPH. Ported as `use_breakable_cudagraph: bool = False` on CompilationSettings.
- 36e74c9 (vllm-project#42689): adds four KV-connector env vars. Ported on ConnectorSettings as:
  - mooncake_store_tier_log: bool = False
  - mooncake_disk_staging_usable_ratio: float = 0.9
  - preferred_segment: str | None (alias=MOONCAKE_PREFERRED_SEGMENT)
  - requester_local_hostname: str | None (alias=MOONCAKE_REQUESTER_LOCAL_HOSTNAME)
  The last two use `alias=` because they lack the VLLM_ prefix.

Verification:
- grep -n "<<<<<<< |>>>>>>> |=======" vllm/envs.py returns zero hits.
- pre-commit run --files vllm/envs.py passes (ruff, mypy, SPDX, the schema validator that enforces every field has a default and a docstring, etc.).
- Manual override test confirmed pydantic parses both VLLM_-prefixed and unprefixed env vars correctly via the registry.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
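To make the prefix-versus-alias distinction concrete, here is a small plain-Python stand-in. It is not pydantic and not the branch's registry: `env_name` and `read_field` are hypothetical helpers, and the rule "prefix plus upper-cased field name unless an alias is given" is an assumption inferred from the commit message above.

```python
from typing import Mapping, Optional

ENV_PREFIX = "VLLM_"  # assumed default prefix for settings fields


def env_name(field: str, alias: Optional[str] = None) -> str:
    """Return the environment variable name a settings field is read from."""
    # An alias replaces the derived name entirely, so no prefix is applied.
    if alias is not None:
        return alias
    return ENV_PREFIX + field.upper()


def read_field(
    environ: Mapping[str, str], field: str, alias: Optional[str] = None
) -> Optional[str]:
    """Look up the raw string for a field, or None when it is unset."""
    return environ.get(env_name(field, alias))
```

For example, `env_name("use_breakable_cudagraph")` yields `VLLM_USE_BREAKABLE_CUDAGRAPH`, while `env_name("preferred_segment", alias="MOONCAKE_PREFERRED_SEGMENT")` yields the unprefixed name.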
…efault [1/N] (vllm-project#39337) Signed-off-by: yewentao256 <zhyanwentao@126.com> Signed-off-by: Nick Hill <nickhill123@gmail.com> Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com> Co-authored-by: Nick Hill <nickhill123@gmail.com> Signed-off-by: Liuweixiong0118 <lwx34158427@gmail.com>
…efault [1/N] (vllm-project#39337) Signed-off-by: yewentao256 <zhyanwentao@126.com> Signed-off-by: Nick Hill <nickhill123@gmail.com> Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com> Co-authored-by: Nick Hill <nickhill123@gmail.com> Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
Purpose
Oracle for model runner v2 - dense model by default
Now the env function:
We test "Qwen/Qwen3-0.6B" and "facebook/opt-125m" because they cover most of the current v1 unit tests.
Should land after #39353
Test
Covered by unit tests.