[Model Runner V2] Support stock torch compile for v2 by yewentao256 · Pull Request #41667 · vllm-project/vllm

yewentao256 · 2026-05-04T21:37:26Z

Purpose

Support stock torch compile for v2

Part of #41286

Originally

FAILED

========================================== FAILURES ===========================================
__________________________________ test_stock_torch_compile ___________________________________

vllm_runner = <class 'tests.conftest.VllmRunner'>
monkeypatch = <_pytest.monkeypatch.MonkeyPatch object at 0x7fab9a12e570>

    @pytest.mark.forked
    def test_stock_torch_compile(vllm_runner, monkeypatch):
        # Disable multiprocessing so that the counter is in the same process
        monkeypatch.setenv("VLLM_ENABLE_V1_MULTIPROCESSING", "0")
    
        with (
>           compilation_counter.expect(stock_torch_compile_count=1),
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            # loading the model causes compilation (if enabled) to happen
            vllm_runner(
                "facebook/opt-125m",
                compilation_config={"mode": CompilationMode.STOCK_TORCH_COMPILE},
                gpu_memory_utilization=0.4,
            ) as _,
        ):

tests/compile/test_config.py:164: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../.local/share/uv/python/cpython-3.12.13-linux-x86_64-gnu/lib/python3.12/contextlib.py:144: in __exit__
    next(self.gen)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = CompilationCounter(num_models_seen=0, num_graphs_seen=0, num_piecewise_graphs_seen=0, num_piecewise_capturable_graphs_...facts_loaded=0, num_aot_compiles=0, num_aot_artifacts_saved=0, num_aot_artifacts_loaded=0, stock_torch_compile_count=0)
kwargs = {'stock_torch_compile_count': 1}
old = CompilationCounter(num_models_seen=0, num_graphs_seen=0, num_piecewise_graphs_seen=0, num_piecewise_capturable_graphs_...facts_loaded=0, num_aot_compiles=0, num_aot_artifacts_saved=0, num_aot_artifacts_loaded=0, stock_torch_compile_count=0)
k = 'stock_torch_compile_count', v = 1

    @contextmanager
    def expect(self, **kwargs: Any) -> Generator[None, None, None]:
        old = self.clone()
        yield
        for k, v in kwargs.items():
>           assert getattr(self, k) - getattr(old, k) == v, (
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                f"{k} not as expected, before it is {getattr(old, k)}"
                f", after it is {getattr(self, k)}, "
                f"expected diff is {v}"
            )
E           AssertionError: stock_torch_compile_count not as expected, before it is 0, after it is 0, expected diff is 1

vllm/compilation/counter.py:51: AssertionError

FAILED tests/compile/test_config.py::test_stock_torch_compile - AssertionError: stock_torch_compile_count not as expected, before it is 0, after it is 0, ...
======================= 1 failed, 38 deselected, 22 warnings in 13.08s ========================

Now

======================== 1 passed, 38 deselected, 22 warnings in 9.56s ========================
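The assertion in the failing run comes from the counter's `expect` context manager, which snapshots the counter, runs the body, and then compares per-field deltas. A minimal stand-alone sketch of that pattern follows, assuming a hypothetical single-field `Counter` dataclass rather than vLLM's actual `CompilationCounter`:

```python
import copy
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Any, Generator


@dataclass
class Counter:
    # Hypothetical single-field stand-in for the real counter.
    stock_torch_compile_count: int = 0

    def clone(self) -> "Counter":
        return copy.deepcopy(self)

    @contextmanager
    def expect(self, **kwargs: Any) -> Generator[None, None, None]:
        old = self.clone()  # snapshot taken before the body runs
        yield
        # After the body: every named field must have grown by exactly v.
        for k, v in kwargs.items():
            diff = getattr(self, k) - getattr(old, k)
            assert diff == v, f"{k}: expected diff {v}, got {diff}"


counter = Counter()

# Passes: the body increments the counter exactly once.
with counter.expect(stock_torch_compile_count=1):
    counter.stock_torch_compile_count += 1

# Fails like the original run: the body never touches the counter.
try:
    with counter.expect(stock_torch_compile_count=1):
        pass
    failed = False
except AssertionError:
    failed = True
```

If the code path that increments the counter is never reached, the delta stays at 0 and the check fails on exit from the `with` block, which matches the "before it is 0, after it is 0" message above.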

Signed-off-by: yewentao256 <zhyanwentao@126.com>


mergify · 2026-05-04T21:38:03Z

Hi @yewentao256, the pre-commit checks have failed. Please run:

uv pip install "pre-commit>=4.5.1"
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?

mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:

# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10

gemini-code-assist

Code Review

This pull request introduces support for STOCK_TORCH_COMPILE within the GPU model runner, including an environment patch and model compilation logic. A critical issue was identified where the compilation call is incorrectly implemented as an in-place method on the model rather than using torch.compile and capturing the returned optimized module. Additionally, it is recommended to move the compilation block earlier in the initialization sequence so that dependent components utilize the compiled version of the model.

njhill · 2026-05-04T23:44:17Z

Would like to get @WoosukKwon's opinion on this one


njhill · 2026-05-07T16:15:05Z

@yewentao256 test failures are related - looks like some test mocks need updating


yewentao256

@njhill Thanks! Fixed


njhill · 2026-05-12T14:26:26Z

From discussion with @WoosukKwon, we may not add this to MRV2 for now. @yewentao256 has updated the Oracle logic to reflect this.

yewentao256 · 2026-05-21T19:59:22Z

Closing this PR, as we are not going to support stock torch compile for v2.

model runner v2 support stock torch compile

d5faa25

Signed-off-by: yewentao256 <zhyanwentao@126.com>

yewentao256 requested review from WoosukKwon and njhill as code owners May 4, 2026 21:37

claude Bot reviewed May 4, 2026


yewentao256 added the ready label (ONLY add when PR is ready to merge/full CI is needed) May 4, 2026

yewentao256 mentioned this pull request May 4, 2026

[Feature]: Migration from Model Runner v1 to Model Runner v2 #41286

Open

30 tasks

mergify Bot added the v1 label May 4, 2026

gemini-code-assist Bot reviewed May 4, 2026


Comment thread vllm/v1/worker/gpu/model_runner.py

yewentao256 added 2 commits May 5, 2026 09:10

Merge branch 'main' into wentao-model-runner-v2-support-stock-torch-c…

cda7809

…ompile

Merge branch 'main' into wentao-model-runner-v2-support-stock-torch-c…

64a2838

…ompile

fix CI

2e75125

Signed-off-by: yewentao256 <zhyanwentao@126.com>

yewentao256 commented May 7, 2026


yewentao256 added 2 commits May 7, 2026 15:44

Merge branch 'main' into wentao-model-runner-v2-support-stock-torch-c…

f5c87e9

…ompile

Merge branch 'main' into wentao-model-runner-v2-support-stock-torch-c…

3034094

…ompile

njhill added the v2 label May 20, 2026

yewentao256 closed this May 21, 2026

yewentao256 wants to merge 6 commits into main from wentao-model-runner-v2-support-stock-torch-compile