Skip to content

[Model Runner V2] Support stock torch compile for v2#41667

Closed
yewentao256 wants to merge 6 commits into
mainfrom
wentao-model-runner-v2-support-stock-torch-compile
Closed

[Model Runner V2] Support stock torch compile for v2#41667
yewentao256 wants to merge 6 commits into
mainfrom
wentao-model-runner-v2-support-stock-torch-compile

Conversation

@yewentao256

Copy link
Copy Markdown
Member

Purpose

Support stock torch compile for v2

Part of the #41286

Originally

FAILED

========================================== FAILURES ===========================================
__________________________________ test_stock_torch_compile ___________________________________

vllm_runner = <class 'tests.conftest.VllmRunner'>
monkeypatch = <_pytest.monkeypatch.MonkeyPatch object at 0x7fab9a12e570>

    @pytest.mark.forked
    def test_stock_torch_compile(vllm_runner, monkeypatch):
        # Disable multiprocessing so that the counter is in the same process
        monkeypatch.setenv("VLLM_ENABLE_V1_MULTIPROCESSING", "0")
    
        with (
>           compilation_counter.expect(stock_torch_compile_count=1),
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            # loading the model causes compilation (if enabled) to happen
            vllm_runner(
                "facebook/opt-125m",
                compilation_config={"mode": CompilationMode.STOCK_TORCH_COMPILE},
                gpu_memory_utilization=0.4,
            ) as _,
        ):

tests/compile/test_config.py:164: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../.local/share/uv/python/cpython-3.12.13-linux-x86_64-gnu/lib/python3.12/contextlib.py:144: in __exit__
    next(self.gen)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = CompilationCounter(num_models_seen=0, num_graphs_seen=0, num_piecewise_graphs_seen=0, num_piecewise_capturable_graphs_...facts_loaded=0, num_aot_compiles=0, num_aot_artifacts_saved=0, num_aot_artifacts_loaded=0, stock_torch_compile_count=0)
kwargs = {'stock_torch_compile_count': 1}
old = CompilationCounter(num_models_seen=0, num_graphs_seen=0, num_piecewise_graphs_seen=0, num_piecewise_capturable_graphs_...facts_loaded=0, num_aot_compiles=0, num_aot_artifacts_saved=0, num_aot_artifacts_loaded=0, stock_torch_compile_count=0)
k = 'stock_torch_compile_count', v = 1

    @contextmanager
    def expect(self, **kwargs: Any) -> Generator[None, None, None]:
        old = self.clone()
        yield
        for k, v in kwargs.items():
>           assert getattr(self, k) - getattr(old, k) == v, (
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                f"{k} not as expected, before it is {getattr(old, k)}"
                f", after it is {getattr(self, k)}, "
                f"expected diff is {v}"
            )
E           AssertionError: stock_torch_compile_count not as expected, before it is 0, after it is 0, expected diff is 1

vllm/compilation/counter.py:51: AssertionError

FAILED tests/compile/test_config.py::test_stock_torch_compile - AssertionError: stock_torch_compile_count not as expected, before it is 0, after it is 0, ...
======================= 1 failed, 38 deselected, 22 warnings in 13.08s ========================

Now

======================== 1 passed, 38 deselected, 22 warnings in 9.56s ========================

Signed-off-by: yewentao256 <zhyanwentao@126.com>

@claude claude Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.

Tip: disable this comment in your organization's Code Review settings.

@yewentao256 yewentao256 added the ready ONLY add when PR is ready to merge/full CI is needed label May 4, 2026
@mergify

mergify Bot commented May 4, 2026

Copy link
Copy Markdown
Contributor

Hi @yewentao256, the pre-commit checks have failed. Please run:

uv pip install pre-commit>=4.5.1
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?
mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10

@mergify mergify Bot added the v1 label May 4, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for STOCK_TORCH_COMPILE within the GPU model runner, including an environment patch and model compilation logic. A critical issue was identified where the compilation call is incorrectly implemented as an in-place method on the model rather than using torch.compile and capturing the returned optimized module. Additionally, it is recommended to move the compilation block earlier in the initialization sequence so that dependent components utilize the compiled version of the model.

Comment thread vllm/v1/worker/gpu/model_runner.py
@njhill

njhill commented May 4, 2026

Copy link
Copy Markdown
Member

Would like to get @WoosukKwon's opinion on this one

@njhill

njhill commented May 7, 2026

Copy link
Copy Markdown
Member

@yewentao256 test failures are related - looks like some test mocks need updating

Signed-off-by: yewentao256 <zhyanwentao@126.com>

@yewentao256 yewentao256 left a comment

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@njhill Thanks! FIxed

@njhill

njhill commented May 12, 2026

Copy link
Copy Markdown
Member

From discussion with @WoosukKwon we may not add this to MRV2 for now. @yewentao256 has update the Oracle logic to reflect this.

@njhill njhill added the v2 label May 20, 2026
@yewentao256

Copy link
Copy Markdown
Member Author

Close this PR as we are not going to support stock torch compile for v2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready ONLY add when PR is ready to merge/full CI is needed v1 v2

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants