
[Model Runner v2] Support reload weights (sleep mode) #42673

Merged
njhill merged 3 commits into main from wentao-mrv2-reload-weights on May 15, 2026

Conversation

@yewentao256

Member

Purpose

Part of #41286

VLLM_USE_V2_MODEL_RUNNER=1 pytest tests/basic_correctness/test_cumem.py::test_deep_sleep

Originally

(EngineCore pid=2663266) ERROR 05-14 19:13:43 [core.py:1360]   File "/home/yewentao256/vllm-source/vllm/v1/executor/uniproc_executor.py", line 93, in collective_rpc
(EngineCore pid=2663266) ERROR 05-14 19:13:43 [core.py:1360]     result = run_method(self.driver_worker, method, args, kwargs)
(EngineCore pid=2663266) ERROR 05-14 19:13:43 [core.py:1360]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=2663266) ERROR 05-14 19:13:43 [core.py:1360]   File "/home/yewentao256/vllm-source/vllm/v1/serial_utils.py", line 510, in run_method
(EngineCore pid=2663266) ERROR 05-14 19:13:43 [core.py:1360]     return func(*args, **kwargs)
(EngineCore pid=2663266) ERROR 05-14 19:13:43 [core.py:1360]            ^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=2663266) ERROR 05-14 19:13:43 [core.py:1360]   File "/home/yewentao256/vllm-source/vllm/v1/worker/gpu_worker.py", line 351, in reload_weights
(EngineCore pid=2663266) ERROR 05-14 19:13:43 [core.py:1360]     self.model_runner.reload_weights(*args, **kwargs)
(EngineCore pid=2663266) ERROR 05-14 19:13:43 [core.py:1360]     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=2663266) ERROR 05-14 19:13:43 [core.py:1360] AttributeError: 'GPUModelRunner' object has no attribute 'reload_weights'

Now

======================================== 1 passed, 17 warnings in 45.19s =========================================

Signed-off-by: yewentao256 <zhyanwentao@126.com>
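The failure in the traceback above can be reproduced in miniature. This is only a sketch with hypothetical stand-in classes (`ToyWorker`, `ToyRunnerV1`, `ToyRunnerV2`, and `try_reload` are not vLLM names): a worker forwards `reload_weights` to whichever runner it holds, so a runner that does not define the method fails with the same kind of `AttributeError`.

```python
class ToyRunnerV1:
    """Hypothetical stand-in for a runner that already implements reload_weights."""

    def __init__(self):
        self.reload_count = 0

    def reload_weights(self, *args, **kwargs):
        # Count calls instead of touching real weights.
        self.reload_count += 1


class ToyRunnerV2:
    """Hypothetical stand-in for a runner that does not define reload_weights."""


class ToyWorker:
    """Forwards reload_weights to its runner, like the worker frame in the traceback."""

    def __init__(self, model_runner):
        self.model_runner = model_runner

    def reload_weights(self, *args, **kwargs):
        self.model_runner.reload_weights(*args, **kwargs)


def try_reload(worker):
    """Return None on success, or the AttributeError message on failure."""
    try:
        worker.reload_weights()
    except AttributeError as exc:
        return str(exc)
    return None


v1_error = try_reload(ToyWorker(ToyRunnerV1()))
v2_error = try_reload(ToyWorker(ToyRunnerV2()))
print(v1_error)
print(v2_error)
```

The worker code is identical in both cases; only the runner behind it differs, which is why the error surfaces only when the V2 runner is selected.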


@yewentao256 yewentao256 added the ready label (ONLY add when PR is ready to merge/full CI is needed) May 14, 2026
@mergify mergify Bot added the v1 label May 14, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces a reload_weights method to the GPU model runner in the v1 worker, which delegates the reloading process to the GPUModelRunnerV1 implementation. Feedback indicates that the method should also reset the encoder and multimodal caches to ensure that stale embeddings are not used following a weight update.
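One way the delegation described in this summary could look, as a rough sketch only: the classes below are hypothetical stand-ins, not the actual vLLM runners, and the assumption is that both runners share enough state for the V1 function to operate on a V2 instance.

```python
class ToyRunnerV1:
    """Hypothetical stand-in for the runner that already supports reloading."""

    def __init__(self):
        self.weights_version = 0

    def reload_weights(self, *args, **kwargs):
        # Pretend to reload weights in place by bumping a version counter.
        self.weights_version += 1
        return self.weights_version


class ToyRunnerV2:
    """Hypothetical stand-in for the newer runner with the same state layout."""

    def __init__(self):
        self.weights_version = 0

    def reload_weights(self, *args, **kwargs):
        # Delegate to the V1 function, passing this instance explicitly as self.
        return ToyRunnerV1.reload_weights(self, *args, **kwargs)


runner = ToyRunnerV2()
first = runner.reload_weights()
second = runner.reload_weights()
print(first, second)
```

Calling the V1 function with a V2 instance avoids duplicating the reload logic, at the cost of coupling the V2 runner to whatever attributes the V1 function reads and writes.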

Comment thread vllm/v1/worker/gpu/model_runner.py
yewentao256 and others added 2 commits May 14, 2026 16:15
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
@njhill njhill enabled auto-merge (squash) May 15, 2026 16:38
@njhill njhill merged commit 6147c70 into main May 15, 2026
71 checks passed
@njhill njhill deleted the wentao-mrv2-reload-weights branch May 15, 2026 16:41
@njhill

njhill commented May 15, 2026

Member

@yewentao256 I didn't realize you had added these until after the PR was merged. I don't think we should change this behavior. Folks using this API would already be resetting the caches separately when needed.

        self.reset_encoder_cache()
        self.reset_mm_cache()
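The caller-managed pattern described in this comment can be sketched as follows. Only the method names `reload_weights`, `reset_encoder_cache`, and `reset_mm_cache` come from this thread; the `ToyRunner` class, its attributes, and the cache contents are hypothetical and exist purely for illustration.

```python
class ToyRunner:
    """Hypothetical runner whose reload_weights leaves caches untouched."""

    def __init__(self):
        self.weights_version = 0
        self.encoder_cache = {"img-0": "embedding@v0"}
        self.mm_cache = {"img-0": "features@v0"}

    def reload_weights(self):
        # Only the weights change; cached entries are not cleared here.
        self.weights_version += 1

    def reset_encoder_cache(self):
        self.encoder_cache.clear()

    def reset_mm_cache(self):
        self.mm_cache.clear()


runner = ToyRunner()
runner.reload_weights()
stale_entries = len(runner.encoder_cache) + len(runner.mm_cache)

# The caller decides that the new weights invalidate the cached entries
# and resets both caches explicitly.
runner.reset_encoder_cache()
runner.reset_mm_cache()
remaining_entries = len(runner.encoder_cache) + len(runner.mm_cache)
print(stale_entries, remaining_entries)
```

Keeping the resets outside `reload_weights` leaves the decision with the caller, for example when the reloaded weights are identical to the previous ones (as when waking from sleep) and the cached entries are still valid.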

omerpaz95 pushed a commit to omerpaz95/vllm that referenced this pull request May 18, 2026
…2673)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
mfylcek pushed a commit to mfylcek/vllm that referenced this pull request May 19, 2026
…2673)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@njhill njhill added the v2 label May 20, 2026
h1t35h pushed a commit to h1t35h/vllm that referenced this pull request May 21, 2026
…2673)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Liuweixiong0118 pushed a commit to Liuweixiong0118/vllm that referenced this pull request Jun 1, 2026
…2673)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Liuweixiong0118 <lwx34158427@gmail.com>
mvanhorn pushed a commit to mvanhorn/vllm that referenced this pull request Jun 4, 2026
…2673)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
andakai pushed a commit to andakai/vllm that referenced this pull request Jun 4, 2026
…2673)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
knight0528 pushed a commit to knight0528/vllm that referenced this pull request Jun 8, 2026
…2673)

Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>