
[TPU][Core] Enable Pipeline Parallelism on TPU backend #28506

Merged
yaochengji merged 8 commits into vllm-project:main from Chenyaaang:pp-vllm on Jan 16, 2026
Conversation

@Chenyaaang

@Chenyaaang Chenyaaang commented Nov 12, 2025

Contributor

Enable Pipeline Parallelism on TPU backend

This PR includes the following changes on the vLLM side:

  • multiproc_executor.py: Extract the _get_parallel_sizes and _post_init_executor methods so that TPU (and other hardware backends) can override them if necessary.
  • ray_utils.py: Extract _is_intermediate_tensors so that TPU can override it to accept the JAX version of IntermediateTensor.
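The extraction described above follows a template-method pattern: the base executor calls small hook methods, and a hardware-specific subclass replaces only those hooks. Below is a minimal sketch of that idea. Only the hook names `_get_parallel_sizes` and `_post_init_executor` come from the description; the class names, the config dataclass, and the values returned by the subclass are hypothetical and are not the actual vLLM or tpu-inference code.

```python
from dataclasses import dataclass


@dataclass
class ParallelConfig:
    """Hypothetical stand-in for the real parallel configuration."""
    tensor_parallel_size: int = 1
    pipeline_parallel_size: int = 1


class BaseExecutor:
    """Hypothetical base executor that exposes overridable hooks."""

    def __init__(self, config: ParallelConfig) -> None:
        self.config = config
        # Hook 1: subclasses may decide how many workers to spawn.
        self.tp_size, self.pp_size = self._get_parallel_sizes()
        self.world_size = self.tp_size * self.pp_size
        self.post_init_ran = False
        # Hook 2: subclasses may add backend-specific setup.
        self._post_init_executor()

    def _get_parallel_sizes(self) -> tuple[int, int]:
        # Default: read both sizes straight from the config.
        return (self.config.tensor_parallel_size,
                self.config.pipeline_parallel_size)

    def _post_init_executor(self) -> None:
        # Default: only record that the hook ran.
        self.post_init_ran = True


class TpuLikeExecutor(BaseExecutor):
    """Hypothetical subclass that spawns one worker per pipeline stage."""

    def _get_parallel_sizes(self) -> tuple[int, int]:
        # Assumption for illustration only: tensor parallelism is handled
        # inside each worker, so the executor itself sees tp_size == 1.
        return 1, self.config.pipeline_parallel_size
```

With `ParallelConfig(tensor_parallel_size=2, pipeline_parallel_size=4)`, the base class computes a world size of 8 while the subclass computes 4, without touching the base constructor.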

The command to enable PP on the TPU platform is the same as on other platforms. However, PP is not yet supported on all JAX models, so set the environment variable MODEL_IMPL_TYPE=vllm to use the PyTorch implementation. PP can be used on a single host or on multiple hosts (with Ray).

Example command: MODEL_IMPL_TYPE=vllm TPU_BACKEND_TYPE=jax vllm serve Qwen/Qwen3-32B --pipeline-parallel-size 4
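For scripted launches, the same invocation can be assembled programmatically. This is a small sketch that only builds the environment and argument list from the example command; the helper name `build_pp_serve_command` is hypothetical, and nothing here is part of vLLM itself.

```python
import os
import shlex


def build_pp_serve_command(model: str, pp_size: int) -> tuple[dict[str, str], list[str]]:
    """Return (env, argv) mirroring the example command in the PR description."""
    if pp_size < 1:
        raise ValueError("pipeline parallel size must be at least 1")
    env = dict(os.environ)
    # Environment variables taken verbatim from the example command.
    env["MODEL_IMPL_TYPE"] = "vllm"
    env["TPU_BACKEND_TYPE"] = "jax"
    argv = ["vllm", "serve", model, "--pipeline-parallel-size", str(pp_size)]
    return env, argv


def format_command(argv: list[str]) -> str:
    """Render argv as a shell-safe string for logging."""
    return shlex.join(argv)
```

To actually start the server, the result could be passed to `subprocess.run(argv, env=env, check=True)` on a machine where vLLM and the TPU backend are installed.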

@mergify mergify Bot added the v1 label Nov 12, 2025
@Chenyaaang Chenyaaang force-pushed the pp-vllm branch 2 times, most recently from 0af58b8 to 9c427be Compare December 16, 2025 00:02
@Chenyaaang Chenyaaang changed the title Enable PP on tpu_inference Enable Pipeline Parallelism on TPU backend Dec 16, 2025
@Chenyaaang Chenyaaang marked this pull request as ready for review December 16, 2025 19:53
@chatgpt-codex-connector


Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

@Chenyaaang Chenyaaang changed the title Enable Pipeline Parallelism on TPU backend [TPU][Core] Enable Pipeline Parallelism on TPU backend Dec 16, 2025
Comment thread vllm/v1/executor/multiproc_executor.py Outdated

@yaochengji yaochengji left a comment

Collaborator


Thanks for your contribution! My main concern is that the modification looks a little intrusive to me. I'm wondering if we can add a subclass in the tpu-inference repo.

Comment thread vllm/v1/executor/multiproc_executor.py Outdated
Comment thread vllm/v1/executor/multiproc_executor.py Outdated
Comment thread vllm/v1/executor/multiproc_executor.py Outdated
@mergify

mergify Bot commented Jan 6, 2026

Contributor

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @Chenyaaang.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@Chenyaaang

Contributor Author

> Thanks for your contribution! My main concern is that the modification looks a little intrusive to me. I'm wondering if we can add a subclass in the tpu-inference repo.

Here's the PR that creates the subclass in tpu_inference: vllm-project/tpu-inference#1401

@yaochengji yaochengji added the ready ONLY add when PR is ready to merge/full CI is needed label Jan 13, 2026
@yaochengji

Collaborator

@mgoin What do you think about the PR? vLLM TPU needs to override these three methods (_get_parallel_sizes, _is_driver_worker, _is_intermediate_tensors) to support pipeline parallelism.
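To illustrate why a hook such as `_is_intermediate_tensors` helps, here is a minimal sketch of an overridable type check. Only the method name comes from this thread; the wrapper classes, the two tensor container classes, and the `route_output` helper are hypothetical stand-ins, not the real vLLM or tpu-inference types.

```python
class TorchIntermediateTensors:
    """Hypothetical stand-in for the PyTorch-side container."""

    def __init__(self, tensors: dict) -> None:
        self.tensors = tensors


class JaxIntermediateTensors:
    """Hypothetical stand-in for the JAX-side container."""

    def __init__(self, tensors: dict) -> None:
        self.tensors = tensors


class BaseWorkerWrapper:
    """Hypothetical wrapper that decides where a stage's output goes."""

    def _is_intermediate_tensors(self, output: object) -> bool:
        # Default check only knows about the PyTorch container.
        return isinstance(output, TorchIntermediateTensors)

    def route_output(self, output: object) -> str:
        # Intermediate tensors are forwarded to the next pipeline stage;
        # anything else is treated as the final model output.
        return "next_stage" if self._is_intermediate_tensors(output) else "final"


class TpuWorkerWrapper(BaseWorkerWrapper):
    """Hypothetical TPU wrapper that also accepts the JAX container."""

    def _is_intermediate_tensors(self, output: object) -> bool:
        return isinstance(
            output, (TorchIntermediateTensors, JaxIntermediateTensors))
```

Because the routing logic calls the hook rather than a hard-coded `isinstance` check, the subclass changes one small method and inherits everything else.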

@yaochengji

Collaborator

@Chenyaaang could you please update your description? get_pp_group doesn't exist in the latest version.

Signed-off-by: Chenyaaang <chenyangli@google.com>

llama debug

Signed-off-by: Chenyaaang <chenyangli@google.com>

core debug

Signed-off-by: Chenyaaang <chenyangli@google.com>

pp for single host

Signed-off-by: Chenyaaang <chenyangli@google.com>

pp single host

Signed-off-by: Chenyaaang <chenyangli@google.com>

pp single host comment

Signed-off-by: Chenyaaang <chenyangli@google.com>

amend single host

Signed-off-by: Chenyaaang <chenyangli@google.com>

single host

Signed-off-by: Chenyaaang <chenyangli@google.com>
Signed-off-by: Chenyaaang <chenyangli@google.com>

amend ray

Signed-off-by: Chenyaaang <chenyangli@google.com>
Signed-off-by: Chenyaaang <chenyangli@google.com>
…circular import

Signed-off-by: Chenyaaang <chenyangli@google.com>
Signed-off-by: Chenyaaang <chenyangli@google.com>
…erride this method

Signed-off-by: Chenyaaang <chenyangli@google.com>
Signed-off-by: Chenyaaang <chenyangli@google.com>
Signed-off-by: Chenyaaang <chenyangli@google.com>

@yaochengji yaochengji left a comment

Collaborator


LGTM, thanks!

@yaochengji yaochengji merged commit 484e22b into vllm-project:main Jan 16, 2026
45 checks passed
wangxiyuan pushed a commit to vllm-project/vllm-ascend that referenced this pull request Jan 27, 2026
### What this PR does / why we need it?

1. ✅ Upgrade vllm commit to: 0115 (8471b27)
   Modify import paths due to the refactors: vllm-project/vllm#32245, vllm-project/vllm#32060
   Test result: https://github.com/vllm-project/vllm-ascend/actions/runs/21034239336/job/60490156965?pr=5913
2. ✅ Upgrade vllm commit to: 0119 (9a1f16d)
   Fix `WorkerProc.__init__() missing 1 required positional argument: 'is_driver_worker'` due to vllm-project/vllm#28506
   Test result: https://github.com/vllm-project/vllm-ascend/actions/runs/21156263050/job/60841668755?5569
3. ✅ Upgrade vllm commit to: 0120 (148117e)
   1. Add `skip_compiled` param in `set_forward_context` due to vllm-project/vllm#30385
   2. Modify `tests/ut/spec_decode/test_eagle_proposer.py` due to vllm-project/vllm#24322 (change: `self.max_num_tokens = vllm_config.scheduler_config.max_num_batched_tokens + max_batch_size`)
   3. Modify UT import paths due to the refactors: vllm-project/vllm#32060
   Test result: https://github.com/vllm-project/vllm-ascend/actions/runs/21204851770/job/60999046946
4. ✅ Upgrade vllm commit to: 0121 (f23fb5a)
   1. vLLM switched `uses_mrope` from target to draft model config, making `positions`/`mrope_positions` mutually exclusive, breaking vllm-ascend's direct `self.positions` access and tests missing `draft_model_config.uses_mrope`. vllm-project/vllm#32048
   2. Moved `bs_to_padded_graph_size` from `CompilationConfig` to `CudagraphDispatcher` due to the refactor vllm-project/vllm#30143
   3. Remove unused `maybe_setup_kv_connector` due to vllm-project/vllm#32077
   Test result: https://github.com/vllm-project/vllm-ascend/actions/runs/21217728738/job/61043738834
6. ✅ Upgrade vllm commit to: 0122 (8ebf271)
   Updating `FusedMoEParallelConfig` (added `enable_eplb`) and `FusedMoEConfig` due to vllm-project/vllm#32414
   Test result: https://github.com/vllm-project/vllm-ascend/actions/runs/21249922546/job/61148613054
8. ✅ Upgrade vllm commit to: 0123 (dc917cc)
   Setting `temperature=0.0` due to the removal of the default temperature value in vllm-project/vllm#32723
   Test result: https://github.com/vllm-project/vllm-ascend/actions/runs/21280796875

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.14.0
- vLLM main: vllm-project/vllm@d682094

---------

Signed-off-by: wjunLu <wjunlu217@gmail.com>
Signed-off-by: Meihan-chen <jcccx.cmh@gmail.com>
Co-authored-by: wjunLu <wjunlu217@gmail.com>
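The `WorkerProc.__init__()` error quoted in the commit message above is the usual symptom when an upstream constructor gains a required parameter and a downstream caller still uses the old call shape. The sketch below reproduces that failure mode with a hypothetical class. It is not the real `WorkerProc`; only the parameter name `is_driver_worker` comes from the error message, and choosing rank 0 as the driver is an assumption made purely for illustration.

```python
class WorkerProcSketch:
    """Hypothetical stand-in whose constructor gained a required argument."""

    def __init__(self, rank: int, is_driver_worker: bool) -> None:
        self.rank = rank
        self.is_driver_worker = is_driver_worker


def make_worker_old_style(rank: int) -> WorkerProcSketch:
    # Old call shape: omits the new argument, so Python raises TypeError.
    return WorkerProcSketch(rank)


def make_worker_fixed(rank: int) -> WorkerProcSketch:
    # Fixed call shape: passes the new argument explicitly.
    # Assumption for illustration: rank 0 acts as the driver worker.
    return WorkerProcSketch(rank, is_driver_worker=(rank == 0))
```

Calling `make_worker_old_style(0)` raises a `TypeError` that names the missing `is_driver_worker` argument, while `make_worker_fixed(0)` constructs the worker normally.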
starmountain1997 pushed a commit to starmountain1997/vllm-ascend that referenced this pull request Jan 31, 2026
starmountain1997 pushed a commit to starmountain1997/vllm-ascend that referenced this pull request Jan 31, 2026
chenchuw886 pushed a commit to chenchuw886/vllm-ascend that referenced this pull request Feb 12, 2026
Signed-off-by: momochenchuw <chenchuw@huawei.com>
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Feb 28, 2026
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
maoxx241 pushed a commit to maoxx241/vllm-ascend that referenced this pull request Mar 2, 2026
ZRJ026 pushed a commit to ZRJ026/vllm-ascend that referenced this pull request Mar 4, 2026
Signed-off-by: zrj026 <zhangrunjiang026@gmail.com>
LCAIZJ pushed a commit to LCAIZJ/vllm-ascend that referenced this pull request Mar 7, 2026
jiangyunfan1 pushed a commit to jiangyunfan1/vllm-ascend that referenced this pull request Apr 9, 2026
yangzhe-2026 pushed a commit to yangzhe-2026/vllm-ascend that referenced this pull request May 6, 2026
mystous pushed a commit to mystous/vllm_hybrid that referenced this pull request May 10, 2026
nanxingMy pushed a commit to nanxingMy/vllm-ascend that referenced this pull request May 15, 2026
Signed-off-by: nanxing <1014662416@qq.com>
my-other-github-account pushed a commit to my-other-github-account/vllm that referenced this pull request May 15, 2026
my-other-github-account pushed a commit to my-other-github-account/vllm that referenced this pull request May 15, 2026
0826joyce pushed a commit to 0826joyce/vllm-serving-optimization that referenced this pull request May 19, 2026

Labels

ready ONLY add when PR is ready to merge/full CI is needed v1


3 participants