[BugFix] Partial revert of #29558 (DeepEP HT + PIECEWISE CG support) #30910

Merged: khluu merged 6 commits into vllm-project:main from neuralmagic:lwilkinson/partial-revert on Dec 18, 2025

@LucasWilkinson LucasWilkinson commented Dec 17, 2025

Partially revert #29558 as this broke H200 tests

https://buildkite.com/vllm/ci/builds/43863#019b29e9-5c1b-4eff-83f7-c8304f774aa7

i.e.

VLLM_ALL2ALL_BACKEND=deepep_high_throughput VLLM_USE_DEEP_GEMM=1 VLLM_LOGGING_LEVEL=DEBUG python3 examples/offline_inference/data_parallel.py --model Qwen/Qwen1.5-MoE-A2.7B --tp-size=1 --dp-size=2 --max-model-len 2048

There seem to be multiple issues here, so we will try to follow up with a proper fix to restore PIECEWISE CG support:

  1. torch.compile does not support a Size as an output, meaning that the pattern:
orig_shape = hidden_states.shape
...
final_hidden_states = self.experts(  # <=== Splitting op!
   hidden_states=hidden_states, router_logits=router_logits
)
...
return final_hidden_states.view(orig_shape)

breaks torch.compile and is common in many MoE model definitions

doing:

final_hidden_states = self.experts(  # <=== Splitting op!
   hidden_states=hidden_states, router_logits=router_logits
)
...
orig_shape = hidden_states.shape
return final_hidden_states.view(orig_shape)

can fix this but requires updating all the MoE definitions (and creates a footgun)

  2. The outputs do not seem to have consistent addresses, leading to garbage outputs.
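
To make issue 1 concrete, here is a toy model in plain Python. It is only a sketch and does not reproduce torch.compile's real splitting logic; the `Stmt` class and `values_crossing_split` helper are hypothetical names invented for this illustration. It assumes that any value defined before the splitting op and read after it has to be returned by the first piece, which is where a Size would end up as an output.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Stmt:
    """One line of a forward pass: the names it defines and the names it reads."""

    defines: tuple[str, ...]
    reads: tuple[str, ...]
    is_split: bool = False  # True for the op the graph is split on (hypothetical flag)


def values_crossing_split(stmts: list[Stmt]) -> set[str]:
    """Return names defined before the splitting op and read after it.

    In this toy model, these are the values the first piece must return.
    """
    split_at = next(i for i, s in enumerate(stmts) if s.is_split)
    defined_before = {n for s in stmts[:split_at] for n in s.defines}
    read_after = {n for s in stmts[split_at + 1:] for n in s.reads}
    return defined_before & read_after


# Pattern from the description: shape captured before the splitting op.
shape_first = [
    Stmt(defines=("orig_shape",), reads=("hidden_states",)),
    Stmt(
        defines=("final_hidden_states",),
        reads=("hidden_states", "router_logits"),
        is_split=True,
    ),
    Stmt(defines=("out",), reads=("final_hidden_states", "orig_shape")),
]

# Reordered pattern: shape read after the splitting op.
shape_last = [
    Stmt(
        defines=("final_hidden_states",),
        reads=("hidden_states", "router_logits"),
        is_split=True,
    ),
    Stmt(defines=("orig_shape",), reads=("hidden_states",)),
    Stmt(defines=("out",), reads=("final_hidden_states", "orig_shape")),
]

crossing_first = values_crossing_split(shape_first)
crossing_last = values_crossing_split(shape_last)
print(crossing_first)  # {'orig_shape'}
print(crossing_last)   # set()
```

In the first ordering `orig_shape` crosses the split boundary; in the second nothing defined before the split is needed after it. This is also why the reordering is a footgun: nothing in the model definition signals that the position of the `.shape` read matters.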

Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request correctly addresses a bug introduced in a previous change by partially reverting it. The original change, which enabled piecewise CUDA graphs for the DeepEP high-throughput backend, caused issues with H200 tests. The fix is to disable CUDA graphs entirely for this specific configuration (deepep_high_throughput with data parallelism > 1), which is a safe and effective solution. The corresponding tests for the reverted feature have also been removed. The changes are clear and well-justified. I have one minor suggestion to fix a typo in a log message for better clarity.
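
The fallback described in this review can be sketched as follows. This is a hypothetical helper, not vLLM's actual code: the function name `resolve_cudagraph_mode`, its parameters, and the string mode values `"NONE"` and `"PIECEWISE"` are assumptions made for illustration. Only the backend name `deepep_high_throughput` and the data-parallel condition come from the PR.

```python
def resolve_cudagraph_mode(
    all2all_backend: str, dp_size: int, requested_mode: str
) -> str:
    """Pick the CUDA graph mode to use (hypothetical sketch).

    Falls back to no CUDA graphs when the DeepEP high-throughput backend is
    combined with data parallelism > 1; otherwise keeps the requested mode.
    """
    if all2all_backend == "deepep_high_throughput" and dp_size > 1:
        # Illustrative value standing in for "CUDA graphs disabled".
        return "NONE"
    return requested_mode


# The failing CI command in the description used this backend with --dp-size=2.
mode_failing_config = resolve_cudagraph_mode("deepep_high_throughput", 2, "PIECEWISE")
mode_single_rank = resolve_cudagraph_mode("deepep_high_throughput", 1, "PIECEWISE")
print(mode_failing_config)  # NONE
print(mode_single_rank)     # PIECEWISE
```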

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
@LucasWilkinson LucasWilkinson added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Dec 17, 2025
mergify bot commented Dec 17, 2025

Hi @LucasWilkinson, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?
mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
@LucasWilkinson LucasWilkinson changed the title from "[BugFix] Partial revert of https://github.com/vllm-project/vllm/pull/29558" to "[BugFix] Partial revert of #29558 (DeepEP HT + PIECEWISE CG support)" Dec 17, 2025
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
@yewentao256
Member

#30914
Another fix

@yewentao256 yewentao256 left a comment

Thanks for catching this! We can land this first to unblock CI, and I can fix this issue thoroughly in #30914 later

@khluu khluu merged commit 30bb19a into vllm-project:main Dec 18, 2025
48 checks passed
khluu pushed a commit that referenced this pull request Dec 18, 2025
…30910)

Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
(cherry picked from commit 30bb19a)
@yewentao256 yewentao256 deleted the lwilkinson/partial-revert branch December 18, 2025 14:45
yewentao256 added a commit that referenced this pull request Dec 18, 2025
…upport) (#30910)"

This reverts commit 30bb19a.

Signed-off-by: yewentao256 <zhyanwentao@126.com>