[BugFix] Partial revert of #29558 (DeepEP HT + PIECEWISE CG support) #30910
khluu merged 6 commits into vllm-project:main
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Code Review
This pull request correctly addresses a bug introduced in a previous change by partially reverting it. The original change, which enabled piecewise CUDA graphs for the DeepEP high-throughput backend, caused issues with H200 tests. The fix is to disable CUDA graphs entirely for this specific configuration (deepep_high_throughput with data parallelism > 1), which is a safe and effective solution. The corresponding tests for the reverted feature have also been removed. The changes are clear and well-justified. I have one minor suggestion to fix a typo in a log message for better clarity.
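For readers skimming the thread, the behavior described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: `CUDAGraphMode`, `ParallelSettings`, and `resolve_cudagraph_mode` are hypothetical stand-ins defined here for the example, and the only assumption taken from the description is that CUDA graphs are turned off when the `deepep_high_throughput` backend is combined with data parallelism > 1.

```python
import logging
from dataclasses import dataclass
from enum import Enum

logger = logging.getLogger(__name__)


class CUDAGraphMode(Enum):
    # Hypothetical stand-in enum for this sketch only.
    NONE = 0
    PIECEWISE = 1
    FULL = 2


@dataclass
class ParallelSettings:
    # Hypothetical container for the two settings the review mentions.
    all2all_backend: str
    data_parallel_size: int


def resolve_cudagraph_mode(
    requested: CUDAGraphMode, parallel: ParallelSettings
) -> CUDAGraphMode:
    """Return the CUDA graph mode to use for the given parallel settings."""
    needs_fallback = (
        parallel.all2all_backend == "deepep_high_throughput"
        and parallel.data_parallel_size > 1
    )
    if needs_fallback and requested is not CUDAGraphMode.NONE:
        # Disable CUDA graphs entirely rather than downgrading to PIECEWISE,
        # since PIECEWISE is the mode being reverted here.
        logger.warning(
            "CUDA graphs are not supported with %s and data parallelism > 1; "
            "disabling CUDA graphs.",
            parallel.all2all_backend,
        )
        return CUDAGraphMode.NONE
    return requested
```

With this sketch, a request for `PIECEWISE` under `deepep_high_throughput` and a data-parallel size of 2 resolves to `NONE`, while any other backend or a data-parallel size of 1 keeps the requested mode unchanged.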
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Hi @LucasWilkinson, the pre-commit checks have failed. Please run:
uv pip install pre-commit
pre-commit install
pre-commit run --all-files
Then, commit the changes and push to your branch.
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
#30914
Thanks for catching this! We can land this first to unblock CI, and I can fix this issue thoroughly in #30914 later.
…CG support) (vllm-project#30910) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com> Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Partially revert #29558, as it broke the H200 tests:
https://buildkite.com/vllm/ci/builds/43863#019b29e9-5c1b-4eff-83f7-c8304f774aa7
i.e.
There seem to be multiple issues here, so we will try to follow up with a proper fix to restore the PIECEWISE CG support.
breaks torch.compile and is common in many MoE model definitions
doing:
can fix this but requires updating all the MoE definitions (and creates a footgun)