[serve][llm] update vllm_engine.py to check for VLLM_USE_V1 attribute#58820
kouroshHakha merged 1 commit into ray-project:master
Conversation
Signed-off-by: Eric Laufer <thiboeri@gmail.com>
@nrghosh, @eicherseiji, @kouroshHakha - thank you for maintaining this repository! I suspect this didn't make the 2.52.0 release because it was merged a tad too close to the release. Might there be a chance this could be included in the next patch release? If so, is there an ETA for that?
Hi @kevin-bates! 2.52.1 was released a few days ago; I believe this commit is included: ray-2.52.1...master. Cheers
@nrghosh I don't think this made it into 2.52.1; I can see it neither in the 2.52.1 tagged tree nor in the list you posted. Do we expect a new bug-fix release such as 2.52.2 soon? I am not aware of any workaround here.
Hi @kpal-lilt - as a workaround, you could use a nightly image.
@nrghosh Thanks for the advice; that's what I was trying, but nightly builds are a gamble and definitely not for production. Yesterday's nightly build didn't work for me because it broke other APIs :/ Do we have any insight into whether a new fix release will happen for 2.52?
Related PRs that we should review when upgrading fully:
- #58820
- Note from Rui: when we bump to a new vLLM version, we should go with 0.11.2 instead of 0.11.1, which fixes a Ray multi-node PP regression that was introduced when adding torch-based PP: https://github.com/vllm-project/vllm/releases/tag/v0.11.2

Issues:
- closes #58937
- closes #58973
- closes #58702

Signed-off-by: Kourosh Hakhamaneshi <Kourosh@anyscale.com>
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
Signed-off-by: Nikhil G <nrghosh@users.noreply.github.com>
Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
Co-authored-by: Seiji Eicher <seiji@anyscale.com>
Co-authored-by: Nikhil Ghosh <nikhil@anyscale.com>
Co-authored-by: Nikhil G <nrghosh@users.noreply.github.com>
Co-authored-by: elliot-barn <elliot.barnwell@anyscale.com>
Description
vLLM 0.11.1 removed the VLLM_USE_V1 flag, which causes Ray Serve to crash when using LLMConfig. This PR updates vllm_engine.py to check for the attribute before accessing it.
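A minimal sketch of the kind of guard described above, assuming vLLM exposes its feature flags on a `vllm.envs`-style module. The `envs` stand-ins, the helper name `uses_v1_engine`, and the default-to-True behavior are illustrative assumptions, not the exact code in vllm_engine.py:

```python
from types import SimpleNamespace

def uses_v1_engine(envs) -> bool:
    """Return True if the V1 engine should be used.

    vLLM 0.11.1 removed the VLLM_USE_V1 flag, so a plain attribute
    access would raise AttributeError on newer versions. Using
    getattr() with a default keeps older and newer vLLM working.
    Defaulting to True encodes the assumption that V1 is the only
    engine once the flag is gone.
    """
    return bool(getattr(envs, "VLLM_USE_V1", True))

old_envs = SimpleNamespace(VLLM_USE_V1=False)  # pre-0.11.1 style flags module
new_envs = SimpleNamespace()                   # 0.11.1+: flag removed

print(uses_v1_engine(old_envs))  # False
print(uses_v1_engine(new_envs))  # True
```

With this guard, the attribute lookup that previously crashed Serve on vLLM 0.11.1 degrades gracefully instead of raising AttributeError.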
Related issues
TODO: I will look for a related issue or open one.
Additional information
Are there additional checks that should happen?