[llm] Update vllm to 0.11.0 and Nixl to 0.6.0 #57201
kouroshHakha merged 9 commits into ray-project:master
Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
Force-pushed from 2309954 to 2c00977
    # runtime_env=dict(
    #     env_vars=dict(
    #         VLLM_USE_V1="0",
    #     ),
    # ),
if this passes the tests we should remove it.
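For context, the commented-out block under review sets the `VLLM_USE_V1` environment variable to `"0"` for the workers through a Ray runtime environment. A minimal sketch of that shape follows; where the dict is passed is not shown in this thread, so the surrounding call is left out, and `without_env_var` is a hypothetical helper for illustration only.

```python
# Shape of the override from the snippet above, written out as plain dicts.
runtime_env = dict(
    env_vars=dict(
        VLLM_USE_V1="0",
    ),
)


def without_env_var(env, name):
    """Return a copy of a runtime_env dict with one env var removed (hypothetical helper)."""
    env_vars = {k: v for k, v in env.get("env_vars", {}).items() if k != name}
    return {**env, "env_vars": env_vars}


# What "remove it" amounts to: the same runtime_env without the override.
cleaned = without_env_var(runtime_env, "VLLM_USE_V1")
print(cleaned)
```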
- unit tests fixed
- release tests (probes/test_basic.py::test_logprobs)
  logprobs problem: -1 is being interpreted as a vocab size instead of an invalid number for logprobs, so all release tests are failing because that probe test fails.
- release tests 1p1d / 2p6d failing due to nixl error
  TypeError: nixl_agent_config.__init__() got an unexpected keyword argument 'num_threads'
  num_threads was added to nixl_agent_config between release/0.4.1 and release/0.6.0 at 9f77cc4
  Issue: release env is still using nixl=0.4.1, which is causing the break
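The `num_threads` failure above is a caller/library version mismatch: the caller passes a keyword that the installed config class does not accept. A minimal sketch of detecting that before the call, assuming nothing about nixl beyond what the traceback shows. `LegacyAgentConfig`, `some_option`, and `supported_kwargs` are hypothetical names used as stand-ins, not part of nixl or vLLM.

```python
import inspect


def supported_kwargs(factory, kwargs):
    """Return only the keyword arguments that `factory` accepts by name."""
    params = inspect.signature(factory).parameters
    # A **kwargs parameter means every keyword is accepted.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    return {k: v for k, v in kwargs.items() if k in params}


class LegacyAgentConfig:
    """Hypothetical stand-in for an older config class without num_threads."""

    def __init__(self, some_option=True):
        self.some_option = some_option


wanted = {"some_option": True, "num_threads": 4}
accepted = supported_kwargs(LegacyAgentConfig, wanted)
dropped = sorted(set(wanted) - set(accepted))

# Constructing with the filtered kwargs no longer raises TypeError.
config = LegacyAgentConfig(**accepted)
print("dropped:", dropped)
```

A check like this only surfaces the mismatch. Silently dropping the argument would hide it, and the title of this PR indicates the intended direction is moving the environment to nixl 0.6.0.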

New issue - nixl linker / backend error
failed to load plugin from /usr/local/nixl/lib/x86_64-linux-gnu/plugins/libplugin_UCX_MO.so: /usr/local/nixl/lib/x86_64-linux-gnu/plugins/libplugin_UCX_MO.so: undefined symbol: _ZN12nixlDescListI12nixlMetaDescEC1ERK10nixl_mem_tRKbRKi
(RayWorkerWrapper pid=128409) INFO 10-07 11:58:24 [factory.py:51] Creating v1 connector with name: NixlConnector and engine_id: 16561289-139e-44af-90e7-1c86a61925b2-10.0.111.115-40773
(RayWorkerWrapper pid=128409) INFO 10-07 11:58:24 [nixl_connector.py:465] Initializing NIXL wrapper
(RayWorkerWrapper pid=128409) INFO 10-07 11:58:24 [nixl_connector.py:466] Initializing NIXL worker 16561289-139e-44af-90e7-1c86a61925b2-10.0.111.115-40773
(RayWorkerWrapper pid=128409) E1007 11:58:24.959858 128409 nixl_plugin_manager.cpp:122] Failed to load plugin from /usr/local/nixl/lib/x86_64-linux-gnu/plugins/libplugin_UCX_MO.so: /usr/local/nixl/lib/x86_64-linux-gnu/plugins/libplugin_UCX_MO.so: undefined symbol: _ZN12nixlDescListI12nixlMetaDescEC1ERK10nixl_mem_tRKbRKi
(RayWorkerWrapper pid=128409) E1007 11:58:24.959885 128409 nixl_plugin_manager.cpp:288] Failed to load plugin 'UCX_MO' from any directory
Resolved by reinstalling nixl (local issue)
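The missing symbol in the log demangles to the constructor `nixlDescList<nixlMetaDesc>::nixlDescList(nixl_mem_t const&, bool const&, int const&)`, which is consistent with a plugin and a core library coming from different nixl builds. A minimal sketch of reproducing such a load failure outside the worker, using only the standard library. `try_load_shared_object` is a hypothetical helper, and the path in the example is a placeholder, not the plugin path from the log.

```python
import ctypes


def try_load_shared_object(path):
    """Attempt to load the shared object at `path`; return (ok, error_message)."""
    try:
        ctypes.CDLL(path)
    except OSError as exc:
        # Loader errors (missing file, unresolved symbols) arrive as OSError.
        return False, str(exc)
    return True, ""


# Placeholder path: substitute the plugin path reported in the error.
ok, message = try_load_shared_object("/nonexistent/libplugin_example.so")
print(ok, message)
```

Running the helper against the real plugin path in the failing environment would typically print the same loader message as the log, without starting a Ray worker.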
- vllm-project/vllm#23868
- PR in vLLM changed the interpretation of num_logprobs = -1
- It is now overridden to model_config.get_vocab_size(), which triggers openai.APIError instead of openai.BadRequestError
- Test expects the latter, which causes the failure
- Instead of broadening / changing the expected failure type, we use -2

Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
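The reasoning in the commit message above can be illustrated with a toy model. This is a sketch under stated assumptions, not vLLM's code: `resolve_num_logprobs` and `InvalidLogprobs` are hypothetical names, and the rule that other negative values are rejected is inferred from the test still passing with -2.

```python
class InvalidLogprobs(ValueError):
    """Hypothetical error type standing in for a bad-request style rejection."""


def resolve_num_logprobs(requested: int, vocab_size: int) -> int:
    """Toy model of the behaviour described above (not vLLM's implementation)."""
    if requested == -1:
        # Per the commit message: -1 is overridden to the vocabulary size.
        return vocab_size
    if requested < 0:
        # Assumption: any other negative value is still rejected as invalid.
        raise InvalidLogprobs(f"invalid logprobs value: {requested}")
    return requested


VOCAB_SIZE = 32_000  # illustrative number only

# -1 is no longer an "invalid" probe value: it resolves to a real count.
print(resolve_num_logprobs(-1, VOCAB_SIZE))

# -2 still exercises the invalid-input path the test is meant to cover.
try:
    resolve_num_logprobs(-2, VOCAB_SIZE)
except InvalidLogprobs as exc:
    print("rejected:", exc)
```

Switching the probe to -2 keeps the test aimed at input validation rather than at whatever happens downstream once -1 has been replaced by a large valid count.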
Force-pushed from 4717751 to e62627f
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com> Signed-off-by: nikhil <nikhil@anyscale.com>
Force-pushed from 35427b5 to 8e6b2ab
aslonnie left a comment
can we wait till 2.50 branch cut to merge this?
@aslonnie ideally we can land this for summit?
I think we want to get this in ASAP, it has a lot of performance improvements
If we ship the switch to OpenTelemetry without vLLM 0.11.0 we will have no vLLM metrics in Ray 2.50. cc: @can-anyscale
nrghosh left a comment
Unit tests and release tests both passing
Doc build failure seems ephemeral / unrelated, as full doc build succeeds locally with this branch.
Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com> Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com> Signed-off-by: nikhil <nikhil@anyscale.com> Co-authored-by: Nikhil Ghosh <nikhil@anyscale.com> Co-authored-by: Nikhil G <nrghosh@users.noreply.github.com>
Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com> Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com> Signed-off-by: nikhil <nikhil@anyscale.com> Co-authored-by: Nikhil Ghosh <nikhil@anyscale.com> Co-authored-by: Nikhil G <nrghosh@users.noreply.github.com> Signed-off-by: Josh Kodi <joshkodi@gmail.com>
Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com> Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com> Signed-off-by: nikhil <nikhil@anyscale.com> Co-authored-by: Nikhil Ghosh <nikhil@anyscale.com> Co-authored-by: Nikhil G <nrghosh@users.noreply.github.com> Signed-off-by: xgui <xgui@anyscale.com>
Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com> Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com> Signed-off-by: nikhil <nikhil@anyscale.com> Co-authored-by: Nikhil Ghosh <nikhil@anyscale.com> Co-authored-by: Nikhil G <nrghosh@users.noreply.github.com> Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com> Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com> Signed-off-by: nikhil <nikhil@anyscale.com> Co-authored-by: Nikhil Ghosh <nikhil@anyscale.com> Co-authored-by: Nikhil G <nrghosh@users.noreply.github.com> Signed-off-by: Aydin Abiar <aydin@anyscale.com>
Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com> Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com> Signed-off-by: nikhil <nikhil@anyscale.com> Co-authored-by: Nikhil Ghosh <nikhil@anyscale.com> Co-authored-by: Nikhil G <nrghosh@users.noreply.github.com> Signed-off-by: Future-Outlier <eric901201@gmail.com>
Updating vLLM and Nixl -- notable changes: