[serve] TypeError: argparse.Namespace() got multiple values for keyword argument 'tokens_only' #58973
Description
What happened + What you expected to happen
This issue is masked by the one fixed in #58820, so it is currently reproducible only with nightly builds. Once the fix for #58820 is released, I suspect it will be reported more widely.
As with #58820, the vLLM release 0.11.1 adds a tokens_only argument to both FrontendArgs and EngineArgs in this pull request. This code in VLLMEngine.start() merges the arguments from both FrontendArgs and EngineArgs into a single namespace:
ray/python/ray/llm/_internal/serve/engines/vllm/vllm_engine.py
Lines 197 to 200 in e1c0742
    args = argparse.Namespace(
        **vllm_frontend_args.__dict__,
        **vllm_engine_args.__dict__,
    )
which then throws the TypeError exception:
(ServeController pid=2517) File "/opt/micromamba/envs/runtime/lib/python3.12/site-packages/ray/llm/_internal/serve/core/server/llm_server.py", line 147, in __init__
(ServeController pid=2517) await self.start()
(ServeController pid=2517) File "/opt/micromamba/envs/runtime/lib/python3.12/site-packages/ray/llm/_internal/serve/core/server/llm_server.py", line 193, in start
(ServeController pid=2517) await asyncio.wait_for(self._start_engine(), timeout=ENGINE_START_TIMEOUT_S)
(ServeController pid=2517) File "/opt/micromamba/envs/runtime/lib/python3.12/asyncio/tasks.py", line 520, in wait_for
(ServeController pid=2517) return await fut
(ServeController pid=2517) ^^^^^^^^^
(ServeController pid=2517) File "/opt/micromamba/envs/runtime/lib/python3.12/site-packages/ray/llm/_internal/serve/core/server/llm_server.py", line 239, in _start_engine
(ServeController pid=2517) await self.engine.start()
(ServeController pid=2517) File "/opt/micromamba/envs/runtime/lib/python3.12/site-packages/ray/llm/_internal/serve/engines/vllm/vllm_engine.py", line 197, in start
(ServeController pid=2517) args = argparse.Namespace(
(ServeController pid=2517) ^^^^^^^^^^^^^^^^^^^
(ServeController pid=2517) TypeError: argparse.Namespace() got multiple values for keyword argument 'tokens_only'
Since it seems reasonable to allow different argument sets to define the same arguments by name, it feels like this should be addressed in ray[serve] rather than vLLM.
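One possible direction, sketched here only as an illustration and not as the actual Ray fix: merge the two argument dictionaries before building the namespace, so a shared key no longer raises. The helper names `merge_args` and `conflicting_keys` are hypothetical, and letting the engine value win on overlap is an assumption on my part; the maintainers may prefer the opposite precedence or an explicit error when the values differ.

```python
import argparse


def conflicting_keys(frontend: dict, engine: dict) -> list:
    """Return keys present in both dicts whose values disagree (hypothetical helper)."""
    shared = frontend.keys() & engine.keys()
    return sorted(key for key in shared if frontend[key] != engine[key])


def merge_args(frontend: dict, engine: dict) -> argparse.Namespace:
    """Build one namespace from two argument dicts (hypothetical helper).

    Assumption: on duplicate keys the engine value takes precedence,
    because later entries in a dict display override earlier ones.
    """
    merged = {**frontend, **engine}
    return argparse.Namespace(**merged)


# Stand-in values only; these are not the real vLLM argument sets.
frontend = {"tokens_only": False, "host": "localhost"}
engine = {"tokens_only": True, "model": "some-model"}

args = merge_args(frontend, engine)
disagreements = conflicting_keys(frontend, engine)
```

Checking `conflicting_keys` first would let the caller log or reject cases where the two argument sets disagree, rather than silently picking one value.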
The workaround is to pin vllm < 0.11.1 (as with #58820).
Versions / Dependencies
ray[llm] nightly build (but will be an issue once #58820 is released)
vllm >= 0.11.1
python == 3.12
Reproduction script
I'm sorry, I don't know how to put this together as this is far below our call stack.
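That said, the failure mode itself can be illustrated in isolation, without Ray or vLLM. The sketch below uses two hypothetical dataclasses, `FakeFrontendArgs` and `FakeEngineArgs`, as stand-ins for the real argument classes; only the shared `tokens_only` field name is taken from this report, and the other fields are made up. It is not a reproduction of the full Ray Serve startup path.

```python
import argparse
from dataclasses import dataclass


@dataclass
class FakeFrontendArgs:
    # Hypothetical stand-in for vLLM's FrontendArgs.
    tokens_only: bool = False
    host: str = "localhost"


@dataclass
class FakeEngineArgs:
    # Hypothetical stand-in for vLLM's EngineArgs.
    tokens_only: bool = False
    model: str = "some-model"


def build_namespace(frontend_args, engine_args) -> argparse.Namespace:
    # Same unpacking pattern as the snippet quoted from VLLMEngine.start().
    return argparse.Namespace(
        **frontend_args.__dict__,
        **engine_args.__dict__,
    )


try:
    build_namespace(FakeFrontendArgs(), FakeEngineArgs())
    error_message = None
except TypeError as exc:
    # Unpacking two mappings that share a key into one call raises TypeError.
    error_message = str(exc)

print(error_message)
```

The error is raised even though both `tokens_only` values are equal, since Python rejects the duplicate keyword before `argparse.Namespace` ever sees the values.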
Issue Severity
Medium: It is a significant difficulty but I can work around it.