[serve] TypeError: argparse.Namespace() got multiple values for keyword argument 'tokens_only' #58973

### What happened + What you expected to happen

This issue is currently masked by the issue fixed in https://github.com/ray-project/ray/pull/58820, so for now it is only reproducible with nightly builds. Once #58820 is released, I suspect it will be reported more widely.

As with #58820, the vLLM release `0.11.1` introduces a `tokens_only` argument to both `FrontendArgs` and `EngineArgs` in [this pull request](https://github.com/vllm-project/vllm/pull/24261). This code in `VLLMEngine.start()` gathers the arguments from both `FrontendArgs` and `EngineArgs` into a single `argparse.Namespace`...
https://github.com/ray-project/ray/blob/e1c074254df523c3d17b39216fb386652ead1052/python/ray/llm/_internal/serve/engines/vllm/vllm_engine.py#L197-L200

which then throws the `TypeError` exception:
```
(ServeController pid=2517)   File "/opt/micromamba/envs/runtime/lib/python3.12/site-packages/ray/llm/_internal/serve/core/server/llm_server.py", line 147, in __init__
(ServeController pid=2517)     await self.start()
(ServeController pid=2517)   File "/opt/micromamba/envs/runtime/lib/python3.12/site-packages/ray/llm/_internal/serve/core/server/llm_server.py", line 193, in start
(ServeController pid=2517)     await asyncio.wait_for(self._start_engine(), timeout=ENGINE_START_TIMEOUT_S)
(ServeController pid=2517)   File "/opt/micromamba/envs/runtime/lib/python3.12/asyncio/tasks.py", line 520, in wait_for
(ServeController pid=2517)     return await fut
(ServeController pid=2517)            ^^^^^^^^^
(ServeController pid=2517)   File "/opt/micromamba/envs/runtime/lib/python3.12/site-packages/ray/llm/_internal/serve/core/server/llm_server.py", line 239, in _start_engine
(ServeController pid=2517)     await self.engine.start()
(ServeController pid=2517)   File "/opt/micromamba/envs/runtime/lib/python3.12/site-packages/ray/llm/_internal/serve/engines/vllm/vllm_engine.py", line 197, in start
(ServeController pid=2517)     args = argparse.Namespace(
(ServeController pid=2517)            ^^^^^^^^^^^^^^^^^^^
(ServeController pid=2517) TypeError: argparse.Namespace() got multiple values for keyword argument 'tokens_only'
```

Since it seems reasonable for different argument sets to define arguments with the same name, this feels like something to address in `ray[serve]` rather than in `vLLM`.
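
One possible direction, sketched here with hypothetical stand-in classes rather than the real vLLM ones, is to merge the two argument dictionaries before building the namespace, so that a duplicate key is resolved instead of raising. `FakeFrontendArgs`, `FakeEngineArgs`, and `merge_args` are illustrative names only, and letting the engine value win is an assumption, not a statement of how Ray should resolve the conflict:

```python
import argparse
from dataclasses import dataclass


@dataclass
class FakeFrontendArgs:
    """Hypothetical stand-in for vLLM's FrontendArgs (not the real class)."""
    host: str = "0.0.0.0"
    tokens_only: bool = False


@dataclass
class FakeEngineArgs:
    """Hypothetical stand-in for vLLM's EngineArgs (not the real class)."""
    model: str = "some-model"
    tokens_only: bool = False


def merge_args(frontend_args, engine_args) -> argparse.Namespace:
    # Merging into one dict first means a duplicate key is overwritten
    # rather than passed twice as a keyword argument.
    # Assumption: the engine value takes precedence on overlap.
    merged = {**vars(frontend_args), **vars(engine_args)}
    return argparse.Namespace(**merged)


if __name__ == "__main__":
    print(merge_args(FakeFrontendArgs(), FakeEngineArgs()))
```

Note that this silently picks one value when the two sources disagree, so a real fix may want to check that overlapping keys carry the same value.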

The workaround is to pin `vllm < 0.11.1` (as with #58820).

### Versions / Dependencies

ray[llm] nightly build (but will be an issue once #58820 is released)
vllm >= 0.11.1
python == 3.12

### Reproduction script

I'm sorry, I don't know how to put this together as this is far below our call stack.
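
That said, the underlying failure can be illustrated without Ray or vLLM. This is a minimal sketch, assuming two hypothetical stand-in dataclasses (`FakeFrontendArgs` and `FakeEngineArgs` are not the vLLM classes) that both define `tokens_only`, unpacked into `argparse.Namespace` the same way the traceback shows:

```python
import argparse
from dataclasses import dataclass


@dataclass
class FakeFrontendArgs:
    """Hypothetical stand-in for vLLM's FrontendArgs (not the real class)."""
    host: str = "0.0.0.0"
    tokens_only: bool = False


@dataclass
class FakeEngineArgs:
    """Hypothetical stand-in for vLLM's EngineArgs (not the real class)."""
    model: str = "some-model"
    tokens_only: bool = False


def build_namespace(frontend_args, engine_args) -> argparse.Namespace:
    # Same double-unpacking pattern as in the traceback: a key present in
    # both dicts is passed twice as a keyword argument.
    return argparse.Namespace(
        **frontend_args.__dict__,
        **engine_args.__dict__,
    )


def reproduce() -> str:
    """Return the TypeError message, or an empty string if nothing is raised."""
    try:
        build_namespace(FakeFrontendArgs(), FakeEngineArgs())
    except TypeError as exc:
        return str(exc)
    return ""


if __name__ == "__main__":
    print(reproduce())
```

Running this should raise a `TypeError` that names `tokens_only`. It does not exercise the Ray Serve startup path itself.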

### Issue Severity

Medium: It is a significant difficulty but I can work around it.

For reference, the code at the `vllm_engine.py` permalink above:

    args = argparse.Namespace(
        **vllm_frontend_args.__dict__,
        **vllm_engine_args.__dict__,
    )