[data.llm][Bugfix] Fix doc to only support int concurrency#54196
[data.llm][Bugfix] Fix doc to only support int concurrency#54196kouroshHakha merged 6 commits intoray-project:masterfrom
concurrency#54196Conversation
Signed-off-by: Linkun <github@lkchen.net>
There was a problem hiding this comment.
Pull Request Overview
This PR fixes a bug related to the handling of the tuple concurrency configuration in the vLLM engine processor. The changes ensure that when a tuple is provided, the corresponding min and max sizes are set correctly.
- Updated the test case to pass a tuple for concurrency.
- Modified the processor configuration to correctly extract min and max values from a tuple.
- Adjusted ActorPoolStrategy parameters to reflect fixed or auto‑scaled concurrency.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| release/llm_tests/batch/test_batch_vllm.py | Updated test input to validate tuple concurrency configuration |
| python/ray/llm/_internal/batch/processor/vllm_engine_proc.py | Fixed the processor configuration logic to respect tuple concurrency |
Comments suppressed due to low confidence (1)
release/llm_tests/batch/test_batch_vllm.py:61
- [nitpick] Consider updating the inline comment to accurately reflect that a tuple is provided for concurrency, such as 'concurrency=(1, 2)'.
(1, 2, (1, 2)), # PP=2, concurrency=2
| min_size=config.concurrency | ||
| if isinstance(config.concurrency, int) | ||
| else processor_concurrency[0], | ||
| max_size=processor_concurrency[1], |
There was a problem hiding this comment.
[nitpick] For improved readability, consider refactoring the inline conditional used to derive min_size into a separate variable.
Signed-off-by: Linkun <github@lkchen.net>
| compute=ray.data.ActorPoolStrategy( | ||
| min_size=config.concurrency, | ||
| max_size=config.concurrency, | ||
| # vLLM start up time is significant, so if user give fixed |
There was a problem hiding this comment.
Following up on https://anyscale1.atlassian.net/browse/CI-1155?focusedCommentId=35311&sourceType=mention?atlOrigin%3D93e50a61c54043ee873dca7979b3f9cd
We should 1) update the doc for concurrency that (m,n) is not supported. 2) raise a validation error at config level if anything but an int is passed in.
concurrency configconcurrency
Signed-off-by: Linkun <github@lkchen.net> Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
…ject#54196) Signed-off-by: Linkun <github@lkchen.net> Signed-off-by: Douglas Strodtman <douglas@anyscale.com>
Why are these changes needed?
This PR removes tuple
concurrencysupport from API docCloses https://anyscale1.atlassian.net/browse/CI-1155
Related issue number
Checks
git commit -s) in this PR.scripts/format.shto lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/under thecorresponding
.rstfile.