[docs][data.llm] simplify / add ray data.llm quickstart example#58330
kouroshHakha merged 2 commits into ray-project:master
Conversation
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
Code Review
This pull request introduces a valuable minimal quickstart example for ray.data.llm, which significantly simplifies the learning curve for new users. The example is clear and well-documented. My feedback includes a couple of suggestions to enhance the example's robustness: one to make the Ray initialization more resilient for interactive sessions, and another to adjust the batch size for broader compatibility with different GPU configurations. These changes will help ensure a smoother out-of-the-box experience for users.
from ray.data.llm import vLLMEngineProcessorConfig, build_llm_processor

# Initialize Ray
ray.init()
It's a good practice in documentation examples to use ray.init(ignore_reinit_error=True). This prevents errors if a user runs the script multiple times in an interactive environment like a Jupyter notebook, where Ray might have already been initialized.
Suggested change:
- ray.init()
+ ray.init(ignore_reinit_error=True)
config = vLLMEngineProcessorConfig(
    model_source="unsloth/Llama-3.1-8B-Instruct",
    concurrency=1,  # 1 vLLM engine replica
    batch_size=32,  # 32 samples per batch
A batch_size of 32 might be too large for some GPUs when running an 8B model, potentially leading to out-of-memory errors. For a quickstart example, it's safer to start with a smaller batch size, for example 16, and let users increase it if their hardware allows.
Suggested change:
- batch_size=32,  # 32 samples per batch
+ batch_size=16,  # 16 samples per batch
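For context on what the configuration above drives, here is a hedged sketch of the preprocess/postprocess hooks such a processor would typically use. The function bodies below are plain, illustrative Python; the ``generated_text`` output field name and the exact hook signatures are assumptions about the API shown in the diff, and running the full pipeline would require Ray, vLLM, and a GPU, which this sketch does not assume.

```python
def preprocess(row):
    """Turn a raw input row into the chat-style payload the engine expects.

    The sampling_params values here are illustrative defaults, not taken
    from the PR.
    """
    return dict(
        messages=[{"role": "user", "content": row["prompt"]}],
        sampling_params=dict(temperature=0.3, max_tokens=250),
    )


def postprocess(row):
    """Keep only the fields the quickstart documents: prompt and response.

    Assumes the engine writes its output to a ``generated_text`` field.
    """
    return {"prompt": row["prompt"], "response": row["generated_text"]}
```

With the real library, these would then be passed alongside the config, e.g. ``build_llm_processor(config, preprocess=preprocess, postprocess=postprocess)``.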
The processor expects input rows with a ``prompt`` field and outputs rows with both ``prompt`` and ``response`` fields. You can consume results using ``iter_rows()``, ``take()``, ``show()``, or save to files with ``write_parquet()``.
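A small sketch of these consumption patterns, using a plain Python list in place of a real ``Dataset`` (no Ray cluster assumed; the row shape and prompt text are illustrative, following the sentence above):

```python
# Rows in the shape the quickstart describes: each output row carries
# both the original prompt and the generated response.
rows = [
    {"prompt": "What is Ray?", "response": "Ray is a distributed compute framework."},
    {"prompt": "What is vLLM?", "response": "vLLM is a high-throughput LLM engine."},
]

# ds.iter_rows() yields one dict per row; a plain list iterates the same way.
for row in rows:
    print(f"{row['prompt']} -> {row['response']}")

# ds.take(n) returns the first n rows as a list of dicts.
first = rows[:1]
```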
For more configuration options and advanced features, see the sections below.
| .. _batch_inference_llm: |
Can you also deduplicate the content from the section below?
Feels like there's some redundancy, like the installation steps and the basic explanation of the configuration.
done + sglang engine pointer added
Signed-off-by: Nikhil Ghosh <nikhil@anyscale.com>
The LLM Data documentation jumps quickly into detailed, complex examples with many configuration options and steps.
This PR adds a simpler minimal quick-start to the top of the documentation.
Note: this can be updated after the larger ray.data.llm API refactor is done (context: #58298).