[serve][llm] Disable model downloading for RunAI streamer, introduce optimized download function#57854

Merged
kouroshHakha merged 9 commits into ray-project:master from hao-aaron:startup-optimizations
Oct 28, 2025

Conversation

@hao-aaron
Contributor

Description

When running Ray Serve LLM with the RunAI streamer, the current code path downloads the model first, which is unnecessary because the streamer loads the weights itself. In addition, the current model download function is not parallelized.

Changes:

  • Set worker_node_download_model based on load_format in LLMConfig.engine_kwargs.
  • Add a new download_model_parallel function, which is used by the CloudDownloader callback.
  • Add unit tests for LLMConfig to ensure the download-model setting is chosen correctly.
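
The first change can be pictured with a minimal sketch. It is not the code from this PR: the helper name `should_download_model` and the `STREAMING_LOAD_FORMATS` set are hypothetical, and only the `load_format` key and the `runai_streamer` value come from the discussion here. The idea is that when the engine streams weights itself, the worker-side download is skipped.

```python
from typing import Any, Dict

# Assumption: the set of load formats that stream weights directly.
# Only "runai_streamer" is named in this PR; the set is illustrative.
STREAMING_LOAD_FORMATS = {"runai_streamer"}


def should_download_model(engine_kwargs: Dict[str, Any]) -> bool:
    """Hypothetical helper: return False when the engine streams weights itself."""
    load_format = engine_kwargs.get("load_format")
    # A missing or non-streaming load_format keeps the regular download path.
    return load_format not in STREAMING_LOAD_FORMATS


# Usage, in the spirit of the unit tests listed above.
print(should_download_model({"load_format": "runai_streamer"}))
print(should_download_model({}))
```

Keeping the decision in one small pure function makes it easy to cover with unit tests that need no cluster or model files.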

Signed-off-by: ahao-anyscale <ahao@anyscale.com>
@hao-aaron hao-aaron requested a review from a team as a code owner October 17, 2025 19:36
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces two main improvements: disabling model downloads for streaming loaders like runai_streamer and optimizing cloud downloads by parallelizing them. The logic to conditionally disable downloads based on load_format is well-implemented and tested. The new parallel download function download_files_parallel correctly uses pyarrow for efficient transfers. However, I've found a critical typo that would cause runtime failures and a high-severity issue in error handling that could lead to silent partial downloads. I've also suggested a minor clarification to a docstring. Once these points are addressed, the PR will be in great shape.
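
The partial-download concern raised in this review can be illustrated with a small sketch. It is not the PR's implementation and does not use pyarrow: local file copies via `shutil.copyfile` stand in for cloud reads, and the function name `copy_files_parallel` and its parameters are hypothetical. Every failure is collected and surfaced as an exception, so a partially completed transfer cannot pass silently.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Callable, Iterable, List, Tuple


def copy_files_parallel(
    pairs: Iterable[Tuple[str, str]],
    copy_fn: Callable[[str, str], object] = shutil.copyfile,
    max_workers: int = 8,
) -> List[str]:
    """Copy (source, destination) pairs concurrently; raise if any copy fails."""
    pairs = list(pairs)
    done: List[str] = []
    errors: List[Tuple[str, Exception]] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {}
        for src, dst in pairs:
            # Create the destination directory before scheduling the copy.
            Path(dst).parent.mkdir(parents=True, exist_ok=True)
            futures[pool.submit(copy_fn, src, dst)] = src
        for future in as_completed(futures):
            src = futures[future]
            try:
                future.result()  # Re-raises any exception from the worker thread.
                done.append(src)
            except Exception as exc:
                errors.append((src, exc))
    if errors:
        # Fail loudly instead of returning a partial result.
        failed = ", ".join(src for src, _ in errors)
        raise RuntimeError(f"{len(errors)} of {len(pairs)} files failed: {failed}")
    return done
```

Collecting all errors before raising, rather than stopping at the first one, lets the caller see every file that failed in a single message.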

@kouroshHakha kouroshHakha changed the title from "[serve.llm] Disable model downloading for RunAI streamer, introduce optimized download function" to "[serve][llm] Disable model downloading for RunAI streamer, introduce optimized download function" Oct 17, 2025
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
@hao-aaron hao-aaron force-pushed the startup-optimizations branch from 7f73524 to 1405e2a Compare October 17, 2025 20:28
@ray-gardener ray-gardener bot added the serve (Ray Serve Related Issue) label Oct 18, 2025
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
Contributor

@ruisearch42 ruisearch42 left a comment

Overall LGTM

…ions

Signed-off-by: ahao-anyscale <ahao@anyscale.com>
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
Signed-off-by: ahao-anyscale <ahao@anyscale.com>
@hao-aaron
Contributor Author

The failing lmcache check is caused by a bug unrelated to this PR; see LMCache/LMCache#1768.

Contributor

@ruisearch42 ruisearch42 left a comment

Some readability and cleanliness suggestions

Signed-off-by: ahao-anyscale <ahao@anyscale.com>
Contributor

@ruisearch42 ruisearch42 left a comment

Thanks for addressing the comments

Contributor

@angelinalg angelinalg left a comment

stamp

@kouroshHakha kouroshHakha added the go (add ONLY when ready to merge, run all tests) label Oct 28, 2025
@kouroshHakha
Contributor

Hehe Why did no one press go label :D

@kouroshHakha kouroshHakha enabled auto-merge (squash) October 28, 2025 21:51
@kouroshHakha kouroshHakha merged commit 29ba2ab into ray-project:master Oct 28, 2025
7 of 8 checks passed
YoussefEssDS pushed a commit to YoussefEssDS/ray that referenced this pull request Nov 8, 2025
…optimized download function (ray-project#57854)

Signed-off-by: ahao-anyscale <ahao@anyscale.com>
landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
…optimized download function (ray-project#57854)

Signed-off-by: ahao-anyscale <ahao@anyscale.com>
Aydin-ab pushed a commit to Aydin-ab/ray-aydin that referenced this pull request Nov 19, 2025
…optimized download function (ray-project#57854)

Signed-off-by: ahao-anyscale <ahao@anyscale.com>
Signed-off-by: Aydin Abiar <aydin@anyscale.com>
Future-Outlier pushed a commit to Future-Outlier/ray that referenced this pull request Dec 7, 2025
…optimized download function (ray-project#57854)

Signed-off-by: ahao-anyscale <ahao@anyscale.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
peterxcli pushed a commit to peterxcli/ray that referenced this pull request Feb 25, 2026
…optimized download function (ray-project#57854)

Signed-off-by: ahao-anyscale <ahao@anyscale.com>
Signed-off-by: peterxcli <peterxcli@gmail.com>
Labels

go (add ONLY when ready to merge, run all tests), serve (Ray Serve Related Issue)

4 participants