[train] Add worker group setup finished log #52120

Merged
justinvyu merged 3 commits into ray-project:master from justinvyu:worker_group_init_log
Apr 9, 2025

@justinvyu
Contributor

@justinvyu justinvyu commented Apr 9, 2025

Summary

Add back the info message that is logged when the worker group has been initialized successfully. Right now there's only an "Attempting to launch X workers" message, with no confirmation once the workers are actually created.

Example output

(TrainController pid=55606) Attempting to start training worker group of size 4 with the following resources: [{'CPU': 1}] * 4
(TrainController pid=55606) Started training worker group of size 4: 
(TrainController pid=55606) - (ip=127.0.0.1, pid=55616) world_rank=0, local_rank=0, node_rank=0
(TrainController pid=55606) - (ip=127.0.0.1, pid=55617) world_rank=1, local_rank=1, node_rank=0
(TrainController pid=55606) - (ip=127.0.0.1, pid=55618) world_rank=2, local_rank=2, node_rank=0
(TrainController pid=55606) - (ip=127.0.0.1, pid=55619) world_rank=3, local_rank=3, node_rank=0
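
For readers who want to see how a message of this shape could be assembled, here is a minimal sketch. It is not the code from this PR: `WorkerInfo` and `format_worker_group_started` are hypothetical names introduced for illustration, and the `(TrainController pid=...)` prefix shown in the output above is not reproduced.

```python
import logging
from dataclasses import dataclass
from typing import List

logger = logging.getLogger(__name__)


@dataclass
class WorkerInfo:
    """Hypothetical per-worker record; not Ray Train's actual data structure."""

    ip: str
    pid: int
    world_rank: int
    local_rank: int
    node_rank: int


def format_worker_group_started(workers: List[WorkerInfo]) -> str:
    """Build a multi-line summary with one bullet per worker."""
    header = f"Started training worker group of size {len(workers)}:"
    lines = [
        f"- (ip={w.ip}, pid={w.pid}) world_rank={w.world_rank}, "
        f"local_rank={w.local_rank}, node_rank={w.node_rank}"
        for w in workers
    ]
    return "\n".join([header, *lines])


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Illustrative values mirroring the example output above.
    workers = [WorkerInfo("127.0.0.1", 55616 + i, i, i, 0) for i in range(4)]
    logger.info(format_worker_group_started(workers))
```

Emitting the whole summary as a single log record, rather than one record per worker, keeps the lines together when several processes write to the same stream.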

Signed-off-by: Justin Yu <justinvyu@anyscale.com>
@matthewdeng
Contributor

Can you add example output in the PR description?

@justinvyu
Contributor Author

@matthewdeng It's the same as v1. Any improvements you had in mind?

Signed-off-by: Justin Yu <justinvyu@anyscale.com>
@justinvyu justinvyu marked this pull request as ready for review April 9, 2025 20:05
@justinvyu justinvyu enabled auto-merge (squash) April 9, 2025 20:05
@github-actions github-actions bot added the go (add ONLY when ready to merge, run all tests) label Apr 9, 2025
@justinvyu justinvyu merged commit ef04149 into ray-project:master Apr 9, 2025
7 checks passed
@justinvyu justinvyu deleted the worker_group_init_log branch April 9, 2025 21:09
han-steve pushed a commit to han-steve/ray that referenced this pull request Apr 11, 2025
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: Steve Han <stevehan2001@gmail.com>
Labels

community-backlog, go (add ONLY when ready to merge, run all tests)