[train] expose training input/output in callbacks #53869

Merged
matthewdeng merged 10 commits into ray-project:master from matthewdeng:callback-input-output on Jun 18, 2025

@matthewdeng commented Jun 17, 2025

Changes

  • Made TrainRunContext frozen and added all training inputs (configs, datasets, etc.)
  • Changed TrainContext to contain TrainRunContext instead of inheriting from it
  • Added after_controller_finish callback to expose final training results
  • Modified after_controller_start to receive TrainRunContext parameter
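The shape of these changes can be sketched in a few lines. This is only an illustrative stand-in, not the Ray Train implementation: `TrainRunContext`, `TrainContext`, `after_controller_start`, `after_controller_finish`, and `train_loop_config` are names taken from this PR, while `run_name`, `datasets`, `world_rank`, `ControllerCallback`, `RecordingCallback`, `run_controller`, and the form of the `result` argument are assumptions made up for the example.

```python
from dataclasses import FrozenInstanceError, dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class TrainRunContext:
    """Immutable bundle of training inputs (stand-in, not Ray's class)."""

    run_name: str  # hypothetical field
    train_loop_config: Optional[Dict[str, Any]] = None  # field shown in the diff
    datasets: Dict[str, Any] = field(default_factory=dict)  # hypothetical field


@dataclass
class TrainContext:
    """Holds a TrainRunContext (composition) instead of inheriting from it."""

    train_run_context: TrainRunContext
    world_rank: int = 0  # hypothetical field


class ControllerCallback:
    """Hypothetical base class exposing the two hooks named in the PR."""

    def after_controller_start(self, train_run_context: TrainRunContext) -> None:
        pass

    def after_controller_finish(self, result: Any) -> None:
        # The type of `result` is an assumption for this sketch.
        pass


class RecordingCallback(ControllerCallback):
    """Records what each hook receives, to show inputs and outputs are visible."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, Any]] = []

    def after_controller_start(self, train_run_context: TrainRunContext) -> None:
        self.events.append(("start", train_run_context.run_name))

    def after_controller_finish(self, result: Any) -> None:
        self.events.append(("finish", result["status"]))


def run_controller(
    train_run_context: TrainRunContext, callbacks: List[ControllerCallback]
) -> Dict[str, Any]:
    """Hypothetical driver: fire the start hook, 'train', fire the finish hook."""
    for callback in callbacks:
        callback.after_controller_start(train_run_context)
    result = {"status": "finished", "config": train_run_context.train_loop_config}
    for callback in callbacks:
        callback.after_controller_finish(result)
    return result
```

Freezing the run context means a callback can read the inputs but an accidental assignment such as `ctx.run_name = "other"` raises `dataclasses.FrozenInstanceError`. Holding the run context as a field keeps the per-worker `TrainContext` from being treated as a run-level object.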

TODO

  • Remove redundant TrainRunContext from Callback initializations
  • Refactor logging to handle both TrainContext and TrainRunContext
  • Fix/add tests

Signed-off-by: Matthew Deng <matt@anyscale.com>
Signed-off-by: Matthew Deng <matt@anyscale.com>

@justinvyu left a comment

looks good to me overall!

Comment on lines +37 to +38
# The configuration passed to the training function.
train_loop_config: Optional[Dict[str, Any]]

should we also add some metadata like ray version/commit, backend version (torch version)?
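To make the suggestion concrete, one possible way to gather such version metadata is sketched below. It is not part of this PR: the helper name `collect_package_versions` and the returned dictionary layout are hypothetical, and the sketch covers only installed package versions (not the Ray commit), using the standard library's `importlib.metadata`.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def collect_package_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions
```

A call such as `collect_package_versions(["ray", "torch"])` would then yield a small dictionary that could be stored alongside the other training inputs.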

Signed-off-by: Matthew Deng <matt@anyscale.com>
Signed-off-by: Matthew Deng <matt@anyscale.com>
Signed-off-by: Matthew Deng <matt@anyscale.com>
Signed-off-by: Matthew Deng <matt@anyscale.com>
Signed-off-by: Matthew Deng <matt@anyscale.com>
Signed-off-by: Matthew Deng <matt@anyscale.com>
@matthewdeng added the "go" label (add ONLY when ready to merge, run all tests) Jun 18, 2025
Signed-off-by: Matthew Deng <matt@anyscale.com>
Signed-off-by: Matthew Deng <matt@anyscale.com>
@matthewdeng marked this pull request as ready for review June 18, 2025 22:17
@matthewdeng requested a review from a team as a code owner June 18, 2025 22:17
@matthewdeng merged commit 2ccc18d into ray-project:master Jun 18, 2025
5 checks passed
minerharry pushed a commit to minerharry/ray that referenced this pull request Jun 27, 2025
- Made `TrainRunContext` frozen and added all training inputs (configs, datasets, etc.)
- Changed `TrainContext` to contain `TrainRunContext` instead of inheriting from it
- Added `after_controller_finish` callback to expose final training results
- Modified `after_controller_start` to receive `TrainRunContext` parameter

---------

Signed-off-by: Matthew Deng <matt@anyscale.com>
elliot-barn pushed a commit that referenced this pull request Jul 2, 2025
Signed-off-by: Matthew Deng <matt@anyscale.com>
Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>