[train] Fold `v2.XGBoostTrainer` API into the public trainer class as an alternate constructor #50045

Merged

justinvyu merged 17 commits into ray-project:master from justinvyu:xgboost/merge_v2

Mar 11, 2025

Contributor

justinvyu commented Jan 23, 2025

Summary

Currently, the new `XGBoostTrainer` API is only accessible through a separate import, `ray.train.xgboost.v2.XGBoostTrainer`.

To avoid unnecessary import changes, this PR folds the new API, which accepts the new arguments `train_loop_per_worker`, `train_loop_config`, and `xgboost_config`, into the public `ray.train.xgboost.XGBoostTrainer` class.

This also makes some changes to the Ray Train v2 `XGBoostTrainer` class to improve the migration UX, since it does not support the legacy `XGBoostTrainer` API at all.
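
As a rough illustration of the "alternate constructor" idea described above, the sketch below shows one way a single class can accept both a function-based argument set and a legacy argument set. It is not the Ray implementation: `SketchTrainer`, the dispatch rules, the error on mixed usage, and the warning text are all assumptions made for this example, and only a subset of the argument names from the PR is reused.

```python
import warnings
from typing import Any, Callable, Dict, Optional


class SketchTrainer:
    """Hypothetical stand-in for a trainer exposing two constructor styles."""

    def __init__(
        self,
        train_loop_per_worker: Optional[Callable[..., None]] = None,
        *,
        train_loop_config: Optional[Dict[str, Any]] = None,
        # Legacy-style arguments, kept only for backward compatibility.
        label_column: Optional[str] = None,
        params: Optional[Dict[str, Any]] = None,
        num_boost_round: Optional[int] = None,
    ) -> None:
        legacy_args = {
            "label_column": label_column,
            "params": params,
            "num_boost_round": num_boost_round,
        }
        used_legacy = sorted(k for k, v in legacy_args.items() if v is not None)

        if train_loop_per_worker is not None:
            # New style: a user-defined training function drives training.
            # Rejecting mixed usage is an assumption of this sketch.
            if used_legacy:
                raise ValueError(
                    f"Legacy arguments {used_legacy} cannot be combined with "
                    "`train_loop_per_worker`."
                )
            self.api = "new"
        else:
            # Legacy style: still accepted, but flagged as deprecated.
            warnings.warn(
                "The legacy constructor arguments are deprecated; pass "
                "`train_loop_per_worker` instead.",
                DeprecationWarning,
                stacklevel=2,
            )
            self.api = "legacy"

        self.train_loop_per_worker = train_loop_per_worker
        self.train_loop_config = train_loop_config or {}
        self.legacy_args = legacy_args


def train_fn(config: Dict[str, Any]) -> None:
    """Placeholder training function; a real one would run the training loop."""


# New-style construction: same class, same import, different arguments.
new_style = SketchTrainer(train_fn, train_loop_config={"eta": 0.1})

# Legacy-style construction still works but emits a DeprecationWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = SketchTrainer(label_column="target", params={"eta": 0.1})
```

The point of the pattern is that callers keep one import path while the constructor decides which code path applies from the arguments it receives.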

TODO

Do the same for LightGBM after a review from the team
Fill out the GitHub issue [train] XGBoostTrainer and LightGBMTrainer API revamps #50042

justinvyu added 3 commits

January 23, 2025 11:49


          fold v2 api into v1 xgboost trainer

6f41da4

Signed-off-by: Justin Yu <justinvyu@anyscale.com>


          remove dmatrix_params

76eb7f3

Signed-off-by: Justin Yu <justinvyu@anyscale.com>


          update v2 xgb trainer

f024edc

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

justinvyu requested review from hongpeng-guo, matthewdeng, raulchen and woshiyyya as code owners

January 23, 2025 21:52


          log deprecation warning

8ac15ae

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

matthewdeng reviewed


Contributor

matthewdeng left a comment

This is very elegant!

python/ray/train/v2/xgboost/xgboost_trainer.py (outdated, resolved)

python/ray/train/v2/xgboost/xgboost_trainer.py

Comment on lines +130 to +133

+                      # TODO(justinvyu): [Deprecated] Legacy XGBoostTrainer API
+                      label_column: Optional[str] = None,
+                      params: Optional[Dict[str, Any]] = None,
+                      num_boost_round: Optional[int] = None,

Contributor

matthewdeng Jan 31, 2025

Are these needed for the V2 API? These are not passed in from the V1 API.

Contributor Author

justinvyu Feb 7, 2025

This is `train/v2/xgboost`, not `train/xgboost/v2`, so users who set `RAY_TRAIN_V2_ENABLED` might still be passing in the legacy XGBoost V1 arguments. 😂 😭 💀 🪦
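
To make the exchange above concrete: one plausible reason to keep legacy parameters in the signature of a class that does not implement them is to give legacy callers a targeted migration message. The sketch below assumes that behavior; `SketchV2Trainer`, the use of `ValueError`, and the message text are hypothetical and are not taken from the Ray source.

```python
from typing import Any, Callable, Dict, Optional

_LEGACY_API_MESSAGE = (
    "The legacy arguments {names} are not supported by this trainer. "
    "Pass a `train_loop_per_worker` function instead."
)


class SketchV2Trainer:
    """Hypothetical trainer that only implements the function-based API."""

    def __init__(
        self,
        train_loop_per_worker: Optional[Callable[..., None]] = None,
        *,
        train_loop_config: Optional[Dict[str, Any]] = None,
        # Accepted only so that legacy callers get a targeted error instead
        # of a generic "unexpected keyword argument" TypeError.
        label_column: Optional[str] = None,
        params: Optional[Dict[str, Any]] = None,
        num_boost_round: Optional[int] = None,
    ) -> None:
        legacy = {
            "label_column": label_column,
            "params": params,
            "num_boost_round": num_boost_round,
        }
        used = sorted(name for name, value in legacy.items() if value is not None)
        if used:
            raise ValueError(_LEGACY_API_MESSAGE.format(names=used))
        if train_loop_per_worker is None:
            raise ValueError("`train_loop_per_worker` is required.")

        self.train_loop_per_worker = train_loop_per_worker
        self.train_loop_config = train_loop_config or {}


# A legacy-style call fails fast with a message that names the offending
# arguments and points at the replacement.
message = ""
try:
    SketchV2Trainer(label_column="target", params={"eta": 0.1})
except ValueError as exc:
    message = str(exc)
```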

python/ray/train/xgboost/xgboost_trainer.py (outdated, resolved)

hongpeng-guo approved these changes


Contributor

hongpeng-guo left a comment

Nice!

python/ray/train/xgboost/xgboost_trainer.py (resolved)

justinvyu commented


python/ray/train/xgboost/xgboost_trainer.py

Comment on lines +191 to +205

    
                      train_loop_per_worker: Optional[
                          Union[Callable[[], None], Callable[[Dict], None]]
                      ] = None,
                      train_loop_config: Optional[Dict] = None,
                      xgboost_config: Optional[XGBoostConfig] = None,
                      scaling_config: Optional[ray.train.ScalingConfig] = None,
                      run_config: Optional[ray.train.RunConfig] = None,
                      datasets: Optional[Dict[str, GenDataset]] = None,
                      dataset_config: Optional[ray.train.DataConfig] = None,
                      resume_from_checkpoint: Optional[Checkpoint] = None,
                      metadata: Optional[Dict[str, Any]] = None,
                      # TODO(justinvyu): [Deprecated] Legacy XGBoostTrainer API
                      label_column: Optional[str] = None,
                      params: Optional[Dict[str, Any]] = None,
                      num_boost_round: Optional[int] = None,

Contributor Author

justinvyu Feb 7, 2025

Note: it's fine to change the ordering of these parameters because they were already keyword-only arguments, so no caller can depend on their position.
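
The note above relies on a general Python property: parameters declared after a bare `*` can only be passed by name, so reordering them does not change what any valid call means. A small self-contained check, using two hypothetical functions `before` and `after` that differ only in parameter order:

```python
import inspect
from typing import Optional


def before(*, label_column: Optional[str] = None, run_config: Optional[dict] = None):
    return {"label_column": label_column, "run_config": run_config}


def after(*, run_config: Optional[dict] = None, label_column: Optional[str] = None):
    return {"label_column": label_column, "run_config": run_config}


# The same keyword call binds identically under both orderings.
call_kwargs = {"label_column": "target", "run_config": {"name": "demo"}}
same_result = before(**call_kwargs) == after(**call_kwargs)

# Every parameter in both signatures is keyword-only.
all_keyword_only = all(
    p.kind is inspect.Parameter.KEYWORD_ONLY
    for fn in (before, after)
    for p in inspect.signature(fn).parameters.values()
)

# Positional calls are rejected either way, so no caller can depend on order.
try:
    after({"name": "demo"})
    positional_rejected = False
except TypeError:
    positional_rejected = True
```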

python/ray/train/xgboost/xgboost_trainer.py (resolved)

python/ray/train/v2/xgboost/xgboost_trainer.py (outdated, resolved)

justinvyu added 6 commits

March 7, 2025 15:53


          Merge branch 'master' of https://github.com/ray-project/ray into xgboost/merge_v2

f9348b2

Signed-off-by: Justin Yu <justinvyu@anyscale.com>


          Add some deprecation warnings that should be uncommented later

e647f08

Signed-off-by: Justin Yu <justinvyu@anyscale.com>


          kwarg is no longer needed

bb57cfe

Signed-off-by: Justin Yu <justinvyu@anyscale.com>


          typo

34925dd

Signed-off-by: Justin Yu <justinvyu@anyscale.com>


          remove xgboost.v2 imports

6cdc470

Signed-off-by: Justin Yu <justinvyu@anyscale.com>


          remove another xgboost.v2 import

eba9896

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

matthewdeng approved these changes


Contributor

matthewdeng left a comment

neat!

python/ray/train/xgboost/xgboost_trainer.py (outdated, resolved)

python/ray/train/xgboost/xgboost_trainer.py (resolved)

justinvyu added 3 commits

March 10, 2025 23:30


          Merge branch 'master' of https://github.com/ray-project/ray into xgboost/merge_v2

c7614d1


          fix commented elif

ea428d9

Signed-off-by: Justin Yu <justinvyu@anyscale.com>


          edit deprecation msg

30b2376

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

justinvyu enabled auto-merge (squash)

March 11, 2025 06:35

github-actions bot added the go label

justinvyu added 3 commits

March 11, 2025 08:54


          fix some warnings

1485ee7

Signed-off-by: Justin Yu <justinvyu@anyscale.com>


          Merge branch 'master' of https://github.com/ray-project/ray into xgboost/merge_v2

615e139


          fix test

1de33ff

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

github-actions bot disabled auto-merge

March 11, 2025 16:04


          delete crazy test

08f9188

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

justinvyu merged commit 9247aff into ray-project:master

5 checks passed

justinvyu deleted the xgboost/merge_v2 branch

March 11, 2025 18:09

hongpeng-guo mentioned this pull request

[Train V2] Fold v2.LightGBMTrainer API into the public trainer class as an alternate constructor #51265

Merged

qinyiyan pushed a commit to qinyiyan/ray that referenced this pull request


          [train] Fold v2.XGBoostTrainer API into the public trainer class as an alternate constructor (ray-project#50045)

Currently, the new `XGBoostTrainer` API is only accessible with a
separate import `ray.train.xgboost.v2.XGBoostTrainer`.

To avoid unnecessary import changes, this PR folds the new API, which
accepts new arguments `(train_loop_per_worker, train_loop_config,
xgboost_config)`, into the public `ray.train.xgboost.XGBoostTrainer`
class.

This also makes some changes in the Ray Train v2 `XGBoostTrainer` class
to improve the migration UX, since it does not support the legacy
`XGBoostTrainer` API at all.

---------

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

park12sj pushed a commit to park12sj/ray that referenced this pull request


          [train] Fold v2.XGBoostTrainer API into the public trainer class as an alternate constructor (ray-project#50045)

7b7de3e

Signed-off-by: Justin Yu <justinvyu@anyscale.com>

dhakshin32 pushed a commit to dhakshin32/ray that referenced this pull request


          [train] Fold v2.XGBoostTrainer API into the public trainer class as an alternate constructor (ray-project#50045)

b1f1887

Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: Dhakshin Suriakannu <d_suriakannu@apple.com>

justinvyu added a commit that referenced this pull request


          [Train V2] Fold v2.LightGBMTrainer API into the public trainer class as an alternate constructor (#51265)

a6e0f80

This PR is a follow-up to #50045. It folds the new LightGBM API into
`ray.train.lightgbm.LightGBMTrainer` so that users don't need to change
their imports to use the new custom train function API.

---------

Signed-off-by: Hongpeng Guo <hpguo@anyscale.com>
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Co-authored-by: Justin Yu <justinvyu@anyscale.com>

han-steve pushed a commit to han-steve/ray that referenced this pull request


          [Train V2] Fold v2.LightGBMTrainer API into the public trainer class as an alternate constructor (ray-project#51265)

9a37c45

Signed-off-by: Hongpeng Guo <hpguo@anyscale.com>
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Co-authored-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: Steve Han <stevehan2001@gmail.com>

hainesmichaelc added the community-backlog label


Labels

community-backlog go