[WB-7886] Add CatBoost Integration #2975
dmitryduev merged 35 commits into wandb:master from ayulockin:ayut-catboost
* improved xgboost wandb_callback
* log config, better logging of metrics, typo
* cleaniness is good
* add feature importance plotting
* best score + iteration
* deprecated callback, docstring, fixes
* define_metric, rename, fixes
* test added
* define metrics fixes
* add command to yea
Codecov Report
@@ Coverage Diff @@
## master #2975 +/- ##
==========================================
- Coverage 80.15% 79.95% -0.21%
==========================================
Files 210 213 +3
Lines 27818 27872 +54
==========================================
- Hits 22298 22285 -13
- Misses 5520 5587 +67
morganmcg1 left a comment
Looks good so far. The new XGB callback code is included here too; that should be kept in a separate PR.
There is a log_function function to be added too, right?
train_pool = Pool(train[features], label=train['label'], cat_features=cat_features)
test_pool = Pool(test[features], label=test['label'], cat_features=cat_features)

model = CatBoostRegressor(iterations=100,
Nit: maybe move the iterations argument to a new line.
👍🏻 In general, please re-format the example code. I'd copy it into a dummy file, then run tox -e format -- dummy.py and copy it back into the docstrings.
Thanks @ayulockin! Mostly LGTM, a few minor comments.
- Tested it out using their basic tutorial https://github.com/catboost/tutorials/blob/master/python_tutorial.ipynb -- all works well, including things like logging custom metrics.
- Please rm the xgboost stuff from this PR.
- Also, @raubitsj, I saw you were gonna add telemetry changes to the xgboost PR, would you mind doing the same here please?
- :wandb:runs[0][summary][learn-MultiClass]
- 0.0
- :wandb:runs[0][exitcode]: 0
@raubitsj looking at
Co-authored-by: Dmitry Duev <dmitryduev@users.noreply.github.com>
Co-authored-by: Katia Patkin <87335417+kptkin@users.noreply.github.com>
This is the type of change needed for telemetry: Let me know if you need help; hopefully this is a clean change to model after.
Responded at: #2975 (comment)
@raubitsj: I added feature usage tracking to telemetry + testing that with
Thanks @dmitryduev for shipping it.
Fixes WB-7886
Fixes #965
Description
The PR adds a simple WandbCallback for CatBoost. The PR currently enables:
Testing
Tested with a yea test and manually.
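For readers unfamiliar with how a CatBoost callback hooks into training, here is a minimal sketch of the general pattern. It is not the code from this PR. It assumes CatBoost's callback protocol, where `after_iteration(info)` is called with an object carrying `iteration` and `metrics` and returns a boolean saying whether to continue. `MetricLoggingCallback`, `log_fn`, and `metric_period` are hypothetical names chosen for illustration. The `<dataset>-<metric>` key format is only inferred from the `learn-MultiClass` summary key in the test assertions above. The training loop is simulated with `SimpleNamespace`, so neither `catboost` nor `wandb` is needed to run it.

```python
from types import SimpleNamespace
from typing import Callable, Dict, List


class MetricLoggingCallback:
    """Hypothetical stand-in for a CatBoost logging callback (not the PR's code)."""

    def __init__(self, log_fn: Callable[[Dict[str, float]], None], metric_period: int = 1):
        # log_fn is injected so the sketch runs without wandb;
        # a real integration would presumably pass wandb.log here.
        self.log_fn = log_fn
        self.metric_period = metric_period

    def after_iteration(self, info) -> bool:
        # Log only every `metric_period` iterations.
        if info.iteration % self.metric_period == 0:
            row: Dict[str, float] = {"iteration": info.iteration}
            # info.metrics is assumed to map dataset -> metric name -> list of values.
            for dataset, metrics in info.metrics.items():
                for name, history in metrics.items():
                    row[f"{dataset}-{name}"] = history[-1]
            self.log_fn(row)
        return True  # True means "keep training"


# Simulated training loop: four iterations with a decreasing loss.
logged: List[Dict[str, float]] = []
callback = MetricLoggingCallback(logged.append, metric_period=2)

history: List[float] = []
for iteration, value in enumerate([0.9, 0.5, 0.2, 0.1], start=1):
    history.append(value)
    info = SimpleNamespace(iteration=iteration, metrics={"learn": {"MultiClass": history}})
    callback.after_iteration(info)

for row in logged:
    print(row)
```

In real use, an object like this would be passed through the `callbacks` argument of a CatBoost model's `fit` method. The injected `log_fn` is a design choice for the sketch only: it keeps the example testable without a W&B run.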