Skip to content

[WB-7886] Add CatBoost Integration#2975

Merged
dmitryduev merged 35 commits intowandb:masterfrom
ayulockin:ayut-catboost
Feb 17, 2022
Merged

[WB-7886] Add CatBoost Integration#2975
dmitryduev merged 35 commits intowandb:masterfrom
ayulockin:ayut-catboost

Conversation

@ayulockin
Copy link
Copy Markdown
Member

@ayulockin ayulockin commented Dec 1, 2021

Fixes WB-7886
Fixes #965

Description

The PR adds a simple WandbCallback for CatBoost.

The PR currently enables:

  • logging iteration
  • logging train and validation metrics

Testing

yea test and manually tested.

ayulockin and others added 6 commits November 17, 2021 23:01
* improved xgboost wandb_callback

* log config, better logging of metrics, typo

* cleaniness is good

* add feature importance plotting

* best score + iteration

* deprecated callback, docstring, fixes

* define_metric, rename, fixes

* test added

* define metrics fixes

* add command to yea
@ayulockin ayulockin requested a review from morganmcg1 December 1, 2021 22:34
@codecov
Copy link
Copy Markdown

codecov Bot commented Dec 1, 2021

Codecov Report

Merging #2975 (37cc0de) into master (55b885c) will decrease coverage by 0.20%.
The diff coverage is 92.18%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2975      +/-   ##
==========================================
- Coverage   80.15%   79.95%   -0.21%     
==========================================
  Files         210      213       +3     
  Lines       27818    27872      +54     
==========================================
- Hits        22298    22285      -13     
- Misses       5520     5587      +67     
Flag Coverage Δ
functest 56.62% <89.06%> (-0.56%) ⬇️
unittest 69.58% <9.37%> (-0.13%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
wandb/sdk/data_types.py 84.34% <ø> (ø)
wandb/integration/catboost/catboost.py 90.56% <90.56%> (ø)
wandb/__init__.py 91.34% <100.00%> (-0.25%) ⬇️
wandb/catboost/__init__.py 100.00% <100.00%> (ø)
wandb/integration/catboost/__init__.py 100.00% <100.00%> (ø)
wandb/integration/lightgbm/__init__.py 95.45% <100.00%> (ø)
wandb/sdk/internal/profiler.py 95.00% <100.00%> (ø)
wandb/sdk/wandb_artifacts.py 81.68% <100.00%> (ø)
wandb/util.py 85.58% <100.00%> (-0.13%) ⬇️
wandb/integration/metaflow/metaflow.py 52.29% <0.00%> (-32.76%) ⬇️
... and 8 more

@dmitryduev dmitryduev self-requested a review December 3, 2021 18:45
Copy link
Copy Markdown
Member

@morganmcg1 morganmcg1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look good so far, the new XGB callback code is included here too, that should be kept in a separate PR.

There is a log_function function to be added too right?

Comment thread wandb/integration/catboost/catboost.py Outdated
train_pool = Pool(train[features], label=train['label'], cat_features=cat_features)
test_pool = Pool(test[features], label=test['label'], cat_features=cat_features)

model = CatBoostRegressor(iterations=100,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, maybe move the iterations argument to a new line

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻 In general, please re-format the example code. I'd copy it into a dummy file, the run tox -e format -- dummy.py and copy back into docstrings.

Comment thread wandb/integration/catboost/catboost.py Outdated
Copy link
Copy Markdown
Member

@dmitryduev dmitryduev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ayulockin! Mostly LGTM, a few minor comments.

Comment thread wandb/integration/catboost/catboost.py Outdated
train_pool = Pool(train[features], label=train['label'], cat_features=cat_features)
test_pool = Pool(test[features], label=test['label'], cat_features=cat_features)

model = CatBoostRegressor(iterations=100,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻 In general, please re-format the example code. I'd copy it into a dummy file, the run tox -e format -- dummy.py and copy back into docstrings.

Comment thread wandb/integration/catboost/catboost.py Outdated
Comment thread wandb/integration/catboost/catboost.py Outdated
Comment thread functional_tests/catboost/catboost.yea Outdated
- :wandb:runs[0][summary][learn-MultiClass]
- 0.0
- :wandb:runs[0][exitcode]: 0

No newline at end of file
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

formatting

Comment thread functional_tests/catboost/test_catboost.py Outdated
@dmitryduev
Copy link
Copy Markdown
Member

@raubitsj looking at wandb/proto/wandb_telemetry.proto, seems like the stuff to enable telemetry for catboost is already there, is that right?

@ayulockin ayulockin marked this pull request as ready for review December 8, 2021 21:32
Comment thread wandb/integration/catboost/catboost.py Outdated
Comment thread wandb/integration/catboost/catboost.py Outdated
Comment thread mypy.ini
Comment thread tests/test_library_public.py Outdated
Comment thread wandb/integration/catboost/catboost.py
Comment thread wandb/integration/catboost/catboost.py
Comment thread wandb/sdk/data_types.py Outdated
Comment thread wandb/integration/catboost/catboost.py
Comment thread wandb/sdk/wandb_artifacts.py
Comment thread wandb/integration/catboost/catboost.py Outdated
dmitryduev and others added 2 commits February 16, 2022 12:54
Co-authored-by: Katia Patkin <87335417+kptkin@users.noreply.github.com>
Comment thread wandb/integration/catboost/catboost.py Outdated
Comment thread functional_tests/catboost/test_catboost.py Outdated
Comment thread wandb/catboost/__init__.py
@raubitsj
Copy link
Copy Markdown
Contributor

Thanks @ayulockin! Mostly LGTM, a few minor comments.

This is the type of changes needed for telemetry:
80afe3f

Let me know if you need help, hopefully this is a clean change to model after.

@raubitsj
Copy link
Copy Markdown
Contributor

@raubitsj looking at wandb/proto/wandb_telemetry.proto, seems like the stuff to enable telemetry for catboost is already there, is that right?

Responded at: #2975 (comment)

@dmitryduev
Copy link
Copy Markdown
Member

@raubitsj: I added feature usage tracking to telemetry + testing that with yea as you requested. Passes locally and the CI is green, so merging.
Many thanks for this once again, @ayulockin!!

@dmitryduev dmitryduev merged commit f82669a into wandb:master Feb 17, 2022
@ayulockin
Copy link
Copy Markdown
Member Author

Thanks @dmitryduev for shipping it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Enhancement] CatBoost support in W&B

5 participants