Stop using sklearn's private `_scorer` API for custom metrics in SKLL. #751

desilinguist · 2023-07-07T16:01:56Z

- Update scikit-learn to v1.3.0.
- Move all pre-defined custom metrics to the `_PREDEFINED_CUSTOM_METRICS` dictionary in `metrics.py`.
- Make a copy of the above dictionary as `CUSTOM_METRICS`. This dictionary will be the source of all custom metrics, both pre-defined and user-defined.
- We need two dictionaries because at some points in the code, we need to be able to differentiate between pre-defined custom metrics and user-defined custom metrics.
- Remove all uses of the private `sklearn.metrics._scorer` API and use `sklearn.metrics.get_scorer()` and `sklearn.metrics.get_scorer_names()` instead.
- Use the actual metric function when using custom metrics instead of its name. This is the core change that makes everything work since sklearn validates callables but does not validate custom metric name strings.
- Update tests to not use `_SCORERS` and fix a minor bug in another test related to scikit-learn v1.3.0.

This PR closes #748 and closes #750.

To review this PR, please create some custom metrics for both the Titanic and California examples and try using them via both the API and the configuration file.

codecov · 2023-07-07T16:25:19Z

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.05 🎉

Comparison is base (5fab3a5) 95.24% compared to head (075ab79) 95.30%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #751      +/-   ##
==========================================
+ Coverage   95.24%   95.30%   +0.05%     
==========================================
  Files          29       29              
  Lines        3578     3576       -2     
==========================================
  Hits         3408     3408              
+ Misses        170      168       -2

| Impacted Files | Coverage Δ | |
|---|---|---|
| skll/__init__.py | `100.00% <ø> (ø)` | |
| skll/data/writers.py | `94.11% <ø> (+0.91%)` | ⬆️ |
| skll/experiments/__init__.py | `94.66% <100.00%> (-0.03%)` | ⬇️ |
| skll/learner/__init__.py | `97.19% <100.00%> (+<0.01%)` | ⬆️ |
| skll/learner/utils.py | `93.40% <100.00%> (+0.03%)` | ⬆️ |
| skll/metrics.py | `97.16% <100.00%> (+0.08%)` | ⬆️ |

Frost45

I would like to suggest a couple of changes to the custom_metrics documentation:

- Fix indentation in `custom.py`. There are a few spaces at the start of every line.
- Change `log` to `logs` in the config file on the same page. Otherwise, you get a `KeyError`.

Frost45

Looks great! Tested it out with multiple custom metrics.

skll/metrics.py

mulhod · 2023-07-12T15:11:00Z

Tests passed for me just now. Will review shortly.

skll/__init__.py

Co-authored-by: Matt Mulholland <mulhodm@gmail.com>

desilinguist added 5 commits July 7, 2023 10:20

fix: remove stray ipdb traces.

2b4acf5

test: update tests to not use _SCORERS.

f881121

chore: update scikit-learn to 1.3.0

3b1a910

fix: use correct parameter type in test

373b19b

desilinguist requested review from Frost45, damien2012eng, dblandan, mulhod and tamarl08 July 7, 2023 16:01

Frost45 reviewed Jul 11, 2023

fix: indentation & typos in custom_metrics.rst

e2a209d

desilinguist requested a review from Frost45 July 11, 2023 20:18

Frost45 approved these changes Jul 12, 2023

skll/metrics.py (review thread resolved)

docs: better explain conflicts for custom metrics

1f2d632

mulhod approved these changes Jul 12, 2023

skll/__init__.py (review thread outdated, resolved)

Fix typo in skll/__init__.py.

075ab79

Co-authored-by: Matt Mulholland <mulhodm@gmail.com>

desilinguist merged commit dddaa4e into main Jul 12, 2023

delete-merged-branch bot deleted the 750-overhaul-custom-metrics branch July 12, 2023 17:52
