Skip to content

Conversation

@desilinguist
Copy link
Collaborator

@desilinguist desilinguist commented Jul 7, 2023

  • Update scikit-learn to v1.3.0.
  • Move all custom pre-defined metrics to _PREDEFINED_CUSTOM_METRICS dictionary in metrics.py.
  • Make a copy of the above dictionary as CUSTOM _METRICS. This dictionary will be the source of all custom metrics: pre-defined and user-defined.
  • We need two dictionaries because at some points in the code, we need to be able to differentiate between pre-defined custom metrics and user-defined custom metrics.
  • Remove any use of sklearn.metrics._scorer private API and use sklearn.metrics.get_scorer() and sklearn.metrics.get_scorer_names() instead.
  • Use the actual metric function when using custom metrics instead of its name. This is the core change that makes everything work since sklearn validates callables but does not validate custom metric name strings.
  • Update tests to not use _SCORERS and fix a minor bug in another test related to scikit-learn v1.3.0.

This PR closes #748 and closes #750.

To review this PR, please try to create some custom metrics for both the titanic and California examples and try to use them via both the API and the configuration file.

- Move all custom pre-defined metrics to `_PREDEFINED_CUSTOM_METRICS` dictionary in `metrics.py`.
- Make a copy of the above dictionary as `CUSTOM _METRICS`. This dictionary will be the
  source of all custom metrics: pre-defined and user-definqed.
- We need two dictionaries because at some points in the code, we need to be able to
  differentiate between pre-defined custom metrics and user-defined custom metrics.
- Remove any use of `sklearn.metrics._scorer` private API and use `sklearn.metrics.get_scorer()`
  and `sklearn.metrics.get_scorer_names()` instead.
- Use the actual metric function when using custom metrics instead of its name. This is the core
  change that makes everything work since sklearn validates callables but does not validate
  custom metric name strings.
@codecov
Copy link

codecov bot commented Jul 7, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.05 🎉

Comparison is base (5fab3a5) 95.24% compared to head (075ab79) 95.30%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #751      +/-   ##
==========================================
+ Coverage   95.24%   95.30%   +0.05%     
==========================================
  Files          29       29              
  Lines        3578     3576       -2     
==========================================
  Hits         3408     3408              
+ Misses        170      168       -2     
Impacted Files Coverage Δ
skll/__init__.py 100.00% <ø> (ø)
skll/data/writers.py 94.11% <ø> (+0.91%) ⬆️
skll/experiments/__init__.py 94.66% <100.00%> (-0.03%) ⬇️
skll/learner/__init__.py 97.19% <100.00%> (+<0.01%) ⬆️
skll/learner/utils.py 93.40% <100.00%> (+0.03%) ⬆️
skll/metrics.py 97.16% <100.00%> (+0.08%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link
Contributor

@Frost45 Frost45 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to suggest a couple of changes to the custom_metrics documentation:

  1. Fix indentation in custom.py. There are a few spaces at the start of every line.
  2. Change log to logs in the config file on the same page. Otherwise, you get a KeyError.

@desilinguist desilinguist requested a review from Frost45 July 11, 2023 20:18
Copy link
Contributor

@Frost45 Frost45 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Tested it out with multiple custom metrics.

@mulhod
Copy link
Contributor

mulhod commented Jul 12, 2023

Tests passed for me just now. Will review shortly.

Co-authored-by: Matt Mulholland <mulhodm@gmail.com>
@desilinguist desilinguist merged commit dddaa4e into main Jul 12, 2023
@delete-merged-branch delete-merged-branch bot deleted the 750-overhaul-custom-metrics branch July 12, 2023 17:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Overhaul how custom metrics are used in SKLL Update scikit-learn dependency to v1.3.0

4 participants