Upgrade scikit-learn to v1.0.1 #702
- Use 3.8 for Linux
- Use 3.9 for Windows
- scikit-learn is dropping support for `np.matrix` which is what we get from `todense()`, so we need to use `toarray()` instead.
- The old default `squared_loss` has been deprecated and renamed to `squared_error`.
- Scikit-learn v1.0 will be deprecating the `normalize` attribute for linear models
  - This attribute is set to `False` by default in most scikit-learn linear models anyway, so no warnings are surfaced in SKLL.
  - However, for `Lars`, the default value of `normalize` is still set to `True`, so we need to force it to `False` to avoid deprecation warnings.
  - This code will actually lead to an exception in the `_create_estimator()` method when the `normalize` attribute doesn't exist, so that will be the perfect reminder to excise this `if` block entirely when the time comes.
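For reviewers who want to see the `todense()` vs. `toarray()` difference concretely, here is a minimal sketch. It is not taken from the SKLL code base; the variable names and the tiny matrix are made up for illustration, and it assumes NumPy and SciPy are installed.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical sparse feature matrix standing in for vectorized features.
sparse_features = csr_matrix(np.array([[1.0, 0.0, 2.0],
                                       [0.0, 3.0, 0.0]]))

# todense() returns an np.matrix; toarray() returns a plain np.ndarray.
as_matrix = sparse_features.todense()
as_array = sparse_features.toarray()

print(type(as_matrix).__name__)
print(type(as_array).__name__)

# The two types also behave differently: indexing a row of an np.matrix
# keeps it two-dimensional, while an ndarray row is one-dimensional.
print(as_matrix[0].shape)
print(as_array[0].shape)
```

The values are identical either way; only the container type changes, which is why the swap should be safe wherever the downstream code does not rely on `np.matrix` semantics.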
Codecov Report
@@ Coverage Diff @@
## main #702 +/- ##
==========================================
+ Coverage 96.85% 96.89% +0.03%
==========================================
Files 63 63
Lines 9098 9197 +99
==========================================
+ Hits 8812 8911 +99
Misses 286 286
Should the |
Good catch! I will modify it to be the same as the requirements files.
Frost45
left a comment
Looks great 🎉
mulhod
left a comment
LGTM
This PR closes #699.
This change is pretty straightforward, but it is definitely backwards incompatible, which we will reflect in the release version when we put together the release.
The changes specifically motivated by the upgrade are:
- Update `requirements.txt` and `conda_requirements.txt` to point to the latest scikit-learn (v1.0.1) and allow up to v1.0.2.
- Use `squared_error` as the default value of the `loss` parameter for `RANSACRegressor`.
- Use `toarray()` for converting sparse matrices to dense instead of `todense()`, since the latter returns an `np.matrix` instead of an `np.ndarray`. Scikit-learn is planning to drop support for `np.matrix` inputs and is already displaying `FutureWarning`s.
- Scikit-learn v1.0 will be deprecating the `normalize` attribute for linear models
  - This attribute is set to `False` by default in most scikit-learn linear models anyway, so no warnings are surfaced in SKLL.
  - However, for `Lars`, the default value of `normalize` is still set to `True`, so we need to force it to `False` to avoid `FutureWarning` instances.
  - The `if` statement added will actually lead to an exception in the `_create_estimator()` method when the `normalize` attribute doesn't exist, so that will be the perfect reminder to excise it entirely when the time comes.

Other changes include: