Update scikit-learn to 0.24.1 #659
- `GridSearchCV` no longer raises an error when fitting the estimator fails; by default it simply records `nan` as the score. This is not what we want in SKLL, so we set the `error_score` parameter to `"raise"`, which restores the expected behavior.
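For reviewers who want to see the difference, here is a minimal sketch of the two `error_score` modes. It is not the SKLL code: `FailsOnNegativeAlpha` is a hypothetical toy estimator invented for illustration, and the data is random.

```python
import warnings

import numpy as np
from sklearn.base import BaseEstimator
from sklearn.model_selection import GridSearchCV


class FailsOnNegativeAlpha(BaseEstimator):
    """Hypothetical toy estimator: fit() fails whenever alpha is negative."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, X, y):
        if self.alpha < 0:
            raise ValueError("simulated fitting failure")
        self.mean_ = float(np.mean(y))
        return self

    def predict(self, X):
        return np.full(len(X), self.mean_)

    def score(self, X, y):
        # Negative mean squared error, so that higher is better.
        return -float(np.mean((np.asarray(y) - self.predict(X)) ** 2))


rng = np.random.RandomState(0)
X = rng.rand(20, 3)
y = rng.rand(20)
param_grid = {"alpha": [-1.0, 1.0]}

# Numeric error_score: the failed fit is recorded as nan and the search goes on.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the fit-failure warnings
    lenient = GridSearchCV(FailsOnNegativeAlpha(), param_grid, cv=2,
                           error_score=np.nan, refit=False)
    lenient.fit(X, y)
scores = lenient.cv_results_["mean_test_score"]
failed_fit_scored_nan = bool(np.isnan(scores[0]) and np.isfinite(scores[1]))

# error_score="raise": the original exception propagates out of fit().
strict = GridSearchCV(FailsOnNegativeAlpha(), param_grid, cv=2,
                      error_score="raise", refit=False)
try:
    strict.fit(X, y)
    raised = False
except ValueError:
    raised = True
```

With a numeric `error_score`, the broken parameter setting just shows up as `nan` in `cv_results_`, which is easy to miss; with `"raise"`, the search stops at the first failure.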
…nto 653-update-sklearn-to-0-24-1
- This requires setting `_use_dense_features` since sklearn requires non-sparse features for this type of model.
Codecov Report
@@ Coverage Diff @@
## main #659 +/- ##
=======================================
Coverage 95.09% 95.09%
=======================================
Files 27 27
Lines 3101 3101
=======================================
Hits 2949 2949
Misses 152 152
mulhod
left a comment
Awesome!
This PR closes #653.
It pretty much works out of the box except for two changes:
- Setting the new keyword argument `error_score` to `"raise"` in the `GridSearchCV()` call made in `Learner.train()`, since we want to raise an exception if there is any problem with fitting the estimator. This change is necessary because the new scikit-learn default is to simply return `nan` as the fit score when fitting fails, which does not work for us.
- `LinearRegression` models in scikit-learn now support a new keyword argument `positive`, which can be set to `True` to use Non-negative Least Squares (NNLS) regression. This is probably something we want to enable in SKLL since it could be useful in RSMTool. This required the fix for "`Learner._check_input_formatting()` does not work for dense featuresets" (#656), which has already been merged.
  - Add a test for this new non-negative regression.
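As a rough illustration of what `positive=True` does on the scikit-learn side, here is a small sketch on synthetic data. It is not the SKLL test added in this PR; the data, coefficients, and variable names are made up for the example. The features are converted to a dense array before fitting, in line with the dense-features requirement mentioned above.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption for illustration): the second true
# coefficient is negative, so ordinary least squares should recover
# a negative weight for that feature.
rng = np.random.RandomState(0)
X = rng.rand(50, 3)
y = X @ np.array([2.0, -1.0, 0.5]) + 0.01 * rng.randn(50)

# Ordinary least squares: coefficients are unconstrained.
ols = LinearRegression().fit(X, y)

# Non-negative least squares: every coefficient is constrained to be >= 0.
nnls = LinearRegression(positive=True).fit(X, y)

# If the features start out sparse, convert them to a dense array first.
X_sparse = csr_matrix(X)
nnls_from_sparse = LinearRegression(positive=True).fit(X_sparse.toarray(), y)

print("OLS coefficients: ", ols.coef_)
print("NNLS coefficients:", nnls.coef_)
```

The NNLS fit trades some accuracy on this data for coefficients that are all non-negative, which is the property that could be useful for RSMTool.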
Other minor changes:
Since Python 3.6 is so long in the tooth, I have changed the Linux builds (on Travis) to use Python 3.7 and the Windows builds (on Azure) to use Python 3.8. I am using Python 3.9 locally.
Update both `requirements.txt` and `conda_requirements.txt` to use the new version of scikit-learn.