Update scikit-learn to 0.24.1 #659

desilinguist · 2021-02-03T22:32:04Z

This PR closes #653.

It pretty much works out of the box except for two changes:

1. Setting the new keyword argument `error_score` to `"raise"` in the `GridSearchCV()` call made in `Learner.train()`, since we want to raise an exception if there was any problem with fitting the estimator. This change is necessary because the new scikit-learn default is to simply return `nan` as the fit score in case of a problem, which does not work for us.
2. `LinearRegression` models in scikit-learn now support a new keyword argument `positive`, which can be set to `True` to use Non-negative Least Squares (NNLS) regression. This is probably something we want to enable in SKLL since it could be useful in RSMTool. This required the fix for #656 (`Learner._check_input_formatting()` does not work for dense featuresets), which has already been merged.
   - Add a test for this new non-negative regression.

Other minor changes:
- Since Python 3.6 is so long in the tooth, I have changed the Linux builds (on Travis) to use Python 3.7 and the Windows builds (on Azure) to use Python 3.8. I am using Python 3.9 locally.
- Update both requirements.txt and conda_requirements.txt to use the new version of scikit-learn.
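
For reviewers unfamiliar with the `positive` option, here is a minimal sketch of what it does in plain scikit-learn (0.24 or later). The synthetic data and variable names are made up for illustration and are not taken from the SKLL tests.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (illustrative only): the second feature has a
# negative true coefficient.
rng = np.random.RandomState(42)
X = rng.rand(200, 3)
true_coef = np.array([2.0, -1.0, 0.5])
y = X @ true_coef + 0.01 * rng.randn(200)

# Ordinary least squares is free to learn negative coefficients.
ols = LinearRegression().fit(X, y)

# With positive=True, the coefficients are constrained to be >= 0 (NNLS).
nnls = LinearRegression(positive=True).fit(X, y)

print("OLS coefficients: ", ols.coef_)
print("NNLS coefficients:", nnls.coef_)
```

The constrained model cannot reproduce the negative coefficient, so its fit on this data is worse than OLS; the option is useful when negative weights are not interpretable.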

- `GridSearchCV` now does not explicitly raise an error if there was an error in fitting the estimator. Rather, it simply returns `nan` as the score. This is not what we want in SKLL, so we set the `error_score` parameter to `"raise"`, which behaves as expected.
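
The difference between the two `error_score` settings can be sketched outside of SKLL as follows. `FlakyRegressor` is a hypothetical estimator written only for this illustration; it is not part of SKLL or scikit-learn, and the call below is not the actual call in `Learner.train()`.

```python
import warnings

import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.model_selection import GridSearchCV


class FlakyRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical estimator whose fit() fails when `fail` is True."""

    def __init__(self, fail=False):
        self.fail = fail

    def fit(self, X, y):
        if self.fail:
            raise ValueError("simulated fitting problem")
        self.mean_ = float(np.mean(y))
        return self

    def predict(self, X):
        return np.full(len(X), self.mean_)


rng = np.random.RandomState(0)
X = rng.rand(30, 2)
y = rng.rand(30)
param_grid = {"fail": [False, True]}

# error_score=np.nan: the failed fits are recorded as nan (with a warning).
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    lenient = GridSearchCV(FlakyRegressor(), param_grid, cv=3,
                           error_score=np.nan, refit=False)
    lenient.fit(X, y)
nan_scores = np.isnan(lenient.cv_results_["mean_test_score"])

# error_score="raise": the exception from fit() propagates to the caller.
strict = GridSearchCV(FlakyRegressor(), param_grid, cv=3,
                      error_score="raise", refit=False)
try:
    strict.fit(X, y)
    raised = False
except ValueError:
    raised = True
```

`refit=False` is used here only to keep the sketch focused on the scores; it is an assumption of the example, not a statement about how SKLL configures its grid search.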

codecov · 2021-02-03T22:37:58Z

Codecov Report

Merging #659 (1b4b870) into main (a9321a4) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main     #659   +/-   ##
=======================================
  Coverage   95.09%   95.09%           
=======================================
  Files          27       27           
  Lines        3101     3101           
=======================================
  Hits         2949     2949           
  Misses        152      152

| Impacted Files | Coverage Δ |
|---|---|
| `skll/learner/__init__.py` | `96.47% <ø> (ø)` |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a9321a4...1b4b870. Read the comment docs.

mulhod

Awesome!

desilinguist added 9 commits February 2, 2021 16:24

Update requirements files to use scikit-learn 0.24.1

1a08b74

Update Travis CI to use Python 3.7 instead of 3.6.

be4f4c3

Update Azure Pipelines to use Python 3.8.

8a9ade3

Merge branch '656-fix-check-input-formatting-for-dense-featuresets' into 653-update-sklearn-to-0-24-1

cbeaa8b

Add support for non-negative linear regression

922cc79

- This requires setting `_use_dense_features` since sklearn requires non-sparse features for this type of model.
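
As a rough illustration of the dense-features requirement mentioned in this commit message, the sketch below converts a SciPy sparse matrix to a dense array before fitting. The `to_dense` helper and the data are hypothetical; the sketch does not reproduce SKLL's `_use_dense_features` mechanism.

```python
import numpy as np
from scipy.sparse import csr_matrix, issparse
from sklearn.linear_model import LinearRegression


def to_dense(features):
    """Return a dense ndarray, converting from a SciPy sparse matrix if needed."""
    return features.toarray() if issparse(features) else np.asarray(features)


# Illustrative data: a feature matrix with many zeros, stored sparsely.
rng = np.random.RandomState(0)
dense = rng.rand(50, 4)
dense[dense < 0.5] = 0.0
X_sparse = csr_matrix(dense)
y = dense @ np.array([1.0, 0.0, 3.0, 2.0])

# Convert to a dense array before fitting the non-negative model.
X_dense = to_dense(X_sparse)
model = LinearRegression(positive=True).fit(X_dense, y)
```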

Add a test for non-negative regression.

c4f9785

Include the jsonlines file.

7aa6469

Try to reduce convergence warnings in tests.

1b4b870

desilinguist requested review from aoifecahill and mulhod February 3, 2021 22:32

aoifecahill approved these changes Feb 4, 2021

mulhod approved these changes Feb 4, 2021

desilinguist merged commit 07de429 into main Feb 4, 2021

delete-merged-branch bot deleted the 653-update-sklearn-to-0-24-1 branch February 4, 2021 15:14
