Replace boston in ensemble test_forest #16927
Conversation
I think the failures are due to the use of the california dataset; this is a message from one of the failures: But other tests use the california dataset as well, so I don't understand the cause of the failure...
Hi @lucyleeow, the failing test seems to be related to pytest-dev/pytest#6925, as only the CI jobs with pytest 5.4.1 are failing.
It might be because I didn't add
While reviewing your PR, this dataset looks to have pretty poor results on the test sets compared to the training set, which means the default parameters are overfitting:

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_validate

X, y = load_diabetes(return_X_y=True)
results = cross_validate(RandomForestRegressor(random_state=0), X, y,
                         return_train_score=True)
print("train score", results['train_score'].mean())
print("test score", results['test_score'].mean())
# train score 0.9210138461843774
# test score 0.4230661480472566
```
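To illustrate the overfitting point above (my own sketch, not from the thread): constraining the trees, for example with a small `max_depth`, narrows the train/test gap on the same data. The `max_depth=3` value here is an illustrative choice, not a tuned parameter:

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_validate

X, y = load_diabetes(return_X_y=True)

# Default parameters: trees grow until (nearly) pure leaves and memorize the train set.
default_res = cross_validate(RandomForestRegressor(random_state=0),
                             X, y, return_train_score=True)
# Shallow trees: regularized, so train and test scores sit closer together.
shallow_res = cross_validate(RandomForestRegressor(max_depth=3, random_state=0),
                             X, y, return_train_score=True)

default_gap = default_res['train_score'].mean() - default_res['test_score'].mean()
shallow_gap = shallow_res['train_score'].mean() - shallow_res['test_score'].mean()
print("default train/test gap:", default_gap)
print("shallow train/test gap:", shallow_gap)
```

Whether the shallow model generalizes *better* depends on the dataset; the point is only that the train/test gap, i.e. the overfitting symptom quoted above, shrinks.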
Oh wow, that's a big difference. What kinds of tests should I be careful with? I might not be good at assessing this, e.g.,

Edit: should I tune parameters and use the tuned parameters for all the tests?
I would try to use
Hum maybe diabetes is too hard of a dataset to expect good generalization accuracy from the
Or maybe this is good enough for such tests. It's still significantly better than random.
Or feel free to use
I tried using
Tried
Is this a problem with the way the dataset is generated? Since
Gives:

Regardless, happy to keep diabetes as well.
I think that one expects the OOB score to be really close to the score that you will obtain on the test set. So for this test, I would take the diff between the OOB score and the test score and check that it is smaller than

I am really surprised by the diabetes results indeed.
```diff
@@ -389,7 +389,7 @@ def check_oob_score(name, X, y, n_estimators=20):
     assert abs(test_score - est.oob_score_) < 0.1
```
Indeed we are already doing this for the classification. I think it makes sense to do that for the regression.
We only require a comment to mention that in the first case, this is a diff between accuracies and in the second one a diff between R2.
I read the code wrong and thought we were fitting and testing on the same data, which is why I thought oob_score would always be worse. It makes sense. Will fix, thanks @glemaitre
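The OOB-vs-test check discussed above could be sketched standalone like this (my sketch, not the actual `check_oob_score` helper; the dataset, sizes, and thresholds are illustrative assumptions, using a well-behaved `make_regression` dataset instead of diabetes):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# A generated regression problem tends to be easier than diabetes,
# so the OOB R2 and the held-out R2 should land close together.
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

est = RandomForestRegressor(n_estimators=50, oob_score=True, random_state=0)
est.fit(X_train, y_train)

test_score = est.score(X_test, y_test)   # R2 on the held-out split
diff = abs(test_score - est.oob_score_)  # OOB R2 vs test R2, as in the patch
print("test R2:", test_score)
print("OOB R2:", est.oob_score_)
print("diff:", diff)
```

Both quantities estimate generalization on the same distribution, which is why a small tolerance on their difference is a reasonable test; for classification the same diff would be between accuracies rather than R2 scores.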
Thanks @glemaitre. I amended all tests to use the generated regression dataset.
Thanks @lucyleeow |
Reference Issues/PRs
Towards #16155
What does this implement/fix? Explain your changes.
Replaces boston dataset with ~~subset of california housing~~ diabetes dataset in `sklearn/ensemble/tests/test_forest.py`

Any other comments?

~~Did not use diabetes dataset due to poor R2 score and oob score in `test_oob_score_regressors` (as picked by @adrinjalali in prev PR).~~

Poor R2 score in `test_oob_score_regressors` with diabetes dataset. Happy to change to California/another dataset if this is a problem.