[MRG] Update default test_size of ShuffleSplit for 0.21 by jeremiedbb · Pull Request #13483 · scikit-learn/scikit-learn

jeremiedbb · 2019-03-20T16:14:42Z

Complement of #13443 in order to clean the deprecations for 0.21

I moved the _split part to this separate PR because the change is not straightforward and needs more work.

I've set the default value of test_size to None. When train_size is also None, it is interpreted as a default value that depends on the estimator.
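For readers following along, here is a minimal sketch of how the None default described above could be resolved. It is not the code from this PR: `split_counts` is a hypothetical helper, the 0.1 fallback is only a placeholder for the per-estimator default, and the rounding choices (ceil for test, floor for train) are assumptions.

```python
from math import ceil, floor


def split_counts(n_samples, test_size=None, train_size=None,
                 default_test_size=0.1):
    """Hypothetical helper: return (n_train, n_test) for the given sizes."""
    # Use the per-estimator default only when neither size was given.
    if test_size is None and train_size is None:
        test_size = default_test_size

    def to_count(size, rounding):
        # Floats are fractions of the dataset, ints are absolute counts.
        if size is None:
            return None
        if isinstance(size, float):
            return int(rounding(size * n_samples))
        return int(size)

    n_test = to_count(test_size, ceil)
    n_train = to_count(train_size, floor)

    # Whichever side is still unset becomes the complement of the other.
    if n_train is None:
        n_train = n_samples - n_test
    elif n_test is None:
        n_test = n_samples - n_train
    return n_train, n_test
```

With this sketch, `split_counts(10, train_size=0.5)` yields an even split, because test_size is left as None and becomes the complement of train_size instead of the default.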

sklearn/model_selection/_split.py

jnothman · 2019-03-20T21:41:01Z

Not validating in __init__ is a convention for objects that support set_params. CV splitters do not, so it doesn't apply here. But I'm not sure it matters whether we validate at that point. The code can certainly be factored more nicely.

ogrisel

Overall looks good, but I think we need new tests in test_train_test_split to cover the new behavior when train_size is provided but test_size is left at its default value.

    # simple test
    split = train_test_split(X, y, test_size=None, train_size=.5)
    X_train, X_test, y_train, y_test = split
    assert_equal(len(y_test), len(y_train))

should be changed to:

    # simple test
    split = train_test_split(X, y, train_size=.5)
    X_train, X_test, y_train, y_test = split
    assert_equal(len(y_test), len(y_train))

and probably something similar for ShuffleSplit / StratifiedShuffleSplit.

ogrisel · 2019-03-21T15:06:06Z

sklearn/model_selection/_split.py

+    if train_size is not None and train_size_type not in ('i', 'f'):
+        raise ValueError("Invalid value for train_size: {}".format(train_size))
+    if test_size is not None and test_size_type not in ('i', 'f'):
+        raise ValueError("Invalid value for test_size: {}".format(test_size))


Technically it should be a TypeError but I guess it's too late to change now as it was already raising a ValueError for this case in released versions of scikit-learn.

yeah I guess so...
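To illustrate the point discussed above, here is a small self-contained sketch of the type check. It assumes the `*_size_type` variables in the diff hold NumPy-style dtype kind codes ('i' for integers, 'f' for floats); `size_kind` and `check_sizes` are hypothetical names, and the pure-Python kind detection only approximates what NumPy would report.

```python
def size_kind(value):
    """Hypothetical stand-in for a NumPy dtype kind code."""
    if isinstance(value, bool):
        return 'b'  # bool is a subclass of int, so check it first
    if isinstance(value, int):
        return 'i'
    if isinstance(value, float):
        return 'f'
    return 'O'  # anything else (str, list, ...) is treated as an object


def check_sizes(test_size=None, train_size=None):
    """Raise ValueError when a provided size is neither an int nor a float."""
    if train_size is not None and size_kind(train_size) not in ('i', 'f'):
        raise ValueError(
            "Invalid value for train_size: {}".format(train_size))
    if test_size is not None and size_kind(test_size) not in ('i', 'f'):
        raise ValueError(
            "Invalid value for test_size: {}".format(test_size))
```

A call such as `check_sizes(train_size="half")` raises ValueError rather than TypeError, which matches the backward-compatible behavior the reviewers chose to keep.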

sklearn/model_selection/_split.py

ogrisel

I found another problem:

sklearn/model_selection/_split.py

jeremiedbb · 2019-03-21T16:17:39Z

I added tests for the new default value of test_size for all these classes/functions.

sklearn/model_selection/tests/test_split.py

ogrisel · 2019-03-24T16:19:06Z

@jeremiedbb the CircleCI status comes from your account instead of scikit-learn's. Therefore I cannot restart the builds to check whether the sphinx-gallery error goes away with a rebuild.

jeremiedbb · 2019-03-25T09:47:35Z

I changed it and reran the CI.

jnothman

Thanks @jeremiedbb!

jeremiedbb added 2 commits March 20, 2019 17:01

new test train size default

d99f5f3

update tests

3633109

jeremiedbb commented Mar 20, 2019

explicit use of default_test_size

f19a172

jeremiedbb mentioned this pull request Mar 20, 2019

[MRG] MNT Clean some deprecation stuff for 0.21 #13443

Merged

jeremiedbb changed the title ~~[WIP] Update default test_size for 0.21~~ [WIP] Update default test_size of ShuffleSplit for 0.21 Mar 20, 2019

remove validation in init

0d3d6a6

jeremiedbb changed the title ~~[WIP] Update default test_size of ShuffleSplit for 0.21~~ [MRG] Update default test_size of ShuffleSplit for 0.21 Mar 21, 2019

ogrisel requested changes Mar 21, 2019

nitpick

9773524

ogrisel requested changes Mar 21, 2019

jeremiedbb added 3 commits March 21, 2019 16:41

add test for train test split new default test size

8a24fac

add test for ShuffleSplit new default test size

78a5aab

add test group shuffle split default test size

300e0fc

ogrisel reviewed Mar 22, 2019

make default test size attribute

d9c4807

ogrisel approved these changes Mar 22, 2019

rerun ci

7821f65

jnothman approved these changes Mar 25, 2019

jnothman merged commit 358c692 into scikit-learn:master Mar 25, 2019

stsievert mentioned this pull request Mar 31, 2019

Model selection tests are raising ImportError on scikit-learn master dask/dask-ml#488

Closed

xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019

API/MNT Update default test_size of ShuffleSplit for 0.21 (scikit-lea…

a805e13

…rn#13483) completing deprecation

xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019

Revert "API/MNT Update default test_size of ShuffleSplit for 0.21 (sc…

a97b3ee

…ikit-learn#13483)" This reverts commit a805e13.

xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019

Revert "API/MNT Update default test_size of ShuffleSplit for 0.21 (sc…

65bd51a

…ikit-learn#13483)" This reverts commit a805e13.

trevorstephens mentioned this pull request May 25, 2019

Scikit-learn 0.21.1 strange failure using nosetests with import train_test_split #13943

Closed

koenvandevelde pushed a commit to koenvandevelde/scikit-learn that referenced this pull request Jul 12, 2019

API/MNT Update default test_size of ShuffleSplit for 0.21 (scikit-lea…

b90bb13

…rn#13483) completing deprecation

cmarmo mentioned this pull request Jul 16, 2020

[MRG] Fix unset local variable due to missing else #12695

Closed

jnothman merged 10 commits into scikit-learn:master from jeremiedbb:split-train-test-size-default

3 participants
