[MRG] Update default test_size of ShuffleSplit for 0.21#13483
[MRG] Update default test_size of ShuffleSplit for 0.21#13483jnothman merged 10 commits intoscikit-learn:masterfrom
Conversation
|
The not doing validation in |
ogrisel
left a comment
There was a problem hiding this comment.
Overall looks good but I think we need new tests in test_train_test_split to cover the new behavior when train_size is provided but test_size is left to it's default value.
# simple test
split = train_test_split(X, y, test_size=None, train_size=.5)
X_train, X_test, y_train, y_test = split
assert_equal(len(y_test), len(y_train))should be changed to:
# simple test
split = train_test_split(X, y, train_size=.5)
X_train, X_test, y_train, y_test = split
assert_equal(len(y_test), len(y_train))and probably something similar for ShuffleSplit / StratifiedShuffleSplit.
| if train_size is not None and train_size_type not in ('i', 'f'): | ||
| raise ValueError("Invalid value for train_size: {}".format(train_size)) | ||
| if test_size is not None and test_size_type not in ('i', 'f'): | ||
| raise ValueError("Invalid value for test_size: {}".format(test_size)) |
There was a problem hiding this comment.
Technically it should be a TypeError but I guess it's too late to change now as it was already raising a ValueError for this case in released versions of scikit-learn.
|
I added tests for the new default value of |
|
@jeremiedbb the circle ci status come from your account instead of scikit-learn's. Therefore I cannot restart those to check if the sphinx-gallery error can go away with a rebuild. |
|
I changed it and reran the CI. |
…rn#13483) completing deprecation
…ikit-learn#13483)" This reverts commit a805e13.
…ikit-learn#13483)" This reverts commit a805e13.
…rn#13483) completing deprecation
Complement of #13443 in order to clean the deprecations for 0.21
I moved the
_splitpart a this separate PR because the change is not straightforward and needs more work.I've set
test_sizedefault value to None, which will be interpreted as the default value (depending on the estimator) whentrain_sizeis also None.