
[MRG] MNT Clean some deprecation stuff for 0.21 (#13443)

Merged
ogrisel merged 23 commits into scikit-learn:master from jeremiedbb:clean-deprecation-0.21
Mar 21, 2019

Conversation

@jeremiedbb (Member):

Remove some parameters deprecated since 0.19.
Change the max_iter and tol defaults of SGD (and everything depending on it).


@ignore_warnings(category=(DeprecationWarning, FutureWarning))
def test_tol_and_max_iter_default_values():
# Test that the default values are correctly changed
Member Author:

I'm not sure this whole test is relevant anymore

Member:

Yeah, doesn't look necessary, certainly not if we remove those private attributes

@jnothman (Member):

jnothman commented Mar 13, 2019 via email

@jnothman added this to the 0.21 milestone Mar 13, 2019
assert Xt.data.min() < 0 and Xt.data.max() > 0
Xt = FeatureHasher(alternate_sign=True, non_negative=True,
                   input_type="pair").fit_transform(X)
assert Xt.data.min() > 0
Member:

I think it doesn't hurt to keep this test (but remove the non_negative=False parts), unless you think it's redundant?

Member Author:

This test without the non_negative argument is exactly the same test as test_hasher_alter_sign.

.. deprecated:: 0.19
Passing 'None' to parameter ``accept_sparse`` in methods is
deprecated in version 0.19 and will be removed in 0.21. Use
``accept_sparse=False`` instead.
Member:

I wonder if we should keep this notice; the parameter is still there, after all. The wording is not great either: one cannot "remove passing 'None' to parameter accept_sparse".

What happens when one passes accept_sparse=None with this change?

Member Author:

If X is dense, accept_sparse is ignored, as before.
If X is sparse, accept_sparse=None raises an informative error:

    Parameter 'accept_sparse' should be a string, boolean or list of strings. You provided 'accept_sparse=None'

I think the behavior is good.
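The behavior described above can be mimicked with a small standalone sketch. This is a hypothetical helper for illustration, not scikit-learn's actual check_array code; the function name and signature are invented:

```python
# Hypothetical sketch of the reported validation behavior: dense input
# ignores accept_sparse entirely, while sparse input with an invalid
# value (such as None) raises an informative TypeError.

def validate_accept_sparse(X_is_sparse, accept_sparse):
    """Illustrative stand-in for the accept_sparse check discussed above."""
    if not X_is_sparse:
        return None  # accept_sparse is ignored for dense input
    if not isinstance(accept_sparse, (str, bool, list)):
        raise TypeError(
            "Parameter 'accept_sparse' should be a string, boolean or "
            "list of strings. You provided 'accept_sparse=%r'." % (accept_sparse,)
        )
    return None

# Dense input: no error even with accept_sparse=None.
validate_accept_sparse(X_is_sparse=False, accept_sparse=None)

# Sparse input with None: informative error.
try:
    validate_accept_sparse(X_is_sparse=True, accept_sparse=None)
except TypeError as exc:
    print(exc)
```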

@jeremiedbb (Member, Author):

jeremiedbb commented Mar 14, 2019

The failing tests come from test_estimators with name=PassiveAggressiveClassifier and check=check_class_weight_classifiers. It's due to the change of default for the tol parameter from None to 1e-3. Now it converges too quickly (only ~20 iterations, although max_iter=1000) and the score is not high enough.

Do I have to add a special case for this estimator and set a smaller tol?

@jnothman (Member):

Is this a bad default tol for PassiveAggressiveClassifier?

We could just add the poor_score tag to PassiveAggressiveClassifier and perhaps note that this is due to too large a tolerance for now? @amueller, wdyt?

@jeremiedbb (Member, Author):

jeremiedbb commented Mar 19, 2019

OK, so it turns out it's not a tol issue. The issue is that due to very noisy data and a very small number of samples, the loss is very noisy in the first iterations.

For now I fixed it by setting the n_iter_no_change parameter to 20.
(I also tried increasing the number of samples, but I had to go up to 500, and since that affects all the estimators, the test would take significantly longer.)

@jeremiedbb (Member, Author):

jeremiedbb commented Mar 19, 2019

After discussing with @ogrisel, we thought that the default params of SGD might not be best for very small datasets. We could change n_iter_no_change to "auto", which would be higher for very small datasets. We could instead add another parameter, min_iter. What do you think?
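For illustration only, one entirely hypothetical shape such an "auto" heuristic could take (nothing like this is implemented; the threshold and formula are made up for the sake of discussion):

```python
import math

# Hypothetical resolution of n_iter_no_change='auto': keep the usual
# default of 5 on large datasets, be more patient on very small ones
# where the per-epoch loss is noisy. Both the 1000-sample cutoff and
# the 100/sqrt(n) formula are invented placeholders.

def resolve_n_iter_no_change(n_iter_no_change, n_samples):
    if n_iter_no_change == "auto":
        if n_samples >= 1000:
            return 5
        return max(5, math.ceil(100 / math.sqrt(n_samples)))
    return n_iter_no_change

print(resolve_n_iter_no_change("auto", 10000))  # large dataset
print(resolve_n_iter_no_change("auto", 25))     # tiny dataset
```

A min_iter parameter would instead guarantee a floor on the number of epochs regardless of the stopping criterion; the trade-off raised above is that a good fixed value is hard to pick independently of dataset size.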

@jeremiedbb (Member, Author):

I forgot some things in sgd (the deprecated loss_function) and in _split (the change of the test_size default value).
I should have everything now.

@jnothman (Member) left a comment:

min_iter in SGD might be reasonable except that it still might be hard to set a good value that is independent of dataset size. Proposed heuristic for n_iter_no_change='auto'?

The stopping criterion. If it is not None, the iterations will stop
when (loss > previous_loss - tol). Defaults to 1e-3 from 0.21.
Member:

This should probably be updated to reflect n_iter_no_change

Member:

Not essential to this PR

return
# n_iter deprecation, set self._max_iter, self._tol

self._tol = self.tol
Member:

I'm not sure we need these private attributes anymore. I think they were invented for the deprecation

Member Author:

Right, I removed them.



The default will change in version 0.21. It will remain 0.2 only
if ``train_size`` is unspecified, otherwise it will complement
the specified ``train_size``.
Member:

I'm not sure how you can implement this by setting test_size=0.2 by default. We still need a placeholder value that acts as the complement (or 0.2) when unspecified, and we still want to support the case where both train_size and test_size are set explicitly.

Member Author:

There's a validation function for train_size and test_size: _validate_shuffle_split.
It does what's described. I just added an error that is raised when both are None in train_test_split.

Member:

I could be mistaken, but I don't think you're right. We agree on:

ShuffleSplit() => (train=90%, test=10%)
ShuffleSplit(test_size=.2) => (train=80%, test=20%)
ShuffleSplit(test_size=.2, train_size=.5) => (train=50%, test=20%)

But while I think this PR (and IIRC the code before 0.19) does:

ShuffleSplit(train_size=.8) => (train=80%, test=10%)

the documented (and desired) behaviour is:

ShuffleSplit(train_size=.8) => (train=80%, test=20%)

Similarly

ShuffleSplit(train_size=.99) => (train=99%, test=1%)

not an error.

Member Author:

Ahh, OK, I misunderstood the desired behavior.

Member Author:

I don't see how we can have a default test_size=0.2 and both

ShuffleSplit(train_size=.5) => (train=50%, test=50%)
ShuffleSplit(train_size=.5, test_size=.2) => (train=50%, test=20%)

There's no way to know whether test_size=0.2 has been explicitly set by the user.
What we could do is set the default to None and:

  • set it to 0.2 if train_size is None
  • complement train_size otherwise
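That resolution rule could be sketched as follows. The function name is illustrative, default_test stands for the unspecified-case default (0.2 here, per the train_test_split discussion; ShuffleSplit's documented default is 0.1), and integer sizes, which the real _validate_shuffle_split also supports, are ignored:

```python
from math import ceil, floor

# Sketch of the proposed test_size=None resolution: an unspecified
# test_size becomes the complement of train_size, or a fixed default
# when train_size is also unspecified. Fractional sizes only.

def resolve_split(n_samples, train_size=None, test_size=None,
                  default_test=0.2):
    """Return (n_train, n_test) for the given fractions."""
    if test_size is None:
        if train_size is not None:
            test_size = 1.0 - train_size   # complement train_size
        else:
            test_size = default_test       # both unspecified
    n_test = ceil(test_size * n_samples)
    if train_size is None:
        n_train = n_samples - n_test       # complement test_size
    else:
        n_train = floor(train_size * n_samples)
    return n_train, n_test

# Mirrors the examples in the thread (n_samples=100):
print(resolve_split(100, train_size=0.5))                 # (50, 50)
print(resolve_split(100, train_size=0.5, test_size=0.2))  # (50, 20)
```

Under this rule both cases from the comment above hold, because test_size=None is distinguishable from an explicitly passed test_size=0.2, which a 0.2 default value could never be.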

Member Author:

I moved this part to #13483 because it needs more work

@ogrisel (Member) left a comment:

Thank you very much, this looks good. Merging.

@ogrisel ogrisel merged commit e574990 into scikit-learn:master Mar 21, 2019
xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019
koenvandevelde pushed a commit to koenvandevelde/scikit-learn that referenced this pull request Jul 12, 2019
@jeremiedbb deleted the clean-deprecation-0.21 branch July 20, 2020