[MRG] Fix predict_proba not fitted check in SGDClassifier #10961
lesteve merged 6 commits into scikit-learn:master
Conversation
- Remove not fitted check from predict_proba method of SGDClassifier
- Check only while calling predict_proba
jnothman
left a comment
I think that is the right fix. Please add a non-leash regression test
I understand non-regression testing as described in the contributing guidelines, but how is a non-leash regression test different from that?
I think @jnothman meant a non-regression test (maybe autocorrect or something?). Great if you understand what it is from the contributing guidelines. Can you add a non-regression test then?
Suggestion for non-regression test (add a test function in

```python
from sklearn.linear_model import SGDClassifier

clf = SGDClassifier()
clf.predict_proba
clf.predict_log_proba
```
@lesteve There already exists a test case for the predict_proba method (test_sgd_proba); shouldn't it be modified instead of writing a separate test case?
Not really important, but I would rather put that in a separate test function. It is a bit of a special case to just test that you can access the attribute.
- Test if the predict_proba and predict_log_proba methods can be accessed before fitting
- Test if the not fitted check is performed before calling the proba methods
```python
# Checks if not fitted check is performed while calling
# the methods
assert_raises(NotFittedError, clf.predict_proba, [[3, 2]])
```

```python
# is accessible for referencing before fitting
# the SGD classifier
clf = SGDClassifier()
assert_false(hasattr(clf, "predict_proba"))
```
Please just use bare assert not ... rather than assert_false. Thanks
- Remove extra line
- Add space after comma
- Use assert instead of assert_false
```python
# Checks if not fitted check is performed while calling
# the methods
assert_raises(NotFittedError, clf.predict_proba, [[3, 2]])
```
I would expect the NotFittedError case to be already tested in test_common.py so I would remove this from the test.
```python
for loss in ["log", "modified_huber"]:
    clf = SGDClassifier(loss=loss)
    assert_true(hasattr(clf, "predict_proba"))
```

```python
# is accessible for referencing before fitting
# the SGD classifier
clf = SGDClassifier()
assert not hasattr(clf, "predict_proba")
```
Maybe check this for all the losses that do not support predict_proba; you can use SGD.loss_functions to get all the possible losses, I think.
Slight improvement would be to also check the error message:

```python
with pytest.raises(AttributeError,
                   match='probability estimates are not available for loss={!r}'.format(loss)):
```

- Remove test for NotFittedError
|
I pushed some minor tweaks; I think this can be merged when CIs are green.
|
Perhaps this needs a changelog entry.
I am a bit undecided, but I would say that this is rather an obscure problem (trying to only access predict_proba on an unfitted estimator).
Reference Issues/PRs
Fixes #10938
What does this implement/fix? Explain your changes.
Fixes the not fitted check in the predict_proba method so that it doesn't throw a not fitted error when the method is merely referenced. The check for whether the classifier is fitted is now performed when the method is called.
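The shape of the fix can be illustrated with a toy class (illustrative only, not scikit-learn's actual implementation; all names are hypothetical): exposing predict_proba as a property means attribute access only validates the loss, while the fitted check moves into the method that runs on an actual call.

```python
class NotFittedError(ValueError):
    """Raised when predict_proba is *called* before fit."""


class ToySGDClassifier:
    # Losses with probability support, mirroring "log" and "modified_huber".
    PROBA_LOSSES = ("log", "modified_huber")

    def __init__(self, loss="hinge"):
        self.loss = loss
        self._fitted = False

    def fit(self, X, y):
        self._fitted = True
        return self

    @property
    def predict_proba(self):
        # Attribute access only validates the loss, so
        # hasattr(clf, "predict_proba") works on an unfitted estimator.
        if self.loss not in self.PROBA_LOSSES:
            raise AttributeError(
                "probability estimates are not available for loss=%r" % self.loss)
        return self._predict_proba

    def _predict_proba(self, X):
        # The not fitted check now runs at call time, not at attribute access.
        if not self._fitted:
            raise NotFittedError("Call fit before predict_proba.")
        return [[0.5, 0.5] for _ in X]
```

With this shape, `hasattr(ToySGDClassifier(loss="hinge"), "predict_proba")` is False, while an instance with `loss="log"` can have predict_proba referenced before fitting and only raises NotFittedError when it is actually called.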
Any other comments?