[MRG+1] MAINT Replace manual checks with check_is_fitted #13013
[MRG+1] MAINT Replace manual checks with check_is_fitted #13013jnothman merged 36 commits intoscikit-learn:masterfrom agamemnonc:check_is_fitted_replacements
check_is_fitted #13013Conversation
|
Although there are quite some changes that are PEP8 related and not directly related to this PR, LGTM, if tests pass. |
Thanks. Indeed, as per the description above, I addressed all formatting issues in the modified files so that autopep8/flake8 checks passed before submitting the PR. |
jnothman
left a comment
There was a problem hiding this comment.
The core parts of this PR seem okay, when I can find them
Thanks for reviewing, OK, I could revert the formatting changes if that would be preferred (your first two comments above)? |
|
@agamemnonc Could you revert the style changes. I can make a review then. |
OK, done; @glemaitre please review. Of course, there are now some flake8 warnings on the modified files (mostly due to long lines). |
|
I also think that we should change the common test: scikit-learn/sklearn/utils/estimator_checks.py Lines 1586 to 1616 in fdf2f38 with something like: @ignore_warnings
def check_estimators_unfitted(name, estimator_orig):
"""Check that predict raises an exception in an unfitted estimator.
Unfitted estimators should raise a NotFittedError.
"""
# Common test for Regressors, Classifiers and Outlier detection estimators
X, y = _boston_subset()
estimator = clone(estimator_orig)
msg = ("{} instance is not fitted yet. Call 'fit' with appropriate "
"arguments".format(estimator.__class__.__name__))
for method in ('decision_function', 'predict', 'predict_proba',
'predict_log_proba'):
if getattr(estimator, method, None) is not None:
assert_raises_regex(NotFittedError, msg,
getattr(estimator, method), X) |
|
@jnothman Do you know why we were checking |
glemaitre
left a comment
There was a problem hiding this comment.
I will check if we don't have redundant tests but you can already address those comments.
|
Regarding the tests, I would propose to change the following: scikit-learn/sklearn/ensemble/tests/test_forest.py Lines 371 to 378 in fdf2f38 @pytest.mark.parametrize('name', FOREST_ESTIMATORS)
def test_unfitted_feature_importances(name):
err_msg = ('This {} instance is not fitted yet. Call 'fit' with appropriate '
'arguments before using this method.'.format(name))
pytest.raises(NotFittedError, match=err_msg):
gettattr(FOREST_ESTIMATORS[name](), 'feature_importances') |
|
Even if we don't touch a public API, I would add an entry in the what's new since we modified the tests and error message. |
This is now fixed. thanks! |
|
You can use |
|
Yes, that's what the code currently looks like: msg = ("{} instance is not fitted yet. Call 'fit' with appropriate "
"arguments".format(estimator.__class__.__name__))
for method in ('decision_function', 'predict', 'predict_proba',
'predict_log_proba'):
if getattr(estimator, method, None) is not None:
assert_raises_regex(NotFittedError, msg,
getattr(estimator, method), X)The problem is I am not too sure how to include the deprecation here, i.e. cover the case where an Attribute or ValueError is raised instead with the previous error message ("fit"), in order to allow that and issue a DeprecationWarning. |
|
Oh I see why you wanted a try except then. Basically use I think something around: with pytest.raises(NotFittedError) as excinfo:
getattr(estimator, method), X)
if not str(excinfo.value) in msg and 'fit' in str.excinfo.value):
# raise deprecation waring
else:
assert 'fit' in str.excinfo.valueI wrote this pretty quickly. That might be buggy |
|
Please resolve conflicts with master. |
jnothman
left a comment
There was a problem hiding this comment.
I appreciate the consistent use of check_is_fitted within the library, but I'm not entirely sure we should be forcing all library developers to use the exact same error message. The default message does not, for instance, mention partial_fit.
|
When you get around to resolving my comments, please also move your change log entry to v0.22.rst as version 0.21 has been released. |
|
@jnothman GaussianProcessRegressor can work without calling WDYT? |
|
I think we have explicitly tried to support GPs without fit. Otherwise you
upset all the bayesians.
|
And do you think that having the tag |
|
@jnothman could you have another look. Apart of the tag I think that the PR is good to be merged. |
NicolasHug
left a comment
There was a problem hiding this comment.
A few niticks.
I think the introduction of the estimator tag is appropriate here.
I'm not pressing "Approve" because TBH I'm not entirely sure what should be done regarding the deprecation, but I'm tending towards LGTM.
|
Thanks @NicolasHug for reviewing. I think I have now addressed the issues you raised, otherwise please let me know. |
|
Thanks @agamemnonc |
Thank you @glemaitre for your input and everyone else for reviewing. |
Reference Issues/PRs
Fixes #12991.
What does this implement/fix? Explain your changes.
Replaces manual checks with
check_is_fittedutility function in various places.Any other comments?
All modified files have been checked with flake8 and autopep8 and any formatting issues have been addressed.