[MRG+1] Instance-level common tests #9019
Conversation
ping @jnothman @GaelVaroquaux @agramfort @vene ;)
- def multioutput_estimator_convert_y_2d(name, y):
+ def multioutput_estimator_convert_y_2d(estimator, y):

this one is to make the other patch simpler...

with the tags PR I will simplify test_common a bit and basically just call
vene
left a comment
(note to self: got up to line ~650)
      check_no_fit_attributes_set_in_init,
      'estimator_name',
      NonConformantEstimator)
+ check_estimators_unfitted("estimator", CorrectNotFittedErrorClassifier())
merge error: line 190 duplicated here
sklearn/utils/estimator_checks.py
Outdated
+     estimator.fit(X, y)
  except ValueError as e:
      if str(e) != errmsg:
          raise ValueError("Estimator {0} raised warning as expected, but "
(not this PR's fault, but) ambiguous message here: raised warning or error?
sklearn/tests/test_common.py
Outdated
-     yield _named_check(check, name), name, Estimator
+     estimator = Estimator()
+     # check this on class
+     yield check_no_fit_attributes_set_in_init, name, Estimator
is this not tested for meta estimators?

Right now the tests for meta-estimators are not very strong. They will be much stronger with the estimator tags branch merged. I tried to keep this PR small; it doesn't increase test coverage much, but it doesn't decrease it anywhere as far as I know.

coverage is a good sanity check for this indeed. Sounds good.

We should still be using _named_check, no?

If we just reframe check_no_fit_attributes_set_in_init as check_no_fit_attributes_set_upon_clone, perhaps we can get away with running it on an instance too.
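A rough sketch of what such a clone-based check could look like. This uses a hypothetical toy estimator and re-implements cloning inline rather than using sklearn's actual helpers, so all names here are illustrative only:

```python
class ToyEstimator:
    """Minimal stand-in for a scikit-learn estimator (hypothetical)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def get_params(self):
        return {"alpha": self.alpha}

    def fit(self, X, y):
        self.coef_ = 0.0  # fitted attribute: note the trailing underscore
        return self


def check_no_fit_attributes_set_upon_clone(estimator):
    # Re-construct from params, as sklearn.base.clone would; the fresh
    # copy must not carry any trailing-underscore (fitted) attributes.
    cloned = type(estimator)(**estimator.get_params())
    fitted = [a for a in vars(cloned)
              if a.endswith("_") and not a.endswith("__")]
    assert not fitted, "clone carries fitted attributes: %r" % (fitted,)


est = ToyEstimator().fit(None, None)  # est itself now has coef_
check_no_fit_attributes_set_upon_clone(est)  # passes: the clone is unfitted
```

The point of the reframing: because the check inspects a clone rather than the object handed in, it works on an already-fitted instance too.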
sklearn/utils/estimator_checks.py
Outdated
  @@ -244,11 +253,20 @@ def check_estimator(Estimator):
      Class to check. Estimator is a class object (not an instance).
sklearn/utils/estimator_checks.py
Outdated
  # greater than the number of features.
  # So we impose a smaller number (avoid "auto" mode)
- estimator.set_params(n_components=1)
+ estimator.set_params(n_components=8)
necessary now that the dictlearning bug is fixed?

Although 1 is very edge-casey here, so I'm not opposed to this change even if it's not strictly necessary. Maybe 2 or so, to shave off a few ms.
  X[X < .8] = 0
  X_csr = sparse.csr_matrix(X)
  y = (4 * rng.rand(40)).astype(np.int)
+ y = multioutput_estimator_convert_y_2d(estimator, y)
was the absence of this line a bug?

I think so. Hm, also why do we need the scaler special cases down there... huh...

scaling/standardizing with with_mean=True raises an exception rather than densifying the data. I think the decision was that users would ignore warnings and would blow up the memory...
Really, the tags for this estimator will be tricky too. Technically, with the default args it does not support sparse inputs.

yeah, I thought we deprecated that behavior when we introduced MaxAbsScaler, but that was as an alternative for MinMaxScaler, not StandardScaler, my bad
  X = 3 * rnd.uniform(size=(20, 3))
  y = X[:, 0].astype(np.int)
- y = multioutput_estimator_convert_y_2d(name, y)
- estimator = Estimator()
i don't have the time to review this now, but in terms of testing rigour,
I'm strongly in support of this concept.
…On 7 Jun 2017 3:44 am, "Vlad Niculae" ***@***.***> wrote:
In sklearn/utils/estimator_checks.py:

> @@ -244,11 +253,20 @@ def check_estimator(Estimator):
>     Class to check. Estimator is a class object (not an instance).

Could be an instance now :)

In sklearn/utils/estimator_checks.py:

>     rnd = np.random.RandomState(0)
>     X = 3 * rnd.uniform(size=(20, 3))
>     y = X[:, 0].astype(np.int)
> -   y = multioutput_estimator_convert_y_2d(name, y)
> -   estimator = Estimator()

not cloned here
  if args[0] == "self":
      # if_delegate_has_method makes methods into functions
      # with an explicit "self", so need to shift arguments
      args = args[1:]
Wow, that was pretty subtle! For reference I made this gist.

What if some clever ass names the first argument something else than self?
It's bad style, and we would probably complain reviewing that code, so I don't think that we should cater for it.

Since this only happens for methods that are passed through @if_delegate_has_method, maybe we could make that decorator raise an error unless its first argument is called "self", just to avoid subtle breakage in the future?
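For illustration, the shift being discussed can be reproduced on a plain function carrying an explicit "self", which is how a delegated method looks to signature inspection (the `predict` stand-in here is hypothetical):

```python
import inspect

def predict(self, X):  # stands in for a method unwrapped into a plain function
    return X

# if_delegate_has_method exposes delegated methods as plain functions with
# an explicit "self", so signature inspection must shift the first argument
args = list(inspect.signature(predict).parameters)
if args and args[0] == "self":
    args = args[1:]
# args is now ['X'], matching what a bound method would report
```

If someone named the first argument something other than "self", the shift silently wouldn't happen, which is exactly why raising an error in the decorator was suggested.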
Most of this PR consists of changes of the form: into: I checked, and we cannot remove the ignore_warnings outright, because cloning a deprecated estimator re-raises the deprecation warning. However, this has the consequence of possibly ignoring more deprecations, because the coverage is wider. This way we might miss it, in the future, if we deprecate e.g. StandardScaler or something and it is used in the common tests.
  except ValueError as e:
      if 'class' not in repr(e):
-         print(error_string_fit, Classifier, e)
+         print(error_string_fit, classifier, e)
Should this print classifier.__class__ or similar, or is this ok?

I think that's fine, but I can also change it if you like.

it's fine, the printout might be more useful the new way.
  # LassoLars stops early for the default alpha=1.0 the iris dataset.
  if name == 'LassoLars':
      estimator = Estimator(alpha=0.)
(not the fault of this PR, but) just curious: why can't we check that n_iter_ is at least 0 instead? LassoLars is doing the right thing: on some datasets there is no work to be done.

feel free to open an issue ;)
      return
  else:
      e = estimator()
I assume this logic will move to the caller of this code instead. Nice!
  X, y = make_blobs(n_samples=100, random_state=0, n_features=4,
                    centers=centers, cluster_std=1.0, shuffle=True)
  X_test = np.random.randn(20, 2) + 4
  estimator = Estimator()
We haven't been consistent with how we ignore warnings in the common tests, and some common tests outright ignore all warnings. I think we should not worry about deprecations in the common tests for now.
re: deprecations, sounds good, and we shouldn't let it slow this PR down. Other than the couple of missing
  for sparse_format in ['csr', 'csc', 'dok', 'lil', 'coo', 'dia', 'bsr']:
      X = X_csr.asformat(sparse_format)
      # catch deprecation warnings
      with ignore_warnings(category=DeprecationWarning):
also here you didn't move the ignore_warnings to the top

yeah, but should be fine, right?

yep, just triggered my mindless pattern-matching review process :P
79bd264 to c636b20
I think I addressed everything. I renamed the check parameters to

@jnothman this should actually be "easy" to review ;)

I'm happy now.
jnothman
left a comment
Is there any difference in runtime due to all the cloning?
You need to add tests to ensure that check_estimator still works on both class and instance.
You should also add a test to check that the object passed into check_estimator is completely unchanged afterwards. This will help to ensure that all current and any future checks include a clone.
I'm sorry if you were hoping for me to give you a green light!
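A sketch of the "object is unchanged afterwards" guard jnothman asks for. The merged PR ended up comparing joblib.hash digests; this illustration instead snapshots vars() with a deepcopy, and all names here (the Toy class, the check functions) are hypothetical:

```python
import copy

class Toy:
    """Minimal estimator stand-in (hypothetical)."""
    def __init__(self, alpha=1.0):
        self.alpha = alpha

def assert_check_leaves_estimator_unchanged(check, estimator):
    # Snapshot the estimator's state before running a check; if the check
    # worked on a clone rather than the object itself, the snapshot still
    # matches afterwards.
    before = copy.deepcopy(vars(estimator))
    check(estimator)
    assert vars(estimator) == before, "check mutated its input"

def well_behaved_check(est):
    clone = Toy(alpha=est.alpha)  # operates on a fresh copy only
    clone.alpha = 0.0

assert_check_leaves_estimator_unchanged(well_behaved_check, Toy())
```

Running every current and future check through such a wrapper is what makes a missing clone() show up as a test failure instead of silent mutation.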
doc/whats_new.rst
Outdated
  - All checks in ``utils.estimator_checks``, in particular
    :func:`utils.estimator_checks.check_estimator` now accept estimator
    instances. Checks other than ``check_estimator`` do not accept
This is currently untrue for check_no_fit_attributes_set_in_init.
  name = Estimator.__name__
  check_parameters_default_constructible(name, Estimator)
  for check in _yield_all_checks(name, Estimator):
  if isinstance(Estimator, type):
I'm not entirely certain this is the right check, but my attempts to break it have failed so far.

https://stackoverflow.com/questions/395735/how-to-check-whether-a-variable-is-a-class-or-not sorry, should have looked at the date...

According to my knowledge and a quick Google, this is the right way to do it.
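A quick sanity check of the idiom, assuming Python 3 (where the old-style-class caveat from that Stack Overflow thread no longer applies):

```python
class Foo:
    pass

# isinstance(x, type) is the idiomatic class-vs-instance test in Python 3:
# classes (including builtins) are themselves instances of type
checks = [
    isinstance(Foo, type),    # a class object
    isinstance(Foo(), type),  # an instance
    isinstance(int, type),    # builtins are classes too
]
```

`checks` comes out as `[True, False, True]`, so check_estimator can branch on it to decide whether it was handed a class or an instance.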
sklearn/utils/estimator_checks.py
Outdated
  def _yield_non_meta_checks(name, Estimator):
  def assert_almost_equal_dense_sparse(x, y, decimal=6, err_msg=''):
      if sparse.issparse(x):
          assert_array_almost_equal(x.data, y.data,
(fwiw, assert_allclose may be preferable...)

name's shorter? (i.e. not obscenely verbose) :D
More seriously, because it allows control in terms of absolute and relative tolerance, which lets you compare very large numbers, very small numbers, etc., while decimals is irrelevant to the internal number format.

TBH, I've seen it said somewhere, and I can't recall the arguments put forth, but I like the brief name, and I like that we're able to use rtol when appropriate, and it's good practice to be thinking about whether absolute or relative tolerance is the issue for a particular case.
  # greater than the number of features.
  # So we impose a smaller number (avoid "auto" mode)
- estimator.set_params(n_components=1)
+ estimator.set_params(n_components=2)
set_testing_params becomes problematic now. We will potentially hide bugs by overriding explicit tests of parameter settings. We should only override these if they are currently at their default values.
Thanks for incidentally changing something in this function so that I was reminded that it was an issue!

why does it become problematic now? wasn't it problematic to begin with?

No, because before we were changing defaults from what was most useful for a user to what is most practical in testing. Now if a developer explicitly passes a non-default instantiation, we're ignoring their wishes.

Do you have a suggestion for how to resolve that? I hadn't thought about it, but I agree.

great point. only override a param if it is not different from the default?

Yes, putting this in test_classifiers or whatever might make sense. Hopefully we don't break too many downstream libraries using check_estimator this way.

or we use an ugly hack, like checking if the estimator comes from sklearn ;) but that wouldn't allow us to specify parameters safely.

that requires adding a use_fast_parameters to all of the checks, right?
Alternatively, we could get rid of the function, and have each estimator declare its testing parameters.

What is the conclusion of this discussion (https://github.com/scikit-learn/scikit-learn/pull/9019/files#r120591375)?
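The "only override a parameter that is still at its default" idea floated above could be sketched like this. All names are hypothetical; sklearn's actual set_checking_parameters works differently:

```python
class ToyProjector:
    """Minimal estimator stand-in (hypothetical)."""
    def __init__(self, n_components=None):
        self.n_components = n_components
    def get_params(self):
        return {"n_components": self.n_components}
    def set_params(self, **params):
        for key, value in params.items():
            setattr(self, key, value)
        return self

def set_checking_params(estimator, fast_values, defaults):
    # Speed-tune a parameter only if the caller left it at its default,
    # so an explicit non-default choice by a developer is respected.
    params = estimator.get_params()
    for key, fast in fast_values.items():
        if key in params and params[key] == defaults.get(key):
            estimator.set_params(**{key: fast})
    return estimator

DEFAULTS = {"n_components": None}
FAST = {"n_components": 2}

tuned = set_checking_params(ToyProjector(), FAST, DEFAULTS)             # overridden
explicit = set_checking_params(ToyProjector(n_components=7), FAST, DEFAULTS)  # kept
```

This is only one of the options discussed; having each estimator declare its own testing parameters would sidestep the function entirely.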
sklearn/utils/estimator_checks.py
Outdated
- def check_estimator_sparse_data(name, Estimator):
+ def check_estimator_sparse_data(name, estimator_org):
As someone who has worked too much on Named Entity Recognition, I can't read "org" as anything but "organisation". Can we use estimator_orig or base_estimator?

base_estimator means something different to me. orig is fine for me
sklearn/utils/estimator_checks.py
Outdated
- def check_estimators_partial_fit_n_features(name, Alg):
+ @ignore_warnings(category=DeprecationWarning)
+ def check_estimators_partial_fit_n_features(name, alg_org):
sklearn/utils/estimator_checks.py
Outdated
  @ignore_warnings(category=DeprecationWarning)
  def check_no_fit_attributes_set_in_init(name, Estimator):
      """Check that Estimator.__init__ doesn't set trailing-_ attributes."""
      # STILL ON CLASSES
Once the world of scikit-learn <= 0.18 is forgotten, this comment will look pretty obscure. How about "Takes a class rather than an estimator instance as input"?

also it's nice to have fewer comments that yell at the reader 😛
sklearn/utils/estimator_checks.py
Outdated
  def check_parameters_default_constructible(name, Estimator):
      # THIS ONE IS STILL ON CLASSES
Should we be doing type(estimator)() instead?

Well, but I want to support non-default-constructible estimators in check_estimator, like GridSearchCV.

Okay, I guess. At some point in the future, I think we will want a way to specify testing parameters for non-default-constructibles.
@jnothman I was hoping for you to nag as much as possible ;) there are some tests on instances and some on classes, but I should make that more explicit.

should be good now.

LGTM. +1 for merge. Does anybody else want to comment, or should we merge?

I should have either tested locally on old scipy, or you should set up CI for PRs onto your fork 😛 in my defence I did check with modern scipy.
- e = Estimator()
- set_testing_parameters(e)
+ def check_estimators_empty_data_messages(name, estimator_orig):
+     e = clone(estimator_orig)
meh, I just saw that there is also est all over the place. Nevermind.
genuine travis failure on py27 ubuntu; RandomForest fails the pickle test it seems

fix flake8 error and shorten

can't reproduce :-/

…rn into instance_level_tests
  @@ -870,23 +883,22 @@ def check_estimators_nan_inf(name, Estimator):
      for X_train in [X_train_nan, X_train_inf]:
-         # catch deprecation warnings
-         with ignore_warnings(category=DeprecationWarning):

This is now caught at the check function level.
Congrats, @amueller. I suppose we should start testing some metaestimators etc.
We maybe should update the advice to contrib devs too.
…On 10 Jun 2017 12:34 am, "Andreas Mueller" ***@***.***> wrote: Merged #9019.
* start work on separating instance-level tests
* minor refactoring / fixes to work without tags
* add clone into check_supervised_y_2d estimator check (which made other checks fail)
* remove duplicate check_estimator_unfitted assert
* add issue reference to whatsnew entry
* added some clones, minor fixes from vene's review
* rename estimator arg to estimator_org to make a visible distinction before and after cloning.
* more renaming for more explicit clones
* org -> orig
* allclose, fix orig stuff
* don't use set_testing_parameters in the checks!
* minor fixes for allclose
* fix some test, add more tests on classes
* added the test using pickles.
* move assert_almost_equal_dense_sparse to utils.testing, rename to assert_allclose_sparse_dense, test it
* make assert_allclose_dense_sparse more stringent
* more allclose fixes
* run test_check_estimator on all estimators
* rename set_testing_parameters to set_checking_parameters so nose doesn't think it's a test (and I don't want to import stuff from nose as we want to remove it)
* fix in set_checking_parameters so that common tests pass
* more fixes to assert_allclose_dense_sparse
* rename alg to clusterer, don't scream even though I really want to
* ok this is not a pretty strict test that runs check_estimator with and without fitting on an instance. I also check if ``fit`` is called on the instance that is passed.
* simplify test as they didn't help at all
* it works!!! omfg
* run check_estimator clone test only on one of the configs, don't run locally by default
* Add `slow_test` decorator and documentation
* run test_check_estimator only on some estimators
* fix diags in test for older scipy
* fix pep8 and shorten
* use joblib.hash for inequality check because the pickle state machine is weird

This is pulling out another part of #8022: the instance-level common tests.
This doesn't give a huge boon by itself, but it allows better testing of meta-estimators, because now check_estimator is much more flexible!
Mainly, #8022 is still insanely big and I thought it might be a good idea to split it up.