[WIP] Test determinism of estimators #7270
betatim wants to merge 5 commits into scikit-learn:main from betatim:deterministic
Conversation
Created a test that fits two copies of each estimator and compares their predictions on unseen data.
sklearn/tests/test_common.py (Outdated)

    X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.5)

    for name, Estimator in all_estimators(type_filter=['classifier',
                                                       'regressor']):
Transformers and unsupervised estimators still to do.
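For the transformer case, a minimal sketch of what such a check could look like (the estimator, dataset, and sizes are only illustrative, not part of this PR):

```python
# Illustrative sketch: fit two copies of a transformer and compare their
# transform() output on held-out data, analogous to the predict() check.
from numpy.testing import assert_array_almost_equal
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split

X, _ = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, X_test = train_test_split(X, train_size=0.5, random_state=0)

t1 = PCA(n_components=2, random_state=0)
t2 = PCA(n_components=2, random_state=0)
assert_array_almost_equal(t1.fit(X_train).transform(X_test),
                          t2.fit(X_train).transform(X_test))
```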
sklearn/tests/test_common.py (Outdated)

    est1.fit(X_train, y_train)
    est2.fit(X_train, y_train)

    assert_array_almost_equal(est1.predict(X_test),
Shouldn't this be assert_array_equal instead of assert_array_almost_equal?
This works for classifiers and regressors; for the latter we need almost equal.
But can we call a method deterministic if it does not always give exactly the same result?
On a machine with floating-point precision? I would.
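To make the equal-versus-almost-equal distinction concrete: assert_array_equal demands bit-identical values, whereas assert_array_almost_equal tolerates tiny rounding differences, which is the kind of variation floating-point arithmetic can introduce (for example from a different summation order). A small, generic illustration:

```python
import numpy as np
from numpy.testing import assert_array_almost_equal, assert_array_equal

a = np.array([0.1 + 0.2, 1.0])   # 0.30000000000000004 due to rounding
b = np.array([0.3, 1.0])

assert_array_almost_equal(a, b)  # passes: difference is far below the default tolerance
# assert_array_equal(a, b)       # would raise: the values are not bit-identical
```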
Note that for CV estimators (…)

Interesting, it seems except for …
Do we have good reason to believe there are multiple estimators that should fail such a test in scikit-learn?

In terms of the specifics, equivalence of … You might also consider whether some datasets are more likely to elicit variation than others. For example, I'd consider using a dataset with a random target so that a meaningful classifier is unlikely learnable.
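A sketch of the random-target idea (the sizes and RNG seed are arbitrary, just for illustration): generate the features as usual but draw the labels independently of X, so no classifier can learn a meaningful rule and any prediction differences come from the fitting procedure itself.

```python
import numpy as np
from sklearn.datasets import make_classification

rng = np.random.RandomState(0)
X, _ = make_classification(n_samples=200, n_features=10, random_state=0)
y = rng.randint(0, 2, size=X.shape[0])  # labels drawn independently of X
```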
sklearn/tests/test_common.py (Outdated)

    y_train = np.reshape(y_train, (-1, 1))
    y_test = np.reshape(y_test, (-1, 1))

    needs_state = 'random_state' in signature(Estimator.__init__).parameters
You end up with something like needs_state = 'random_state' in Estimator().get_params(). Is that really nicer than inspecting the __init__ arguments?
there's a helper set_random_state in the estimator_checks.
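For comparison, the three options mentioned in this thread could look roughly like this. Note the import path of set_random_state is an assumption here: it has moved between scikit-learn versions (sklearn.utils.testing in releases of that era, sklearn.utils._testing later).

```python
from inspect import signature

from sklearn.linear_model import SGDClassifier
from sklearn.utils.testing import set_random_state  # sklearn.utils._testing in newer versions

est = SGDClassifier()

# Option 1: inspect the constructor signature (what the diff above does).
needs_state = 'random_state' in signature(SGDClassifier.__init__).parameters

# Option 2: look at the instantiated parameters instead.
needs_state = 'random_state' in est.get_params()

# Option 3: use the existing helper, which sets random_state when the
# estimator supports it.
set_random_state(est, random_state=0)
```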
I should add: without something like …

Given @TomDLT's comment about the *CV estimators not passing on …, I like the idea of the random dataset.
The other thing that comes to mind here is that we may have false negatives.
Starting with the different solvers that may exist in one estimator.
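To make the solver point concrete, a possible (purely illustrative) extension would be to repeat the check once per solver value instead of only testing the defaults, for example with LogisticRegression:

```python
# Illustrative only: run the determinism check separately for each solver.
from numpy.testing import assert_array_almost_equal
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=200, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.5,
                                                    random_state=0)

for solver in ('liblinear', 'lbfgs', 'sag'):
    est1 = LogisticRegression(solver=solver, random_state=0, max_iter=500)
    est2 = LogisticRegression(solver=solver, random_state=0, max_iter=500)
    assert_array_almost_equal(est1.fit(X_train, y_train).predict_proba(X_test),
                              est2.fit(X_train, y_train).predict_proba(X_test))
```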
If we would "just" describe the legal space of all parameters in a more …
IMHO having this test is useful, even if it only covers the default argument case (perfection being the enemy of good or some such).
It's pretty weird that Regression uses stratified CV...? And certainly …

LinearRegressionCV uses stratified? That's a bug.
I think several comments here are red herrings.
Our default …

Do we want to continue on with this? Marking as Stalled.
@jnothman: I am interested in carrying on if this is relevant.
Reference Issue
Fixes #7139
What does this implement/fix? Explain your changes.
Fit two instances of the same estimator on a toy problem. Use both to predict on an unseen subset of data and compare the predictions. If an estimator has a random_state argument, provide it; if not, then not.

Any other comments?
There are a few estimators that I am skipping at the moment because they fail the test. Most are blatantly not deterministic, but two or so fail with a different error.
Unsure about the location: should this be a check in utils/estimator_checks.py instead?

- check_deterministic in utils/estimator_checks.py?
- Why do HuberRegressor, LogisticRegressionCV, LinearRegression, RANSACRegressor fail?
- Why does RadiusNeighborsClassifier fail? Failure mode is different to the above.
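For reference, a minimal sketch of what a check_deterministic in utils/estimator_checks.py might look like, following the approach discussed above; the dataset, skipping strategy, and tolerance are placeholders, not the final implementation:

```python
# Sketch only: one possible shape for check_deterministic in
# utils/estimator_checks.py; names and tolerances are placeholders.
from inspect import signature

from numpy.testing import assert_array_almost_equal
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split


def check_deterministic(name, Estimator):
    X, y = make_classification(n_samples=200, n_features=10, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.5,
                                                        random_state=0)

    # Pass random_state only if the estimator accepts it.
    params = {}
    if 'random_state' in signature(Estimator.__init__).parameters:
        params['random_state'] = 0

    est1 = Estimator(**params).fit(X_train, y_train)
    est2 = Estimator(**params).fit(X_train, y_train)

    # Exact equality would be stricter for classifiers; almost-equal is the
    # looser comparison regressors need.
    assert_array_almost_equal(est1.predict(X_test), est2.predict(X_test))
```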