[MRG+1] Changing default model for IterativeImputer to BayesianRidge#13038
Conversation
|
Please update the documentation. |
|
Whoops, forgot. Thanks. |
jnothman
left a comment
There was a problem hiding this comment.
I assume this does not also affect documentation in the user guide.
We probably should have had a test that would have broken with this change, i.e. checking the class of each predictor. We should make sure there is such a test at least for custom predictors to ensure no regressions....
|
Sorry, I don't follow. The test would ensure what is going on with custom predictors? |
|
I meant like #13039, but also for predictor=None |
|
I think you mean this? |
|
Yeah that sort of thing. Historically we should have tested the effect of
sample_posterior is what I really mean... But now it is more
straightforward.
|
glemaitre
left a comment
There was a problem hiding this comment.
I would refactor the test with the previous one. It would require just a single statement more.
Otherwise LGTM
sklearn/tests/test_impute.py
Outdated
| assert len(set(hashes)) == len(hashes) | ||
|
|
||
|
|
||
| def test_iterative_imputer_bayesianridge_default(): |
There was a problem hiding this comment.
Would it be better to just make this case in the previous test
There was a problem hiding this comment.
I meant:
@pytest.mark.parametrize(
"predictor",
[None, DummyRegressor(), BayesianRidge(), ARDRegression(), RidgeCV()]
)
def test_iterative_imputer_predictors(predictor):
rng = np.random.RandomState(0)
n = 100
d = 10
X = sparse_random_matrix(n, d, density=0.10, random_state=rng).toarray()
imputer = IterativeImputer(missing_values=0,
n_iter=1,
predictor=predictor,
random_state=rng)
imputer.fit_transform(X)
# check that types are correct for predictors
hashes = []
for triplet in imputer.imputation_sequence_:
expected_type = type(predictor) if predictor is not None else type(BayesianRidge())
assert isinstance(triplet.predictor, expected_type)
hashes.append(id(triplet.predictor))
# check that each predictor is unique
assert len(set(hashes)) == len(hashes)|
Merging when CI turn green |
|
Thanks for the change |
As discussed in #13026 It turns out having
RidgeCVas the default is problematic for reproducibility. This default should be gentler.Note, this may not pass tests until a more stable example is merged via #13026.
Paging @jnothman