TST check equivalence normalize/StandardScaler and dense/sparse in linear models #17665
Conversation
…> linear_model(normalize=False) gives the same result as linear_model(normalize=True)
…into test_normalize_as_pipeline
…ormalize=False) results are the same for the sparse and the dense data: need to add other linear models as well
…_true (failing on arrays are not almost equal
… normalize and sparse matrices
glemaitre
left a comment
So we are converging :P
Only some style things. You might want to merge master into your branch to get rid of the CircleCI error.

ping @rth You should know about this issue as well. If you want, have a look and merge the PR once the changes are done.
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
    model_dense.fit(X, y)
    model_sparse.fit(X_sparse, y)

    assert_allclose(model_sparse[1].coef_, model_dense[1].coef_)
Do we want to also check the intercept as mentioned in #3020 (comment)?
Also, could you please merge master to fix the documentation CI?
Co-authored-by: Roman Yurchak <rth.yurchak@gmail.com>
    y_pred_sparse = model_sparse.predict(X_sparse)
    assert_allclose(y_pred_dense, y_pred_sparse)

    assert_allclose(model_dense[1].intercept_, model_sparse[1].intercept_)
At first we were reapplying the offset on the intercept, but it seems that we always do that:
scikit-learn/sklearn/linear_model/_base.py
Lines 243 to 245 in 27cfe14
Then the equivalence between the intercepts should be strict equality. Does that seem correct to you @agramfort?
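The referenced lines in `_base.py` reapply the centering offsets when setting the intercept. A minimal sketch of that idea (variable names are illustrative and mirror what `_set_intercept` does, rather than quoting it):

```python
# Sketch: recover the intercept after fitting on centered data
# (mirrors the idea of _set_intercept; names are illustrative).
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(30, 3)
true_coef = np.array([1.0, -2.0, 0.5])
y = X @ true_coef + 3.0  # true intercept is 3.0

# center the data, as preprocessing with fit_intercept=True would
X_offset = X.mean(axis=0)
y_offset = y.mean()

# least-squares fit on the centered data (no intercept term needed)
coef, *_ = np.linalg.lstsq(X - X_offset, y - y_offset, rcond=None)

# reapply the offsets: intercept_ = y_offset - X_offset @ coef_
intercept = y_offset - X_offset @ coef
print(intercept)  # recovers the true intercept, 3.0
```

Because the offsets are always reapplied this way, the dense and sparse fits should produce identical intercepts, which is what motivates the strict-equality question above.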
glemaitre
left a comment
I think the tests look good (only one change). @agramfort @rth if you can have a look at the intercept and potentially merge the PR, that would be great.
And codecov is reporting bullshit :)
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Thanks @maikia, now you are responsible for all the tests in …
…near models (scikit-learn#17665) Co-authored-by: Alexandre Gramfort <alexandre.gramfort@m4x.org> Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> Co-authored-by: Roman Yurchak <rth.yurchak@gmail.com>
towards: #3020
In linear models such as Lasso there is an option to set normalize=True. However, if fit_intercept is set to False, this has no effect.
Towards deprecating normalize in linear models altogether, we want to give users the option to first normalize with StandardScaler and then fit the linear model. These tests make sure that the two options give the same results and the same .coef_ (even though .intercept_ might differ).
The tests are run for both the sparse and the dense datasets.
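A minimal sketch of the dense/sparse equivalence check discussed in this PR (the estimator, data, and tolerances here are illustrative, not the PR's actual test code):

```python
# Illustrative sketch: the StandardScaler + linear model pipeline should
# give the same coefficients, intercept, and predictions whether it is
# fit on dense or sparse versions of the same data.
import numpy as np
from scipy import sparse
from numpy.testing import assert_allclose
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=60, n_features=5, random_state=0)
X_sparse = sparse.csr_matrix(X)

# with_mean=False: centering would densify a sparse matrix, so the
# scaler only divides by the standard deviation here.
model_dense = make_pipeline(StandardScaler(with_mean=False), Lasso(alpha=0.1))
model_sparse = make_pipeline(StandardScaler(with_mean=False), Lasso(alpha=0.1))

model_dense.fit(X, y)
model_sparse.fit(X_sparse, y)

# coef_ and intercept_ of the second pipeline step should match
assert_allclose(model_sparse[1].coef_, model_dense[1].coef_, rtol=1e-4)
assert_allclose(model_sparse[1].intercept_, model_dense[1].intercept_,
                rtol=1e-4)

y_pred_dense = model_dense.predict(X)
y_pred_sparse = model_sparse.predict(X_sparse)
assert_allclose(y_pred_dense, y_pred_sparse, rtol=1e-4)
```

Using `with_mean=False` is the key design choice: it makes the same pipeline valid for both input types, at the cost of not centering, which is why the intercepts can differ from the old `normalize=True` behaviour.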