[MRG] Fix selection of solver in ridge_regression when solver=='auto' #13363
NicolasHug merged 26 commits into scikit-learn:master
Conversation
@GaelVaroquaux : @btel : I've merged the other PR, but now this one has conflicts (I think it is not a good idea to do one PR on top of another).
Force-pushed 0a2cfd7 to 5a53d57.
@btel : @GaelVaroquaux : indeed, it wasn't a good idea to implement this PR on top of #13336. I forgot that the commits would be squashed at merge time... Lesson learned. I fixed the PR by replaying the changes on top of the current master.

Can you update what's new?
sklearn/linear_model/ridge.py
Outdated
```python
if return_intercept and solver != 'sag':
    warnings.warn("In Ridge, only 'sag' solver can currently fit the "
                  "intercept. Solver has been "
                  "automatically changed into 'sag'.")
```
Is this still true? I thought that ridge now supports fitting the intercept via other solvers.
Only the Ridge.fit method supports fitting the intercept with other solvers (i.e. cholesky/sparse_cg). This was already the case before, and it should be changed when the ridge_regression function is refactored, as discussed in #13336.
Should we not just raise an error in this case?
I think an error is better too.
Doesn't 'saga' also support fitting intercept ?
I agree an error should be raised instead of a warning
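To make the discussion concrete, here is a minimal sketch of the raise-instead-of-warn behaviour proposed above (the function name and message are illustrative, not the actual scikit-learn code):

```python
def resolve_solver(solver, return_intercept):
    # Illustrative sketch: reject an incompatible solver/return_intercept
    # combination with an error instead of silently switching to 'sag'.
    if return_intercept and solver != 'sag':
        raise ValueError(
            "In Ridge, only 'sag' solver can currently fit the intercept "
            "when return_intercept=True. Got solver=%r." % solver)
    return solver
```

With this variant, passing e.g. 'cholesky' together with return_intercept=True fails immediately instead of quietly running a different solver than the one requested.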
sklearn/linear_model/ridge.py
Outdated
```python
def _select_auto_mode():
    if return_intercept:
        # only sag supports fitting intercept directly
        return "sag"
```
This logic duplicates that of line 390.
This line executes when the solver argument is set to 'auto'; line 390 executes when the user sets it to any other value (except 'sag'). We need to distinguish between these two cases, because in one we generate a warning, whereas in the other we don't.
@agramfort I added a new section in the what's new doc.
Force-pushed f17610c to 6ae88b8.
sklearn/linear_model/ridge.py
Outdated
```python
    solver = "cholesky"

if solver not in ('sparse_cg', 'cholesky', 'svd', 'lsqr', 'sag', 'saga'):
    raise ValueError("Known solver are 'sparse_cg', 'cholesky', 'svd'"
```
solver -> solvers
Or "solver must be one of ..."
sklearn/linear_model/ridge.py
Outdated
```python
if return_intercept and solver != 'sag':
    warnings.warn("In Ridge, only 'sag' solver can currently fit the "
                  "intercept. Solver has been "
                  "automatically changed into 'sag'.")
```
Should we not just raise an error in this case?
Thanks @jnothman. I fixed the error message. I agree that it might be better to raise instead of automagically changing the solver, but that was the old behaviour and I didn't want to change it in this PR. However, if there is a consensus to raise, I will update this PR.
@GaelVaroquaux @agramfort could you approve/merge?
@jeremiedbb would you mind reviewing/merging this PR? It's good to merge. I talked to @GaelVaroquaux and he does not have time to have a look at it now.
```python
for solver in ['sparse_cg', 'cholesky', 'svd', 'lsqr', 'saga']:
    with pytest.warns(UserWarning) as record:
        target = ridge_regression(X, y, 1,
```
You can match the warning message directly:

```python
with pytest.warns(UserWarning, match='return_intercept=True is only'):
    ...
```
```python
for r in record:
    r.message.args[0].startswith("return_intercept=True is only")
```
You forgot an assert here. But you wouldn't need these lines if you do as above.
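For reference, the check that pytest.warns(UserWarning, match=...) performs can be reproduced with the stdlib warnings module; a self-contained sketch (not the PR's actual test code):

```python
import re
import warnings

def emit():
    # stand-in for the ridge_regression call that triggers the warning
    warnings.warn("return_intercept=True is only supported by 'sag'",
                  UserWarning)

with warnings.catch_warnings(record=True) as record:
    warnings.simplefilter("always")
    emit()

# equivalent of: pytest.warns(UserWarning, match='return_intercept=True is only')
assert len(record) == 1
assert issubclass(record[0].category, UserWarning)
assert re.search("return_intercept=True is only", str(record[0].message))
```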
sklearn/linear_model/ridge.py
Outdated
```python
if return_intercept:
    # only sag supports fitting intercept directly
    solver = "sag"
elif has_sw:
```
I thought 'saga' also supports fitting the intercept.
I tried fitting the intercept with saga, and I get a warning in test_ridge_fit_intercept_sparse with solver='saga': ConvergenceWarning('The max_iter was reached which means the coef_ did not converge'), so I guess it's not supported.
sklearn/linear_model/ridge.py
Outdated
```python
    # this should be changed since all solvers support sample_weights
    solver = "cholesky"
elif sparse.issparse(X):
    solver = "sparse_cg"
```
I find the sequence of conditions hard to follow. I find the initial pattern easier:

```python
if return_intercept:
    solver = "sag"
elif not sparse.issparse(X) or has_sw:
    solver = "cholesky"
else:
    solver = "sparse_cg"
```

If all solvers support sample weights, I'd be in favor of removing the has_sw condition.
OK, I changed the sequence and removed the has_sw option.
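The simplified 'auto' branch agreed on above boils down to something like this sketch (sparseness is passed as a flag here instead of calling scipy.sparse.issparse, so the snippet stays self-contained):

```python
def select_auto_solver(is_sparse, return_intercept):
    # Sketch of the 'auto' selection after dropping the has_sw case:
    if return_intercept:
        # only 'sag' fits the intercept directly in ridge_regression
        return "sag"
    if not is_sparse:
        return "cholesky"
    return "sparse_cg"
```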
```python
if return_intercept:
    coef, intercept = target
    assert_array_almost_equal(coef, true_coefs, decimal=1)
    assert_array_almost_equal(intercept, 0, decimal=1)
```
Please use assert_allclose instead
sklearn/linear_model/ridge.py
Outdated
```python
if return_intercept and solver != 'sag':
    warnings.warn("In Ridge, only 'sag' solver can currently fit the "
                  "intercept. Solver has been "
                  "automatically changed into 'sag'.")
```
I think an error is better too.
Doesn't 'saga' also support fitting intercept ?
I made a few comments. But I'm sorry, I can't merge it because I'm just a contributor :)
@jeremiedbb I made the changes you suggested. I still kept the warning, because I don't want to break people's code when they used return_intercept=True and set the solver to a value other than 'sag'. I tried fitting the intercept with saga and it did not work (see my comment above).

Sorry, I am a bit confused by the review process. I meant approving, not merging, the PR.
```python
    coef, intercept = target
    assert_allclose(coef, true_coefs, atol=0.1)
    assert_allclose(intercept, 0, atol=0.1)
else:
```
Is an absolute tol of 0.1 necessary?
It needs to be atol when comparing to 0, but 0.1 seems big for checking equality.
It's true, but the differences in the estimations are around 0.02, so I can change to atol=0.03. The true coefs are 1, 2, 0.1, intercept 0:
```
solver auto, return_intercept=True, dense, no sw: (array([0.98738028, 1.97544885, 0.0983598 ]), 0.019808458760372082)
solver auto, return_intercept=False, dense, with sw: [0.99946651 1.98731769 0.10972413]
solver auto, return_intercept=True, dense, with sw: (array([0.98793847, 1.97696595, 0.1002527 ]), 0.01906343205046055)
solver auto, return_intercept=False, sparse, no sw: [0.99836423 1.98816093 0.10940093]
solver auto, return_intercept=True, sparse, no sw: (array([0.98888637, 1.97631928, 0.10069256]), 0.016659883979934714)
solver auto, return_intercept=False, sparse, with sw: [0.99966452 1.98818802 0.10897127]
solver auto, return_intercept=True, sparse, with sw: (array([0.9893669 , 1.97592915, 0.09956311]), 0.017160723846287827)
solver sparse_cg, return_intercept=False, dense, no sw: [0.9991974 1.98769984 0.10987237]
solver sparse_cg, return_intercept=True, dense, no sw: (array([0.98920646, 1.97558791, 0.09788704]), 0.021340292109204684)
solver sparse_cg, return_intercept=False, dense, with sw: [0.99024222 1.99174497 0.11463053]
solver sparse_cg, return_intercept=True, dense, with sw: (array([0.98761024, 1.97533341, 0.09901903]), 0.01854073215870237)
solver sparse_cg, return_intercept=False, sparse, no sw: [0.9994363 1.98744613 0.1091634 ]
solver sparse_cg, return_intercept=True, sparse, no sw: (array([0.9880755 , 1.97727505, 0.09942038]), 0.01744125549431998)
solver sparse_cg, return_intercept=False, sparse, with sw: [0.99054007 1.99162964 0.1142885 ]
solver sparse_cg, return_intercept=True, sparse, with sw: (array([0.98772017, 1.9773033 , 0.10027304]), 0.017187277743091305)
solver cholesky, return_intercept=False, dense, no sw: [0.99913301 1.98694126 0.10987065]
solver cholesky, return_intercept=True, dense, no sw: (array([0.98777785, 1.97725736, 0.09918002]), 0.019112808226411305)
solver cholesky, return_intercept=False, dense, with sw: [0.99911638 1.98733099 0.10993568]
solver cholesky, return_intercept=True, dense, with sw: (array([0.98614477, 1.97491618, 0.09864355]), 0.01985489984301323)
solver cholesky, return_intercept=False, sparse, no sw: [0.99941728 1.98712034 0.1096048 ]
solver cholesky, return_intercept=True, sparse, no sw: (array([0.98829793, 1.97603815, 0.09797809]), 0.019415378825590024)
solver cholesky, return_intercept=False, sparse, with sw: [0.99934764 1.98748946 0.10933262]
solver cholesky, return_intercept=True, sparse, with sw: (array([0.98731514, 1.97603497, 0.09975062]), 0.018854865586171575)
solver lsqr, return_intercept=False, dense, no sw: [0.99951604 1.98764362 0.10910706]
solver lsqr, return_intercept=True, dense, no sw: (array([0.9870815 , 1.97716204, 0.0995867 ]), 0.015630561450455646)
solver lsqr, return_intercept=False, dense, with sw: [0.99976968 1.98730972 0.10931853]
solver lsqr, return_intercept=True, dense, with sw: (array([0.9884556 , 1.97708318, 0.0994495 ]), 0.020868291821179098)
solver lsqr, return_intercept=False, sparse, no sw: [0.99895738 1.98781999 0.1097974 ]
solver lsqr, return_intercept=True, sparse, no sw: (array([0.98793535, 1.97810114, 0.10162843]), 0.01745574357981361)
solver lsqr, return_intercept=False, sparse, with sw: [0.99888735 1.98760483 0.11008256]
solver lsqr, return_intercept=True, sparse, with sw: (array([0.98908756, 1.97644754, 0.09898548]), 0.0179269121290331)
solver sag, return_intercept=False, dense, no sw: [0.99851148 1.98589838 0.10733857]
solver sag, return_intercept=True, dense, no sw: (array([0.98735441, 1.97573369, 0.09955717]), 0.019699491413496736)
solver sag, return_intercept=False, dense, with sw: [1.00098593 1.98703252 0.10881227]
solver sag, return_intercept=True, dense, with sw: (array([0.98860356, 1.97635123, 0.09842113]), 0.01970872473553323)
solver sag, return_intercept=False, sparse, no sw: [0.99926502 1.9880752 0.11024386]
solver sag, return_intercept=True, sparse, no sw: (array([0.98690006, 1.97720689, 0.09983899]), 0.018324191901867053)
solver sag, return_intercept=False, sparse, with sw: [1.00034609 1.98534772 0.10857017]
solver sag, return_intercept=True, sparse, with sw: (array([0.98836367, 1.97905487, 0.09644623]), 0.018058154068734605)
solver saga, return_intercept=False, dense, no sw: [0.999297 1.98728781 0.10987951]
solver saga, return_intercept=True, dense, no sw: (array([0.98715784, 1.97528129, 0.09813245]), 0.02077689395931017)
solver saga, return_intercept=False, dense, with sw: [0.99869864 1.98810477 0.10949984]
solver saga, return_intercept=True, dense, with sw: (array([0.98897455, 1.97488114, 0.09956767]), 0.021863847921240975)
solver saga, return_intercept=False, sparse, no sw: [0.99885491 1.98692853 0.11079414]
solver saga, return_intercept=True, sparse, no sw: (array([0.98841551, 1.97610063, 0.10014244]), 0.01736938519414459)
solver saga, return_intercept=False, sparse, with sw: [0.99622644 1.99183347 0.10405478]
solver saga, return_intercept=True, sparse, with sw: (array([0.98847833, 1.97703492, 0.09749779]), 0.018143914964244622)
```
So I can change to atol=0.03.
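Why the comparison against 0 needs an absolute tolerance can be seen with the stdlib math.isclose, whose rel_tol/abs_tol mirror NumPy's rtol/atol (values chosen to match the ~0.02 differences above):

```python
import math

# A relative tolerance alone can never accept a nonzero value against 0:
# the allowed error scales with the magnitudes of the inputs themselves.
assert not math.isclose(0.018, 0.0, rel_tol=0.1)

# An absolute tolerance compares the raw difference instead.
assert math.isclose(0.018, 0.0, rel_tol=0.0, abs_tol=0.03)
```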
I guess the default tol of 1e-3 might be the reason for this poor comparison. Could you try with a zero tol?
@btel I don't understand why you cannot use a lower tolerance as you have no noise added to data.
What do you mean? The data are randomly generated, so I don't get exactly the coefficients I put in. I can freeze the seed and test against the coefficients that I get after a test run, but I might still get some small differences between the solvers.
What @agramfort meant is that you could pass a smaller tolerance to ridge_regression (as pushed in 604f7d1). Since your data isn't noisy, the solvers should converge just fine.
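The effect of the solver tolerance on how close the estimate gets can be illustrated on a toy 1-D ridge problem solved by gradient descent (pure Python, not the actual sag/saga implementation):

```python
def ridge_1d_gd(x, y, alpha, tol, lr=0.01, max_iter=100000):
    # Minimize sum((y_i - w*x_i)**2) + alpha*w**2 by gradient descent,
    # stopping once the gradient magnitude drops below `tol`.
    w = 0.0
    for _ in range(max_iter):
        grad = sum(-2 * xi * (yi - w * xi)
                   for xi, yi in zip(x, y)) + 2 * alpha * w
        if abs(grad) < tol:
            break
        w -= lr * grad
    return w

x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]          # noiseless data, true slope is 2
alpha = 1e-3
# closed-form ridge solution for the 1-D case
w_exact = sum(xi * yi for xi, yi in zip(x, y)) / (sum(xi * xi for xi in x) + alpha)

w_loose = ridge_1d_gd(x, y, alpha, tol=1e-1)
w_tight = ridge_1d_gd(x, y, alpha, tol=1e-8)
# a tighter stopping tolerance gets at least as close to the exact solution
assert abs(w_tight - w_exact) <= abs(w_loose - w_exact)
```

Since the data is noiseless, shrinking the solver's tol drives the iterate arbitrarily close to the closed-form solution, which is why the test can then afford a much smaller atol.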
@jeremiedbb is it good to approve? Let me know if you have extra comments.
jeremiedbb left a comment:
I just made one last comment. Otherwise LGTM.
```python
    coef, intercept = target
    assert_allclose(coef, true_coefs, atol=0.1)
    assert_allclose(intercept, 0, atol=0.1)
else:
```
I guess the default tol of 1e-3 might be the reason of this poor comparison. Could you try with a zero tol ?
Force-pushed f74d2f9 to 97a5326.
@jeremiedbb @jnothman I changed rtol to 0 and fixed the merge conflict. Should be good to merge now.
Formally, we need another core dev to review.
@agramfort @GaelVaroquaux would you mind approving?
```python
# test excludes 'svd' solver because it raises exception for sparse inputs

X = np.random.rand(1000, 3)
```
Please use a fixed random_state.
doc/whats_new/v0.21.rst
Outdated
```rst
    deterministic when trained in a multi-class setting on several threads.
    :issue:`13422` by :user:`Clément Doumouro <ClemDoum>`.

- |Fix| Fixed bug in :func:`linear_model.ridge.ridge_regression` that
```
This is also a bugfix for RidgeClassifier at least
True, I updated the entry.
sklearn/linear_model/ridge.py
Outdated
```python
if return_intercept and solver != 'sag':
    warnings.warn("In Ridge, only 'sag' solver can currently fit the "
                  "intercept. Solver has been "
                  "automatically changed into 'sag'.")
```
@NicolasHug for the moment only the _BaseRidge estimator implements fit_intercept for sparse_cg. This still does not work for the ridge_regression function, which only supports the intercept in solvers that fit it directly (sag). This will be changed in the future, but requires some refactoring; see comment: #13336 (comment)
OK, I pushed a stricter test. @btel I managed, by reducing the tol and the alpha, to get a much lower atol that passes the tests for me. If CIs are green, it's good to go from my end.
```python
X_testing = arr_type(X)

alpha, atol, tol = 1e-3, 1e-4, 1e-6
target = ridge_regression(X_testing, y, alpha=alpha,
```
Not a big fan of target, which is usually what we use for y. Maybe out (not great either).
```python
X, y = make_regression(n_samples=1000, n_features=2, n_informative=2,
                       bias=10., random_state=42)

for solver in ['sparse_cg', 'cholesky', 'svd', 'lsqr', 'saga']:
```
This could be parametrized, but OK.
sklearn/linear_model/ridge.py
Outdated
```python
if return_intercept and solver != 'sag':
    warnings.warn("In Ridge, only 'sag' solver can currently fit the "
                  "intercept. Solver has been "
                  "automatically changed into 'sag'.")
```
I agree an error should be raised instead of a warning
```python
if return_intercept:
    coef, intercept = target
    assert_allclose(coef, true_coefs, rtol=0, atol=atol)
    assert_allclose(intercept, intercept, rtol=0, atol=atol)
```
Please change the name of the true intercept to true_intercept, because this doesn't check anything.
```python
    coef, intercept = target
    assert_allclose(coef, true_coefs, atol=0.1)
    assert_allclose(intercept, 0, atol=0.1)
else:
```
What @agramfort meant is that you could pass a smaller tolerance to ridge_regression (as pushed in 604f7d1). Since your data isn't noisy, the solvers should converge just fine
(covered by test_ridge_regression_check_arguments_validity)
@NicolasHug I addressed the points that you raised. Thanks for reviewing! I also changed the warning to an exception, since it seems that everyone was in favour of it.
Thanks @btel!
Yay! Thanks @btel.
… classes (scikit-learn#13363)" This reverts commit e1f66b8.
Continues the work started in #13336.
Must be reviewed as an addition to #13336 and merged after it (merged). Fixes #13362.
What does this implement/fix? Explain your changes.
The solver that was selected when the solver argument of ridge_regression was set to 'auto' was ambiguous and sometimes even incorrect (see #13362). This PR tries to make it more explicit.

Any other comments?
Ping: @agramfort