
[MRG+2] Adding Implementation of SAG - next episode#4738

Closed
TomDLT wants to merge 11 commits into scikit-learn:master from TomDLT:sag

Conversation

@TomDLT
Member

@TomDLT TomDLT commented May 19, 2015

I took over the great work of @dsullivan7 in #3814.

I removed the merges with master, squashed all the commits and rebased on master.

@dsullivan7
Contributor

Awesome @TomDLT! There was talk of having this implemented as a solver for LogisticRegression and RidgeRegression, have you looked into that at all?

@amueller
Member

travis is still unhappy ;) Thanks for picking this up!

@TomDLT
Member Author

TomDLT commented May 19, 2015

I reran the classifier benchmark on two large datasets:
RCV1 and Alpha (cf here)
The plot shows the convergence as log10(|loss - loss_optimal|).

Result on Alpha (500,000 × 500, dense): [plot: diffloss]
Result on RCV1 (804,414 × 47,152, sparse): [plot: diffloss]

@agramfort
Member

as discussed I vote for adding 'sag' solver to LogisticRegression and RidgeRegression that would call plain sag_logistic and sag_ridge functions.

@amueller
Member

newton-cg is faster than liblinear? I'm surprised! Anyhow SAG seems to kick ass. I'd be +1 on adding a solver to the classifiers as this seems like a good default.

@amueller
Member

(though I dream of the day when the default LogisticRegression is multinomial, not OvR ;)

@TomDLT
Member Author

TomDLT commented May 19, 2015

newton-cg is faster than liblinear? I'm surprised!

Actually, newton-cg is not faster.
In the previous example, with fit_intercept=True, liblinear and newton-cg do not converge to the same minimum, since liblinear regularizes the intercept, whereas newton-cg and SAG don't.

I tried using the same regularization in SAG, and it converges to the same minimum as liblinear.
However, it makes more sense not to regularize the intercept.

Finally, with fit_intercept=False, we see that liblinear is not slower than newton-cg.
[plot: diffloss_no_intercept]
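The point above can be checked directly: with fit_intercept=False all solvers minimize the same L2-regularized logistic loss, so their coefficients should agree closely. A minimal sketch using today's scikit-learn API on illustrative synthetic data (not the benchmark datasets from this thread):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary problem; with fit_intercept=False, liblinear and
# newton-cg optimize the exact same objective, so the solutions match.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

coefs = {}
for solver in ("liblinear", "newton-cg"):
    clf = LogisticRegression(solver=solver, fit_intercept=False,
                             C=1.0, tol=1e-8, max_iter=1000)
    clf.fit(X, y)
    coefs[solver] = clf.coef_.ravel()

# Maximum coefficient discrepancy between the two solvers (small).
print(np.max(np.abs(coefs["liblinear"] - coefs["newton-cg"])))
```

With fit_intercept=True the two objectives differ by the intercept penalty, and the gap in the plot reappears.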

@amueller
Member

Thanks for the explanation.

@TomDLT
Member Author

TomDLT commented May 20, 2015

I implemented sag_logistic as a solver in LogisticRegression, and changed some of the tests accordingly.

Currently, in order to match the LogisticRegression API, compared to the previous SAGClassifier:

  • eta is forced to 'auto'
  • we lose warm_start
  • we lose parallel processing for multiclass
  • the behavior for multiclass with class weights is changed (it is now the same as in the other LogisticRegression solvers with the 'OvR' strategy)
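From the user's side this boils down to picking the new solver by name. A minimal sketch with the modern scikit-learn API (synthetic data; parameter values are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# solver="sag" selects the stochastic average gradient solver added in
# this PR; the step size eta is chosen automatically ('auto' behavior).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

clf = LogisticRegression(solver="sag", max_iter=1000, random_state=0)
clf.fit(X, y)
print(clf.coef_.shape)  # one coefficient row for the binary problem
```

SAG assumes roughly standardized features for its automatic step size; make_classification already produces them, but real data usually needs a StandardScaler first.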

@amueller
Member

why do we lose warm_start?

@agramfort
Member

there is no reason not to support warm_start

I would put all SAG-related code in one file called sag.py (i.e. no sag_class.py)
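For reference, warm_start did eventually land in LogisticRegression. A minimal sketch of the intended behavior with the modern API (illustrative synthetic data; the short max_iter is just to make the two-stage fit visible):

```python
import warnings
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# warm_start=True makes successive fit() calls start from the previous
# coefficients instead of re-initializing them.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
clf = LogisticRegression(solver="sag", warm_start=True,
                         max_iter=5, random_state=0)

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # few iterations -> ConvergenceWarning
    clf.fit(X, y)                    # first short run
    coef_first = clf.coef_.copy()
    clf.fit(X, y)                    # continues from coef_first
```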

@agramfort
Member

travis is not happy

@TomDLT
Member Author

TomDLT commented May 21, 2015

For warm_start, how should I pass the option without adding parameters to LogisticRegression?

@agramfort
Member

agramfort commented May 21, 2015 via email

@TomDLT TomDLT force-pushed the sag branch 2 times, most recently from 38af560 to 2e9618b Compare May 21, 2015 09:06
@agramfort
Member

ping us when ready to merge.

thx

@amueller
Member

I also thought there was a warm_start for LogisticRegressionCV... hum

@amueller
Member

needs a rebase (probably for whatsnew)

@TomDLT
Member Author

TomDLT commented May 21, 2015

I implemented sag_ridge as a solver in Ridge, and changed some of the tests accordingly.

Currently, in order to match the Ridge API, compared to the previous SAGRegressor:

  • eta is forced to 'auto'
  • we lose random_state and warm_start
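As with the classifier, usage reduces to the solver name. A minimal sketch with the modern scikit-learn API (synthetic data; note that random_state was later added to Ridge precisely because SAG is stochastic):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

# solver="sag" selects the stochastic average gradient solver for the
# L2-penalized least-squares objective.
X, y = make_regression(n_samples=300, n_features=5, noise=0.1,
                       random_state=0)

reg = Ridge(alpha=1.0, solver="sag", max_iter=1000, random_state=0)
reg.fit(X, y)
print(reg.score(X, y))  # R^2 close to 1 on this low-noise problem
```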

@TomDLT TomDLT force-pushed the sag branch 4 times, most recently from b37cfcc to a1adf82 Compare May 21, 2015 16:35
@agramfort
Member

agramfort commented May 21, 2015 via email

@amueller
Member

yeah

@amueller
Member

shouldn't LogisticRegression have a random_state for liblinear? Or is that only for hinge-loss?

@agramfort
Member

agramfort commented May 21, 2015 via email

@TomDLT
Member Author

TomDLT commented May 21, 2015

No, the LogisticRegression class has a random_state, and so does sag_logistic.
It is missing from the Ridge class, so it is currently also missing from sag_ridge.

@TomDLT
Member Author

TomDLT commented Sep 9, 2015

Thanks again for the review!

FYI I am working on a multinomial version of SAG, but it will be in another PR.

@agramfort
Member

agramfort commented Sep 9, 2015 via email

Member

space before "By" ;)

@amueller
Member

amueller commented Sep 9, 2015

@ogrisel we want this in the release, right?

@amueller
Member

amueller commented Sep 9, 2015

Great work everybody :)

@amueller amueller added this to the 0.17 milestone Sep 9, 2015
@ogrisel
Member

ogrisel commented Sep 9, 2015

@ogrisel we want this in the release, right?

I am not opposed to having it in :)

@ogrisel
Member

ogrisel commented Sep 10, 2015

FYI I am working on a multinomial version of SAG, but it will be in another PR.

It would be great to consider adding support for sample_weight to LogisticRegression and LogisticRegressionCV as well while you are at it.
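That support did eventually land: LogisticRegression.fit accepts per-sample weights in modern scikit-learn. A minimal sketch (synthetic data; the weight values are illustrative):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# fit() accepts a sample_weight array; here the second half of the
# samples is up-weighted 5x, so they dominate the fitted solution.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
sample_weight = np.ones(200)
sample_weight[100:] = 5.0

clf = LogisticRegression(solver="sag", max_iter=1000, random_state=0)
clf.fit(X, y, sample_weight=sample_weight)
```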

@ogrisel
Member

ogrisel commented Sep 10, 2015

This PR needs a rebase on top of the current master.

@amueller
Member

I'll rebase, squash and merge in a bit unless anyone complains.

@amueller
Member

Pushed as 94eb619. Thanks for the great work!

@agramfort
Member

agramfort commented Sep 10, 2015 via email

@ogrisel
Member

ogrisel commented Sep 11, 2015

🍻!

@dsullivan7
Contributor

Awesome!!

@TomDLT
Member Author

TomDLT commented Sep 11, 2015

Nice !

@fabianp
Member

fabianp commented Sep 11, 2015

Yeah! @TomDLT deserves extra kudos for patience and perseverance :-)

@TomDLT
Member Author

TomDLT commented Sep 11, 2015

Thanks :)
