[MRG+2] default gamma='auto' in SVC by gxyd · Pull Request #10331 · scikit-learn/scikit-learn

gxyd · 2017-12-16T09:30:22Z

Reference Issues/PRs

What does this implement/fix? Explain your changes.

Deprecates the default SVC gamma parameter value of "auto", which is calculated as 1 / n_features, and introduces "scale", which is calculated as 1 / (n_features * X.std()).

Any other comments?

I couldn't fix all the doctests since I'm not sure how to run doctests using pytest (I asked on gitter but haven't received a response); otherwise, using make test will take more time to see which docs need to be fixed.

Conflicts: sklearn/model_selection/_search.py sklearn/model_selection/tests/test_search.py

gxyd · 2017-12-16T12:18:08Z

Currently it raises an error:

E           AssertionError: Estimator SVC should not change or mutate  the parameter gamma
             from auto_deprecated to auto during fit.
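
The check that fails above expects constructor parameters to be left untouched by fit. A minimal sketch of one way to satisfy it, assuming the resolved value is kept in a separate private attribute (the class TinySVC and the attribute name _gamma are hypothetical and used only for illustration, not taken from this PR):

```python
class TinySVC:
    """Hypothetical stand-in for an estimator; not scikit-learn's SVC."""

    def __init__(self, gamma="auto_deprecated"):
        # The constructor only stores the parameter exactly as given.
        self.gamma = gamma

    def fit(self, X, y=None):
        n_features = len(X[0])
        if self.gamma in ("auto", "auto_deprecated"):
            # Resolve to a number in a separate attribute;
            # self.gamma itself is never reassigned.
            self._gamma = 1.0 / n_features
        else:
            self._gamma = float(self.gamma)
        return self


est = TinySVC().fit([[0.0, 1.0], [2.0, 3.0]])
print(est.gamma, est._gamma)  # auto_deprecated 0.5
```

Because self.gamma still holds the sentinel after fit, a check that compares parameters before and after fitting would see no change.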

gxyd · 2017-12-16T12:54:14Z

I can fix the issue I mentioned above (that is not a problem). I have one query: should we use gamma='auto' or gamma='scale' in the docs? Also, a lot of examples in the documentation don't actually set random_state (one of the points Raghav raised here #8535 (comment)). Do I need to set random_state, at least for the examples I touch in the documentation?

gxyd · 2017-12-18T05:05:53Z

@amueller can you take a look at the PR? I also have a few questions, which I've mentioned above.

gxyd · 2017-12-18T07:00:14Z

Locally tests pass. I am guessing that travis uses scipy version 0.13.3, which doesn't have the power method for lil_matrix, hence the current error on travis. Travis is based on Ubuntu 14.04 (which has scipy 0.13.3 upstream) and that is causing the problem.

gxyd · 2017-12-18T12:30:25Z

sklearn/svm/base.py

-        if self.gamma == 'auto':
+        if self.gamma == 'scale':
+            if isinstance(X, sp.spmatrix):
+                X_std = np.sqrt(X.power(2).mean() - (X.mean())**2)


I've been able to figure out how to do this without the use of X.power.

I could simply use X.toarray()**2, but I don't think that is efficient.

Maybe X.multiply(X)?

Why can there be a lil matrix here? We called check_X_y above with accept_sparse='csr'.
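
The X.multiply(X) suggestion in this thread can be sketched as follows. This is only an illustration of the idea, not the code that was merged; the helper name sparse_std is hypothetical. It computes the standard deviation over all entries (zeros included) as sqrt(E[x^2] - E[x]^2) without densifying the matrix:

```python
import numpy as np
import scipy.sparse as sp


def sparse_std(X):
    """Population std over all entries of a sparse matrix, zeros included."""
    mean = X.mean()
    # Element-wise square via multiply(); the result stays sparse.
    mean_of_squares = X.multiply(X).mean()
    return np.sqrt(mean_of_squares - mean ** 2)


X_dense = np.array([[0.0, 2.0, 0.0], [1.0, 0.0, 3.0]])
X_csr = sp.csr_matrix(X_dense)

# Both values should agree up to floating-point rounding.
print(sparse_std(X_csr), X_dense.std())
```

Note that the E[x^2] - E[x]^2 form can lose precision when the mean is large relative to the spread, so it is a sketch rather than a numerically robust recipe.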

amueller

Can we maybe not deprecate auto and just change the default? not sure, though....

amueller · 2017-12-18T16:22:58Z

sklearn/svm/base.py

-        if self.gamma == 'auto':
+        if self.gamma == 'scale':
+            if isinstance(X, sp.spmatrix):
+                X_std = np.sqrt(X.power(2).mean() - (X.mean())**2)


Why can there be a lil matrix here? We called check_X_y above with accept_sparse='csr'.

gxyd · 2017-12-18T18:41:44Z

@amueller

Can we maybe not deprecate auto and just change the default? not sure, though....

I think that is the intention here, that is, to replace the default value, not to actually remove 'auto' from the possible gamma values.
Does the usage of gamma='auto_deprecated' sound misleading here? Would it be better to simply use auto_default? Though @jnothman suggested that here #8535 (comment). Also, I think the warning message should be made clearer. WDYT?

amueller · 2017-12-18T20:06:35Z

sklearn/svm/classes.py

-        If gamma is 'auto' then 1/n_features will be used instead.
+        If gamma is 'auto' then 1/n_features will be used.
+        If gamma is 'scale' then 1/(n_features * X.std()) will be used.
+        The current default 'auto' is deprecated in version 0.20 and will


I guess the formulation here was what led me to think that 'auto' is deprecated, i.e. will be removed. I would rather say that "The default value will change to 'scale' in 0.22". I'm not sure about a good way to say when this decision was made. I think we had "This warning was introduced in version X" somewhere.
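
A sketch of how the suggested wording might look as a warning that fires only when the user kept the old default. The helper name resolve_default_gamma and the exact message text are assumptions for illustration; they are not taken from the PR:

```python
import warnings


def resolve_default_gamma(gamma, n_features):
    """Hypothetical helper: warn only when gamma was left at its default."""
    if gamma == "auto_deprecated":
        warnings.warn(
            "The default value of gamma will change from 'auto' to 'scale' "
            "in version 0.22. Set gamma explicitly to 'auto' or 'scale' to "
            "avoid this warning.",
            FutureWarning,
        )
        gamma = "auto"
    if gamma == "auto":
        return 1.0 / n_features
    return float(gamma)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    resolve_default_gamma("auto", n_features=4)             # explicit: silent
    resolve_default_gamma("auto_deprecated", n_features=4)  # default: warns

print(len(caught), caught[0].category.__name__)  # 1 FutureWarning
```

Passing gamma='auto' explicitly stays silent, which matches the point that 'auto' itself is not being removed; only the default changes.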

jnothman · 2017-12-18T22:18:15Z

It's tempting to just change the meaning of auto, isn't it? But as it affects the model, we probably shouldn't...

gxyd · 2017-12-19T16:49:22Z

I can't say if it is tempting, since I don't know the reason for making this change. I mean, why is gamma=1 / (n_features * X.std()) a more apt choice? Can you please refer me to material explaining why it performs better than gamma=1 / n_features? And if it does lead to an improvement in the model, then why not? Also, reading the discussion on the issue, @amueller (at that time) seemed quite convinced of the benefits of this change in the gamma value.

amueller · 2017-12-19T19:31:55Z

There's probably no literature on why this would be good. But the 1/n_features heuristic only makes sense when X is scaled. If you scaled X, there will be no change. If you didn't scale X, it'll "fix" that somewhat. Clearly, gamma is inversely related to scaling of the data; if you scale X by 10 you need to scale gamma by 1/10.
Why not change it? Because that'll change the behavior of existing code and people will suddenly get different results.
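
To make the comment above concrete, here is a small sketch comparing the two heuristics using the formulas stated in this PR's description. The function names and the synthetic data are assumptions for illustration:

```python
import numpy as np


def gamma_auto(X):
    # Old default: 1 / n_features
    return 1.0 / X.shape[1]


def gamma_scale(X):
    # Formula as proposed in this PR: 1 / (n_features * X.std())
    return 1.0 / (X.shape[1] * X.std())


rng = np.random.RandomState(0)
X_raw = rng.normal(loc=5.0, scale=10.0, size=(200, 4))
# Standardize each column to zero mean and unit variance.
X_scaled = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)

print(gamma_auto(X_raw), gamma_scale(X_raw))        # differ on unscaled data
print(gamma_auto(X_scaled), gamma_scale(X_scaled))  # coincide after scaling
print(gamma_scale(10 * X_raw) * 10, gamma_scale(X_raw))  # same value
```

On standardized data X.std() is 1, so both heuristics give 1 / n_features; on unscaled data the 'scale' value shrinks as the spread of X grows.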

gxyd · 2017-12-19T19:41:16Z

I had a look over "A Practical Guide to Support Vector Classification" (by Chih-Wei Hsu, Chih-Chung Chang, and Chih-Jen Lin), and it says that

Scaling before applying SVM is very important. Part 2 of Sarle’s Neural Networks FAQ Sarle (1997) explains the importance of this and most of considerations also apply to SVM.

So considering that, it would imply that in most of the cases scaling is mostly needed.

Why not change it? Because that'll change the behavior of existing code and people will suddenly get different results.

Yes, I totally understand this part. Changing things is one of the things that upsets users.

Though I'll take your final word, if you still consider that the change is worth it.

amueller · 2017-12-19T19:45:35Z

So considering that, it would imply that in most of the cases scaling is mostly needed.

Yes, but people don't do it. Basically the new default helps people who forget to scale their data.

gxyd · 2017-12-19T19:50:10Z

That is right. Should I close the PR then? Maybe the corresponding issue as well? (I can't close the issue, since I didn't open it.)

amueller · 2017-12-19T20:22:52Z

What? No. Please go ahead with this. I would do the full deprecation cycle as you did.

amueller · 2017-12-19T20:23:16Z

I would just clarify the wording on what exactly changes.

gxyd · 2017-12-20T08:52:21Z

I don't understand what the problem with the CircleCI tests is. Can you help me out?

gxyd · 2017-12-20T10:45:36Z

I don't think the error is related to the PR. It raises

CondaValueError: invalid package specification: python=

So possibly something wrong with the build configuration itself.

gxyd · 2017-12-20T19:43:52Z

@amueller any suggestion for the failing tests?

amueller · 2017-12-20T20:01:13Z

Ignore it ;)

neokt and others added 3 commits March 4, 2017 17:59

gamma=auto in SVC scikit-learn#8361

0551ddd

Merge branch 'master' into svc_gamma

3a4a275

Conflicts: sklearn/model_selection/_search.py sklearn/model_selection/tests/test_search.py

fix docs and add gamma='auto_deprecated' as default

913ac83

gxyd force-pushed the svc_gamma branch from 1c9f918 to 913ac83 Compare December 16, 2017 09:44

fix

730776e

revert change to change self.gamma

814c4e6

fix docs

da2bcc5

gxyd added 2 commits December 18, 2017 22:55

use X.multiply(X) for csr_matrix

61bd7c1

fix pep8

5125779

use 'auto_default' instead of 'auto_deprecated'

ea19b2d

ivannz added a commit to ivannz/scikit-learn that referenced this pull request Jul 25, 2019

FIX: Updated the default gamma to reflect scikit-learn#10331 and test…

641189b

…s, fixed the docstring parameter order

iasoon mentioned this pull request Apr 2, 2022

TST remove tests for default change warnings in test_svm.py #23030

Merged

7 participants
