Skip to content

[MRG+1] Fix broken L1 deprecations and svm scale C example.#4261

Merged
ogrisel merged 6 commits intoscikit-learn:masterfrom
raghavrv:svm_scale_c_plot
Feb 25, 2015
Merged

[MRG+1] Fix broken L1 deprecations and svm scale C example.#4261
ogrisel merged 6 commits intoscikit-learn:masterfrom
raghavrv:svm_scale_c_plot

Conversation

@raghavrv
Copy link
Copy Markdown
Member

fixes #4260

@ogrisel Please take a look...

  • Fix Broken plot_svm_scale_c.py example #4260
  • loss, penalty --> penalty, loss in error message
  • Add tests to check for error message when invalid loss and penalty are passed.
  • Deprecate and support L1 ( upper case )

@GaelVaroquaux
Copy link
Copy Markdown
Member

Though we explicitly specify that losses must be 'l1' and not 'L1', should we
support 'L1' ?

In my experience, it is a bad idea to introduce too much flexibility:
think of someone storing results in a dictionnary, where 'penalty' is a
key of the dictionnary. If we are flexible, that person will just hit a
bug in the dictionnary part, and not in our code.

@raghavrv
Copy link
Copy Markdown
Member Author

Ah makes sense! thanks for clarifying :)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be squared_hinge, instead of ssquared_hinge, no?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! I did this to test the first commit and forgot to change it back ;)

@ogrisel
Copy link
Copy Markdown
Member

ogrisel commented Feb 17, 2015

This example use to work in scikit-learn 0.15.2: http://scikit-learn.org/stable/auto_examples/svm/plot_svm_scale_c.html

This means that this change introduced without preserving backward compatibility. We need to add support for the old notation while raising informative deprecation warning messages such as:

loss='L2' has been deprecated and renamed to loss='squared_hingeloss' in 0.16. Support for the old parameter will be dropped in 0.18

@ogrisel
Copy link
Copy Markdown
Member

ogrisel commented Feb 17, 2015

Furthermore this API change should be documented in whats new:

http://scikit-learn.org/dev/whats_new.html#api-changes-summary

@raghavrv
Copy link
Copy Markdown
Member Author

'l2' is still supported... and does display that deprecation warning!! ( the problem was with the 'L2' ( upper case ) ).
I had change it to 'squared_hinge' assuming examples don't need deprecation?

Also see #3933 ...

@ogrisel
Copy link
Copy Markdown
Member

ogrisel commented Feb 17, 2015

This change of behavior might have been introduced by the integration of the new LinearSVR class that uses some common liblinear backend (this has to be checked).

If that the case we need to check that the LogisticRegression class is not similarly impacted. In that case we will to setup backward compat with deprecation warning there as well (only for the penalty as the loss is fixed for that model).

@ogrisel
Copy link
Copy Markdown
Member

ogrisel commented Feb 17, 2015

'l2' is still supported... and does display that deprecation warning!! ( the problem was with the 'L2' ( upper case ) ).

L2 with uppercase needs to be supported (with deprecation warning) as it was part of the public API in the last stable release 0.15.2.

@raghavrv
Copy link
Copy Markdown
Member Author

L2 with uppercase needs to be supported

Oh! okay! will find out what caused the change and deprecate it properly :)

@ogrisel
Copy link
Copy Markdown
Member

ogrisel commented Feb 17, 2015

By reading the diff of https://github.com/scikit-learn/scikit-learn/pull/3933/files , the code used to be case insensitive, which is no longer the case. +1 for deprecating explicitly the upper case notation for the l1 and l2 penalties and losses.

@amueller
Copy link
Copy Markdown
Member

LGTM apart from @ogrisel comment on tighter testing and the typo that gives the travis error ;)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also add backward compat for penalty in ('L1', 'L2') (uppercase) that maps to the l1 or l2 (lowercase) with a similar deprecation warning.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And add deprecation warning tests for those. Please git grep "assert_warns_message(DeprecationWarning" to find examples on how to write such tests.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is done in this line...

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

woops I clicked on the edit button instead of the reply button and deleted part of your reply by mistake... Sorry for that.

Anyway I wanted to also let you know that you can drop the nested parens in the above call to warnings.warn.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! thats good to know! thanks... I keep doing that a lot! ;)

@ogrisel ogrisel changed the title [MRG] Svm scale c plot [MRG] Broken svm scale C example Feb 18, 2015
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to provide explicit and informative warning and error messages that recall the values of both the observed and expected values. In this case:

if loss in ('L1', 'L2'):
    warnings.warn("loss='%s' has been deprecated in favor of loss='%s' as of 0.16. "
                  "Backward compatibility for the uppercase notation will be removed"
                  " in 0.18." % (loss, loss.lower()))

and a similar message for penalty.

We should not assume that the user has the source code at hand with the precise line where this class is instantiated when executing a program that causes this message to be raised. Putting explicit parameter values in the message is likely to help him / her find the source of the problem much more quickly.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done ;)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done ;)

This is not addressed in the last commit currently visible on github (635b933). Travis is still broken at this point.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah :/ Sorry!! I'll fix this soon... ( I must have forgotten to push I guess )

@amueller amueller added this to the 0.16 milestone Feb 23, 2015
@raghavrv raghavrv force-pushed the svm_scale_c_plot branch 2 times, most recently from 01143b7 to a878fac Compare February 24, 2015 19:49
@raghavrv
Copy link
Copy Markdown
Member Author

@ogrisel could you please take a look at this now? Have added the tests and done deprecations as follows -

  • loss 'L1' / 'l1' deprecated with warning to 'hinge' for LinearSVC or 'epsilon_insensitive' for LinearSVR inside their respective __init__s. ( like wise for 'L2'/'l2' )
  • Uppercase notations for loss ( when loss is not 'l1'/'l2' ) and penalty are deprecated inside _fit_liblinear and warning will be raised when they are fit ( to support LinearSVC / LinearSVR / LogisticRegression and LogisticRegressionCV )

@amueller
Copy link
Copy Markdown
Member

You can't deprecate inside an __init__ as the attribute might be set using grid search. So you need to deprecate in fit.

@amueller
Copy link
Copy Markdown
Member

Tests still failing btw.

@raghavrv raghavrv force-pushed the svm_scale_c_plot branch 2 times, most recently from 8e055a8 to 802e5b4 Compare February 25, 2015 10:59
@raghavrv
Copy link
Copy Markdown
Member Author

done! please take a look now :)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps the old one was more readable?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean the formatting of the nested dict? I agree I preferred the old formatting.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea :) thanks will revert that!

@ogrisel
Copy link
Copy Markdown
Member

ogrisel commented Feb 25, 2015

Apart from this last cosmetic comment, +1 on my side.

@raghavrv raghavrv changed the title [MRG] Broken svm scale C example [MRG+1] Broken svm scale C example Feb 25, 2015
@ogrisel
Copy link
Copy Markdown
Member

ogrisel commented Feb 25, 2015

Ok merging this!

ogrisel added a commit that referenced this pull request Feb 25, 2015
[MRG+1] Broken svm scale C example
@ogrisel ogrisel merged commit a926626 into scikit-learn:master Feb 25, 2015
@ogrisel
Copy link
Copy Markdown
Member

ogrisel commented Feb 25, 2015

Thanks again @ragv!

@raghavrv raghavrv deleted the svm_scale_c_plot branch March 18, 2015 22:57
@raghavrv raghavrv changed the title [MRG+1] Broken svm scale C example [MRG+1] Fix broken L1 deprecations and svm scale C example. Mar 23, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Broken plot_svm_scale_c.py example

4 participants