Skip to content

[MRG+1] ENH add a parameter to SVC and NuSVC to break ties in predict if desired#12557

Merged
NicolasHug merged 35 commits intoscikit-learn:masterfrom
adrinjalali:svc/tie-breaking
May 3, 2019
Merged

[MRG+1] ENH add a parameter to SVC and NuSVC to break ties in predict if desired#12557
NicolasHug merged 35 commits intoscikit-learn:masterfrom
adrinjalali:svc/tie-breaking

Conversation

@adrinjalali
Copy link
Copy Markdown
Member

Fixes #8277

This is a proposal to fix the issue in SVC and NuSVC where the predict does not break ties but decision_function does, when decision_function_shape='ovr' and n_classes > 2

The proposal is to add a parameter to BaseSVC, and return argmax of decision_function instead of calling libsvm's predict, which breaks ties according to the confidence values.

The default value of the parameter is False to keep it backward compatible, and if needed, in the future, switch the default value to True in an appropriate deprecation cycle.

@agramfort
Copy link
Copy Markdown
Member

@adrinjalali travis is not happy.

Copy link
Copy Markdown
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @adrinjalali

@adrinjalali
Copy link
Copy Markdown
Member Author

I'm not sure what whats_new tags to use here though.

@jnothman
Copy link
Copy Markdown
Member

I'm not sure what whats_new tags to use here though.

It's an enhancement, IMO.

Copy link
Copy Markdown
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpicks

@adrinjalali adrinjalali changed the title add a parameter to SVC and NuSVC to break ties in predict if desired [MRG+1] add a parameter to SVC and NuSVC to break ties in predict if desired Nov 28, 2018
@adrinjalali adrinjalali changed the title [MRG+1] add a parameter to SVC and NuSVC to break ties in predict if desired [MRG+1] ENH add a parameter to SVC and NuSVC to break ties in predict if desired Nov 29, 2018
@adrinjalali adrinjalali added this to the 0.21 milestone Dec 19, 2018
@jnothman
Copy link
Copy Markdown
Member

Tests failing?

Copy link
Copy Markdown
Member

@rth rth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good apart for one comment below.

Copy link
Copy Markdown
Member

@GaelVaroquaux GaelVaroquaux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, aside from my two comments. The naming is a nitpick which can be ignored, but I feel that the random_state is important.

If true, `decision_function_shape`='ovr', and number of classes > 2,
`predict` will break ties according to the confidence values of
`decision_function`. Please note that breaking ties comes
at a relatively high computational cost compared to a simple predict.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the practical utility from this explanation may not be clear to everybody

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, documentation of the default=False case is missing

Deprecated *decision_function_shape='ovo' and None*.

break_ties : bool, optional (default=False)
If true, `decision_function_shape`='ovr', and number of classes > 2,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the practical utility from this explanation may not be clear to everybody

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any suggestions?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can put in a very common use case that a wide spectrum of data analysts are likely to have encountered in some shape or form.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

those details are suitable for .rst documentations, not the docstrings.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I must confess that I am unsure what exactly the benefits of turning this on is. Let me explain what I think I have understood that the benefit is, and maybe it can help the docstring:
"Using confidence to break ties can lead to a better prediction for this specific instances".

Is that correct? If so, I would suggest that this sentence is added before "Please note that..."

@jnothman
Copy link
Copy Markdown
Member

I can't see the difference between those images.

@azrdev
Copy link
Copy Markdown

azrdev commented Apr 15, 2019

I can't see the difference between those images.

@jnothman It's the area "inside" the intersection of the three borders, at around (1.5, 8) which has a concave extension of the top-left green class, iff tie breaking is used

@adrinjalali
Copy link
Copy Markdown
Member Author

@jnothman I increased the plot size a bit, I hope that helps.

@NicolasHug
Copy link
Copy Markdown
Member

I find it hard to see the difference between the two plots (took me more than a minute to see that the regions weren't the same at the center).

The description should explain what differences to look for IMO.

@adrinjalali
Copy link
Copy Markdown
Member Author

adrinjalali commented Apr 22, 2019

@thomasjpfan I'm not sure why the two azure pipelines are failing, the failing test is not touched by this PR (and I can't figure what the error is anyway).

EDIT: is/was fixed on master

@adrinjalali
Copy link
Copy Markdown
Member Author

@NicolasHug does this look good now?

Copy link
Copy Markdown
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, a few comments still.

Not sure if we want to raise an error if break_ties is True and decision_function_shape is not 'ovr'. I'd say yes but apparently the default is subject to change?

@adrinjalali
Copy link
Copy Markdown
Member Author

Not sure if we want to raise an error if break_ties is True and decision_function_shape is not 'ovr'. I'd say yes but apparently the default is subject to change?

I'm not sure. I'd leave that to another discussion maybe.

@NicolasHug
Copy link
Copy Markdown
Member

I'm not sure. I'd leave that to another discussion maybe.

If we merge this before 0.21, it will be too late. I think we should raise an error, we've been very conservative so far on what users can do.

@adrinjalali
Copy link
Copy Markdown
Member Author

@NicolasHug done :)

Copy link
Copy Markdown
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last comment, LGTM otherwise!

@NicolasHug NicolasHug merged commit 384c8ad into scikit-learn:master May 3, 2019
@NicolasHug
Copy link
Copy Markdown
Member

Thanks Adrin!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SVC.decision_function disagrees with predict

8 participants