FIX Infer pos_label automatically in plot_roc_curve by qinhanmin2014 · Pull Request #15316 · scikit-learn/scikit-learn

qinhanmin2014 · 2019-10-21T14:59:44Z

Fixes #15303
I think we should drop the pos_label parameter from plot_roc_curve, because roc_auc_score has no pos_label parameter. In roc_auc_score, for binary y_true, y_score is assumed to be the score of the class with the greater label.

jnothman · 2019-10-22T13:54:10Z

> I think we should drop parameter pos_label in plot_roc_curve, because we do not have parameter pos_label in roc_auc_score.

The reason we lack the parameter in roc_auc_score is that the score is symmetric with respect to the choice of the positive label: roc_auc_score(y_true, score) == roc_auc_score(y_true, -score) for binary y_true. But this same operation transposes the plotted curve, so for plotting, the choice of pos_label is significant.

qinhanmin2014 · 2019-10-23T10:38:43Z

> The reason we lack the parameter in roc_auc_score is that the score is symmetric with respect to the choice of the positive label. roc_auc_score(y_true, score) == roc_auc_score(y_true, -score) for binary y_true

Sorry, but I can't understand this part. Shouldn't it be roc_auc_score(y_true, score) == 1 - roc_auc_score(y_true, -score)?
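The relation asked about here can be checked on a toy example. The following is a minimal pure-Python sketch, not scikit-learn's implementation: `pairwise_auc` is a hypothetical helper that computes AUC as the fraction of (positive, negative) pairs ranked correctly, and it assumes labels in {0, 1} with 1 as the positive class.

```python
from itertools import product


def pairwise_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Hypothetical helper for illustration; ties count as half a win.
    Assumes binary labels where 1 is the positive class.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
negated = [-s for s in scores]

auc = pairwise_auc(y_true, scores)
auc_negated = pairwise_auc(y_true, negated)
print(auc, auc_negated, auc + auc_negated)
```

Negating the scores reverses every pairwise comparison, so each correctly ranked pair becomes an incorrectly ranked one and the two values sum to 1.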

qinhanmin2014 · 2019-10-23T10:45:49Z

> But this same operation transposes the plotted curve, so for plotting the choice of pos_label is significant.

Well, we can keep the pos_label parameter and define pos_label=None as the greater label (though this will be inconsistent with roc_curve), but I can't understand why the choice of pos_label is significant here. Can the roc_auc_score of a model be less than 0.5?
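To illustrate why the positive class matters for the plotted curve, here is a small pure-Python sketch. It is not scikit-learn's `roc_curve`: `roc_points` and `trapezoid_area` are hypothetical helpers that sweep a threshold over the scores and integrate the resulting points. The assumption is that the scores rank class 1 higher, while the curve is drawn once for each choice of positive class.

```python
def roc_points(y_true, scores, pos_label):
    """Return (fpr, tpr) points, sweeping the threshold from high to low."""
    n_pos = sum(1 for y in y_true if y == pos_label)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        predicted_pos = [y for y, s in zip(y_true, scores) if s >= threshold]
        tp = sum(1 for y in predicted_pos if y == pos_label)
        fp = len(predicted_pos) - tp
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoid_area(points):
    """Area under a piecewise-linear curve given as (x, y) points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]  # read as "score for class 1"

curve_pos_1 = roc_points(y_true, scores, pos_label=1)
curve_pos_0 = roc_points(y_true, scores, pos_label=0)  # same scores, other class

print(curve_pos_1, trapezoid_area(curve_pos_1))
print(curve_pos_0, trapezoid_area(curve_pos_0))
```

With the same scores, switching the positive class swaps true positives and false positives at every threshold, so the second curve is the first with its axes exchanged. In this example the area drops from 0.75 to 0.25, which is how an area below 0.5 can arise when the scores belong to the other class.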

jnothman · 2019-10-23T12:48:07Z

Yes, you're right... now trying to remember which similar lemma is the reason we don't need pos_label in roc_auc_score. I know I've justified it to myself before! But you can certainly choose either class as the positive one, and give the corresponding ranking and get the same score: roc_auc_score(y_true == 1, score) == roc_auc_score(y_true == 0, -score)
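The identity at the end of this comment can be sketched in the same spirit. This is an illustration under assumptions, not scikit-learn code: `pairwise_auc` is a hypothetical helper that takes an explicit `pos_label`, and the labels `"a"` and `"b"` are made up for the example.

```python
def pairwise_auc(labels, scores, pos_label):
    """Fraction of (positive, negative) pairs where the positive scores higher.

    Hypothetical helper for illustration; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == pos_label]
    neg = [s for y, s in zip(labels, scores) if y != pos_label]
    correct = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return correct / (len(pos) * len(neg))


labels = ["a", "b", "b", "a", "b"]
score_for_b = [0.2, 0.9, 0.4, 0.5, 0.7]
score_for_a = [-s for s in score_for_b]  # the corresponding ranking for "a"

auc_b_positive = pairwise_auc(labels, score_for_b, pos_label="b")
auc_a_positive = pairwise_auc(labels, score_for_a, pos_label="a")
print(auc_b_positive, auc_a_positive)
```

Swapping the positive class and negating the scores reverses each pair twice, so the count of correctly ranked pairs is unchanged and both calls give the same value.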

qinhanmin2014 · 2019-10-23T13:26:27Z

> Yes, you're right... now trying to remember which similar lemma is the reason we don't need pos_label in roc_auc_score. I know I've justified it to myself before!

Yes, the decision was (mainly) made by you (#2723 (comment), #2723 (comment)) and the PR was submitted by me. Honestly, I'm still unable to understand the reason. (Please kindly forgive me.)

Perhaps it's better to drop the pos_label parameter here, because in roc_auc_score we regard the class with the greater label as the positive class. If we want to support pos_label here, perhaps we need to support it in roc_auc_score at the same time.

jnothman · 2019-10-23T13:57:36Z

Firstly can we agree that plot_roc_curve is like roc_curve, *not* like roc_auc_score? So support for pos_label should reflect roc_curve. But yes, while I don't think it makes sense for the user to configure the pos_label for roc_auc_score where the scores are coming from a scikit-learn binary estimator, I suppose there's a little contradiction to then support it in roc_curve... Tired, can't think further about it.

qinhanmin2014 · 2019-10-23T14:07:16Z

> Firstly can we agree that plot_roc_curve is like roc_curve, not like roc_auc_score? So support for pos_label should reflect roc_curve.

If we stay consistent with roc_curve (as we currently do), it seems we won't be able to solve Andy's issue, because for roc_curve, pos_label=None (the default) is the same as pos_label=1.
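The two default conventions being compared can be put side by side in a short sketch. Both functions are hypothetical and only mirror the rules as described in this thread (None meaning 1 versus None meaning the greater label); neither is scikit-learn's code.

```python
def default_like_roc_curve(pos_label=None):
    """Rule described above for roc_curve: None is treated as 1."""
    return 1 if pos_label is None else pos_label


def default_greater_label(labels, pos_label=None):
    """Alternative rule: None means the greater of the observed labels."""
    return sorted(set(labels))[-1] if pos_label is None else pos_label


labels_01 = [0, 1, 1, 0]
labels_12 = [1, 2, 2, 1]

# With {0, 1} labels the two conventions agree.
print(default_like_roc_curve(), default_greater_label(labels_01))

# With {1, 2} labels they disagree: one picks 1, the other picks 2.
print(default_like_roc_curve(), default_greater_label(labels_12))
```

The {1, 2} case is where the conventions diverge, since 1 is then the smaller label rather than the greater one.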

> while I don't think it makes sense for the user to configure the pos_label for roc_auc_score where the scores are coming from a scikit-learn binary estimator,

Sorry, I can't understand this part. average_precision_score also takes scores from a binary estimator, but we support pos_label there?

qinhanmin2014 · 2019-10-30T05:03:06Z

ping @thomasjpfan ready for a review

ogrisel · 2019-10-30T14:42:33Z

Sorry for the late reaction, but I have a ~~concurrent suggestion~~ complementary PR in #15405.

ogrisel

@qinhanmin2014 I pushed some improvements. Otherwise LGTM.

ogrisel · 2019-11-06T16:09:17Z

Based on the discussion in #15405 (comment), maybe we should actually remove pos_label from plot_roc_curve and let it always use estimator.classes_[1] internally.

WDYT @jnothman @qinhanmin2014 @thomasjpfan @amueller?

amueller · 2019-11-06T20:25:12Z

See #15405 (comment)

Unless we want to change the behavior to also slice predict_proba, we should remove it.
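What "slicing predict_proba" would involve can be sketched with plain lists. This is an illustration under assumptions, not the code from this PR: `classes` stands in for `estimator.classes_`, `proba` stands in for the output of `predict_proba` (whose columns follow the order of `classes_`), and `positive_class_scores` is a hypothetical helper.

```python
def positive_class_scores(proba, classes, pos_label=None):
    """Pick the probability column that matches the positive class.

    With pos_label=None this always uses column 1, i.e. classes[1].
    Supporting an explicit pos_label means looking up its column instead.
    """
    if len(classes) != 2:
        raise ValueError("expected a binary problem with exactly two classes")
    if pos_label is None:
        column = 1
    else:
        column = classes.index(pos_label)  # raises ValueError if unknown
    return [row[column] for row in proba]


classes = ["ham", "spam"]  # stand-in for estimator.classes_
proba = [[0.9, 0.1], [0.3, 0.7], [0.6, 0.4]]  # stand-in for predict_proba(X)

print(positive_class_scores(proba, classes))
print(positive_class_scores(proba, classes, pos_label="ham"))
```

If the scores always come from column 1 while pos_label can name either class, the label and the scores can disagree; the options are to select the column from pos_label as above, or to drop the parameter and always use `classes_[1]`.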

FIX Remove pos_label parameter in plot_roc_curve

9a30d3e

qinhanmin2014 added the Blocker label Oct 21, 2019

qinhanmin2014 added this to the 0.22 milestone Oct 21, 2019

typo

3aa4fd8

thomasjpfan self-requested a review October 21, 2019 15:05

matplotlib

b1b6256

qinhanmin2014 added 5 commits October 30, 2019 09:28

Merge remote-tracking branch 'upstream/master' into plot_roc_auc

cc0a8e6

new solution

94e8e52

new solution

27a3d4d

finish new solution

675b7ea

another solution

b4d9ccb

qinhanmin2014 changed the title ~~FIX Remove pos_label parameter in plot_roc_curve~~ FIX Infer pos_label automatically in plot_roc_curve Oct 30, 2019

qinhanmin2014 added 2 commits October 30, 2019 13:01

typo

2fa0cbc

typo

1a83d36

qinhanmin2014 mentioned this pull request Oct 30, 2019

plot_roc_curve doesn't correctly infer pos_label #15303

Closed

ogrisel mentioned this pull request Nov 6, 2019

Improve error message with invalid pos_label in plot_roc_curve and implicit pos_label in precision_recall_curve #15405

Closed

ogrisel added 5 commits November 6, 2019 14:46

Improve docstring and exception error messages.

ed0bef1

flake8: Fix invalid escape sequence in test

85aaaea

Fix missing words in comment

cc48c3e

Merge remote-tracking branch 'origin/master' into plot_roc_auc

ed019c5

Add test with {1, 2} integer labels

93e1bb3

ogrisel approved these changes Nov 6, 2019


Remove redundant assert

6c0a497

thomasjpfan mentioned this pull request Nov 6, 2019

[MRG] BUG Remove pos_label in plot_roc_auc_curve #15555

Merged

qinhanmin2014 closed this Nov 7, 2019

qinhanmin2014 deleted the plot_roc_auc branch November 7, 2019 12:28


qinhanmin2014 wants to merge 16 commits into scikit-learn:master from qinhanmin2014:plot_roc_auc


4 participants
