[MRG+1] Completely support binary y_true in roc_auc_score #9828
TomDLT merged 7 commits into scikit-learn:master from qinhanmin2014:my-feature-3
Conversation
sklearn/metrics/tests/test_common.py (outdated)

@@ -595,7 +595,8 @@ def test_invariance_string_vs_numbers_labels():
    for name, metric in THRESHOLDED_METRICS.items():
        if name in ("log_loss", "hinge_loss", "unnormalized_log_loss",
What are we excluding by this? What happens when we remove the condition?
@jnothman Thanks for the review.

The test will fail because some metrics still only support {0, 1} y_true or {-1, 1} y_true. After this PR, the following THRESHOLDED_METRICS are still excluded by the if statement:

```
# average_precision_score
average_precision_score
macro_average_precision_score
micro_average_precision_score
samples_average_precision_score
weighted_average_precision_score
# Multilabel ranking metrics
coverage_error
label_ranking_average_precision_score
label_ranking_loss
```
I can't check now, but is that set complementary in some way, like supporting pos_label? I'd rather the condition be a blacklist (suggesting "not yet implemented" or "not applicable") than a whitelist, which would seem to defy the purpose of common tests.
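To make the distinction concrete, here is a hypothetical, self-contained sketch of the blacklist pattern (not the actual test code; the names mirror those in the thread):

```python
# Hypothetical sketch: a blacklist skips only metrics with a known
# limitation, so any newly added metric is covered by the common test
# by default; a whitelist would silently exclude new metrics instead.
THRESHOLDED_METRICS = {"roc_auc_score": None,
                       "average_precision_score": None,
                       "coverage_error": None}
# "not yet implemented" / "not applicable" for arbitrary binary labels
METRIC_UNDEFINED_BINARY = {"average_precision_score", "coverage_error"}

for name, metric in THRESHOLDED_METRICS.items():
    if name in METRIC_UNDEFINED_BINARY:
        continue  # documented exception rather than an implicit whitelist
    print("common test would run for", name)
```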
Would it be a good idea to merge the PRs so we can see the state of affairs more completely?

Should we have a regression test to show that cross_val_score works regardless of which label is positive? (A sketch follows below.)

And I'm not sure I get why average precision would be undefined-binary, but I'm still not in a position to look at the code.
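A hedged sketch of such a regression test (hypothetical code, not from the actual test suite; it assumes a scikit-learn version that includes this PR):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.RandomState(0)
X = rng.randn(100, 10)
y = rng.randint(2, size=100)
classes = np.array(['good', 'not-good'])

# Swap which samples carry the 'not-good' label and recompute CV AUC.
scores_a = cross_val_score(LogisticRegression(), X, classes[y],
                           scoring='roc_auc', cv=3)
scores_b = cross_val_score(LogisticRegression(), X, classes[1 - y],
                           scoring='roc_auc', cv=3)

# Refitting on flipped labels negates the decision function, and ROC AUC
# is invariant under that relabelling, so the two runs should agree
# (up to solver tolerance).
np.testing.assert_allclose(scores_a, scores_b, rtol=1e-3)
```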
@jnothman Thanks for your instant reply.

From my perspective, #9786 actually solves a different problem (improving the stability of roc_auc_score) and is almost finished. There's hardly any direct relationship between the two PRs, so it might be better not to combine #9786 and this PR unless you insist.

Sorry, but I don't quite understand the necessity of such a test. If roc_auc_score works appropriately, then it seems that the scorer based on roc_auc_score, as well as cross_val_score, should also work appropriately. I can't find a similar test currently; if you can point out something similar, I will be able to further understand the problem.

According to the doc and a glance at the source code, it seems we should also move average_precision_score out of METRIC_UNDEFINED_BINARY. I'll take care of it after #9786. I have wrapped up our discussions about the common test, along with some opinions of my own, in #9829.
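To illustrate the point about scorers, a minimal hypothetical check (the data and estimator are made up for illustration): the 'roc_auc' scorer is a thin wrapper around roc_auc_score, so the two should agree whenever the metric itself handles the labels.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import get_scorer, roc_auc_score

rng = np.random.RandomState(0)
X = rng.randn(50, 5)
y = np.array(['good', 'not-good'])[rng.randint(2, size=50)]
est = LogisticRegression().fit(X, y)

# The scorer computes the same quantity as calling the metric directly
# on the estimator's decision function.
scorer = get_scorer('roc_auc')
print(scorer(est, X, y))
print(roc_auc_score(y, est.decision_function(X)))  # should match
```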
Sounds good.
@jnothman Now we can get rid of the awkward list. Is it OK with you? Thanks.
That looks much better!
@jnothman Could you please give me some suggestions on how to make lgtm run? Thanks :)
This is indeed much better than #9567, #6874, #2616. The basic use case seems to work:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

np.random.seed(0)
n_samples, n_features = 100, 10
est = LogisticRegression()
X = np.random.randn(n_samples, n_features)
y = np.random.randint(2, size=n_samples)
classes = np.array(['good', 'not-good'])
for y_true in (classes[y], classes[1 - y]):
    est.fit(X, y_true)
    y_score = est.decision_function(X)
    print(roc_auc_score(y_true, y_score))
# 0.678090575275
# 0.678090575275
```

LGTM
Reference Issue
Fixes #2723, proposed by @jnothman
Also see the discussions in #9805, #9567, #6874, #6873, #2616
What does this implement/fix? Explain your changes.
Currently, roc_auc_score only supports binary y_true encoded as {0, 1} or {-1, 1}.
This PR completely supports binary y_true. The basic idea is that, for binary y_true, y_score is taken to be the score of the class with the greater label.
The common test is used as the regression test.
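As an illustration of this convention, a minimal hypothetical example (made-up values, assuming a version that includes this change):

```python
from sklearn.metrics import roc_auc_score

# With binary string labels, y_score is interpreted as the score of the
# class with the greater label: 'b' > 'a', so 'b' is the positive class.
y_true = ['a', 'b', 'a', 'b']
y_score = [0.1, 0.9, 0.2, 0.8]
print(roc_auc_score(y_true, y_score))  # 1.0: every 'b' outscores every 'a'
```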
Any other comments?
cc @jnothman