[MRG + 1] brier_score_loss returns incorrect value when all y_true values are True/1 by gnsiva · Pull Request #9301 · scikit-learn/scikit-learn

gnsiva · 2017-07-08T11:41:47Z

This is in reference to this bug report.

Some of the added tests would not pass using the current version of brier_score_loss. I have added a potential fix for the issue. The solution isn't too pretty, but the only alternative I can see is to change the label_binarizer function in a way that is probably undesirable.

amueller · 2018-05-23T19:14:59Z

looks good. Can you please add an entry to the changelog in whatsnew?

gnsiva · 2018-05-24T15:19:06Z

Sure, thanks for reviewing it @amueller

amueller · 2018-05-24T15:21:10Z

doc/whats_new/v0.20.rst

  :issue:`9515` by :user:`Alan Liddell <aliddell>` and
  :user:`Manh Dao <manhdao>`.

+- Fixed a bug in:  :func:`metrics.brier_score_loss`, when all `y_true` values are 1,


single backticks don't do anything. please use double backticks! (they are fine for issue and user below, but not on their own)

oops, will fix now!

amueller · 2018-05-24T15:21:24Z

sklearn/metrics/classification.py

    pos_label : int or str, default=None
        Label of the positive class. If None, the maximum label is used as
-        positive class
+        positive class. If all values are 0/False, then 1 is used as pos_label.


maybe say "0 or False"

amueller · 2018-05-24T15:22:47Z

@jnothman do you think this is an ok fix? We probably need a general test, right?
@gnsiva could you see if you can generalize the test to apply to all metrics with a pos_label parameter, or possibly to all metrics? There are tests in metrics/test_common.py.

gnsiva · 2018-05-24T15:27:48Z

Sounds good, I'll look into it

jnothman · 2018-05-28T11:03:21Z

sklearn/metrics/tests/test_common.py

+
+
+@ignore_warnings
+def test_all_true_pos_label():


Thanks for this.

Should we be explicitly passing pos_label=1?

And I'm not sure about what we're saying about metrics where the second arg is a score or probability here...

Changed it to explicitly pass that.
I thought the test would be like the equivalent of check_classifiers_one_label but for those metrics.

jnothman · 2018-05-28T11:03:59Z

doc/whats_new/v0.20.rst

  :issue:`9515` by :user:`Alan Liddell <aliddell>` and
  :user:`Manh Dao <manhdao>`.

+- Fixed a bug in:  :func:`metrics.brier_score_loss`, when all ``y_true`` values are 1,


drop the first :. Change "1," to "1." and start a new sentence.

gnsiva · 2018-06-07T18:45:13Z

@jnothman thanks for taking a look before. I've made the changes suggested, please let me know if there is anything else you would like me to do.

jnothman

Otherwise LGTM

jnothman · 2018-06-09T10:57:22Z

sklearn/metrics/tests/test_common.py

+    all_ones = np.array([1, 1, 1])
+
+    for name in METRICS_WITH_POS_LABEL:
+        if not name == 'roc_curve':


This should be !=

jnothman · 2018-06-09T11:00:15Z

sklearn/metrics/tests/test_common.py

+    for name in METRICS_WITH_POS_LABEL:
+        if not name == 'roc_curve':
+            metric = ALL_METRICS[name]
+            perfect_score = metric(examples, examples, pos_label=1)


metric(examples, examples) is not necessarily perfect prediction score if the second argument is meant to be a score rather than a label. You've already seen that in ROC AUC. Is there some better way to characterise metrics for which 0 score does not necessarily mean "predicting the negative class", so that we can give the excepted metrics some name (like how THRESHOLDED_METRICS is used elsewhere in this file)?

Having a tough time thinking of a snappy name, but would something like METRICS_TARGET_VS_PREDICTION_WITH_POS_LABEL work? It would then encompass all the functions in METRICS_WITH_POS_LABEL except for roc_curve.

POSITIVE_SCORE_MEANS_POSITIVE_CLASS?

gnsiva · 2018-06-24T15:26:04Z

@jnothman thanks have made that change

jnothman

Now I'm just trying to work out how to express it in the negative. Perhaps just MAY_NOT_BE

gnsiva · 2018-06-25T07:46:55Z

To have a separate list containing only "roc_curve"?
Maybe it could be METRICS_COMPARE_TO_SCORE and METRICS_COMPARE_TO_LABEL?
Or perhaps POS_LABEL_METRICS_COMPARE_TO_SCORE POS_LABEL_METRICS_COMPARE_TO_LABEL

jnothman · 2018-06-25T08:03:50Z

I don't intuitively understand what those names mean. could you please describe them?

gnsiva · 2018-06-25T10:59:33Z

Yes sure, so the second argument of roc_curve is y_score, scores for how well the classifier performed (or probabilities etc.).
Whereas for all the other metrics in METRICS_WITH_POS_LABEL, the second argument is y_pred, as in the predicted class by the classifier.

So in the case of COMPARE_TO_SCORE I meant it compares the actual values to a score, and for COMPARE_TO_LABEL the metrics compare the actual label to the predicted class label

jnothman · 2018-06-25T11:12:50Z

Except that brier score takes a score?

gnsiva · 2018-06-25T11:16:16Z

oops, good point

amueller · 2018-07-20T18:35:32Z

what's the status of this?

qinhanmin2014 · 2018-08-18T03:06:05Z

I think the main issue here is the definition of pos_label=None (See #10010).
Also, seems that there's a bug here, e.g., brier_score_loss([-1, 0, -1], [0, -1, -1], pos_label=None) will be interpreted as brier_score_loss([-1, 0, -1], [0, -1, -1], pos_label=1)
I don't think we need so many tests for such a minor fix and I doubt whether the new common test is necessary.
Seems that the problem is not so serious, so my suggestion is to leave it to 0.21. I think we need to have a consistent definition of pos_label=None within the repo (or simply deprecate it).

gnsiva added 2 commits July 8, 2017 12:33

Added more tests for brier_score_loss

1f227e9

Potential fix for brier_score_loss bug

5c47b1d

gnsiva mentioned this pull request Jul 8, 2017

brier_score_loss returns incorrect value when all y_true values are True/1 #9300

Closed

gnsiva added 3 commits July 8, 2017 12:58

Made fix work for categorical data as well

88a94a1

Removed accidental duplication of test_brier_score_loss

5f15bc3

Fixed flake8 problems

e3294ec

qinhanmin2014 mentioned this pull request Aug 30, 2017

[MRG] Improvement and bug fix for brier_score_loss #9562

Closed

amueller changed the title ~~brier_score_loss returns incorrect value when all y_true values are True/1~~ [MRG + 1] brier_score_loss returns incorrect value when all y_true values are True/1 May 23, 2018

amueller approved these changes May 23, 2018

View reviewed changes

gnsiva added 3 commits May 24, 2018 13:32

Merge remote-tracking branch 'official/master' into brier_score_loss_bug

d681a02

Updated changelog

3c955c8

Merge remote-tracking branch 'official/master' into brier_score_loss_bug

1294fab

amueller reviewed May 24, 2018

View reviewed changes

gnsiva and others added 5 commits May 24, 2018 16:35

Fixed formatting comments

5347cbf

Merge remote-tracking branch 'official/master' into brier_score_loss_bug

18adb43

Fixed line width error

ff264f7

Added common test for all true where there is a pos_label arg

a4893a1

Merge remote-tracking branch 'official/master' into brier_score_loss_bug

38831ac

jnothman reviewed May 28, 2018

View reviewed changes

gnsiva added 2 commits June 6, 2018 16:01

Addressed comments

314437e

Merge remote-tracking branch 'official/master' into brier_score_loss_bug

0c8b50a

jnothman reviewed Jun 9, 2018

View reviewed changes

jnothman added this to the 0.20 milestone Jun 12, 2018

jnothman mentioned this pull request Jun 12, 2018

brier_score_loss error #11245

Closed

gnsiva added 2 commits June 24, 2018 09:06

Made specific list of metrics for common pos_label test

24cefdc

Merge remote-tracking branch 'official/master' into brier_score_loss_bug

f250968

jnothman reviewed Jun 24, 2018

View reviewed changes

qinhanmin2014 modified the milestones: 0.20, 0.21 Aug 18, 2018

jnothman removed this from the 0.21 milestone Apr 10, 2019

qinhanmin2014 mentioned this pull request Apr 12, 2019

[MRG] FIX Correct brier_score_loss when there's only one class in y_true #13628

Merged

glemaitre closed this in #13628 Apr 26, 2019



		@ignore_warnings
		def test_all_true_pos_label():

Uh oh!

Conversation

gnsiva commented Jul 8, 2017 • edited by jnothman Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

amueller commented May 23, 2018

Uh oh!

gnsiva commented May 24, 2018

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

amueller commented May 24, 2018

Uh oh!

gnsiva commented May 24, 2018

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

gnsiva commented Jun 7, 2018

Uh oh!

jnothman left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

gnsiva commented Jun 24, 2018

Uh oh!

jnothman left a comment

Choose a reason for hiding this comment

Uh oh!

gnsiva commented Jun 25, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jnothman commented Jun 25, 2018 via email

Uh oh!

gnsiva commented Jun 25, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jnothman commented Jun 25, 2018 via email

Uh oh!

gnsiva commented Jun 25, 2018

Uh oh!

amueller commented Jul 20, 2018

Uh oh!

qinhanmin2014 commented Aug 18, 2018

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

gnsiva commented Jul 8, 2017 •

edited by jnothman

Loading

gnsiva commented Jun 25, 2018 •

edited

Loading

gnsiva commented Jun 25, 2018 •

edited

Loading