[MRG+2] ENH&BUG Add pos_label parameter and fix a bug in average_precision_score #9980
amueller merged 11 commits into scikit-learn:master from qinhanmin2014:average_precision_pos_label
Conversation
ping @jnothman Could you please take some time to review it or at least judge whether this is the right way to go? Thanks a lot :)
I've been taking a little break from review to work on other things, while still doing triage. This PR is not lost, merely waiting.
ping @jnothman for a review. Thanks :)
sklearn/metrics/ranking.py (Outdated)

    y_type = type_of_target(y_true)
    if y_type == "binary":
        _partial_binary_uninterpolated_average_precision = partial(
This is used once. It can have a short, uninformative name...
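A hedged sketch of what this suggests, as I read it: bind the once-used partial to a short local name ("ap" is my placeholder; the other identifiers come from the diff). This is a fragment of the function body, not a standalone example:

    # Bind pos_label once, under a short name, and pass the callable on:
    ap = partial(_binary_uninterpolated_average_precision,
                 pos_label=pos_label)
    return _average_binary_score(ap, y_true, y_score, average,
                                 sample_weight=sample_weight)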
sklearn/metrics/ranking.py (Outdated)

    Calculate metrics for each instance, and find their average.

    pos_label : int or str (default=1)
        The label of the positive class. For multilabel-indicator y_true,
Perhaps clarify that this is only applied to binary classification. We do not, for instance, binarize a multiclass problem using pos_label...
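For concreteness, a hedged usage sketch of what the docstring could spell out: pos_label selects the positive class for binary y_true only (the labels and scores below are illustrative):

    from sklearn.metrics import average_precision_score

    y_true = ["spam", "ham", "spam", "spam"]
    y_score = [0.9, 0.8, 0.4, 0.7]

    # Binary y_true: pos_label names the positive class explicitly.
    ap = average_precision_score(y_true, y_score, pos_label="spam")

    # Multilabel-indicator y_true keeps pos_label at its default of 1;
    # any other value raises a ValueError (see the change discussed below).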
sklearn/metrics/ranking.py (Outdated)

        return _average_binary_score(
            _partial_binary_uninterpolated_average_precision, y_true,
            y_score, average, sample_weight=sample_weight)
    else:
Are we sure at this stage that the y_type is not multiclass?
sklearn/metrics/ranking.py (Outdated)

        raise ValueError("Parameter pos_label is fixed to 1 for "
                         "multilabel-indicator y_true. Do not set "
                         "pos_label or set pos_label to 1.")
    return _average_binary_score(
Why don't you outdent this and use it in all cases?
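Putting the review threads together, a self-contained sketch of the outdented shape: validate once, then a single _average_binary_score call for every y_type. The import path of the private _average_binary_score helper and the short name "ap" are my assumptions; the rest follows the diff:

    from functools import partial

    import numpy as np

    from sklearn.metrics import precision_recall_curve
    # Private helper; the import path below is an assumption.
    from sklearn.metrics.base import _average_binary_score
    from sklearn.utils.multiclass import type_of_target

    def average_precision_score_sketch(y_true, y_score, average="macro",
                                       pos_label=1, sample_weight=None):
        def _binary_uninterpolated_average_precision(y_true, y_score,
                                                     pos_label=1,
                                                     sample_weight=None):
            precision, recall, _ = precision_recall_curve(
                y_true, y_score, pos_label=pos_label,
                sample_weight=sample_weight)
            # Uninterpolated AP: sum of (recall step) * precision.
            return -np.sum(np.diff(recall) * np.array(precision)[:-1])

        y_type = type_of_target(y_true)
        if y_type == "multilabel-indicator" and pos_label != 1:
            raise ValueError("Parameter pos_label is fixed to 1 for "
                             "multilabel-indicator y_true. Do not set "
                             "pos_label or set pos_label to 1.")
        ap = partial(_binary_uninterpolated_average_precision,
                     pos_label=pos_label)
        return _average_binary_score(ap, y_true, y_score, average,
                                     sample_weight=sample_weight)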
@jnothman Thanks a lot for the review :) Comments addressed. I also simplified the code.

At this point, no. But we do the check once we finish the preparation and enter _average_binary_score.
    ----------
    y_true : array, shape = [n_samples] or [n_samples, n_classes]
    -   True binary labels (either {0, 1} or {-1, 1}).
    +   True binary labels or binary label indicators.
label indicators -> multilabel indicators ??
@jnothman Thanks for the review :) I updated what's new (not sure whether we need two entries).

I'm not creating a new term, just reverting the previous change (#9557), since we now extend the function. We are using the term "label indicators".
Sounds alright. I'm just trying to help users new to the terminology. They are label indicators, used for multilabel representation. Some of the terminology is confused for historical reasons: we used to have an alternative multilabel format.
GaelVaroquaux left a comment:

This looks good aside from the cosmetic comment.
sklearn/metrics/ranking.py (Outdated)

    import numpy as np
    from scipy.sparse import csr_matrix
    from scipy.stats import rankdata
    from functools import partial
Cosmetic: you should move this import to the top of the imports, as it is an import from the standard library.
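That is, something like the following ordering (standard library first, then third-party):

    from functools import partial

    import numpy as np
    from scipy.sparse import csr_matrix
    from scipy.stats import rankdata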
I am addressing the cosmetic comment myself and merging.
GaelVaroquaux left a comment:

LGTM. Merging when CI is green.
Thanks a lot @GaelVaroquaux :)
Reference Issues/PRs
Part of #9829.
What does this implement/fix? Explain your changes.
(1) Add a pos_label parameter to average_precision_score. (Although we eventually decided not to introduce pos_label in roc_auc_score, I think we need pos_label here: there is no relationship between the results when the true labels are reversed, and precision/recall/F-score all support pos_label.)
(2) Fix a bug where average_precision_score sometimes returns nan when sample_weight contains 0. I do this here because of (3). A minimal check is sketched after this list.
(3) Move the average_precision scores out of METRIC_UNDEFINED_BINARY (this should contain the regression tests for (1) and (2)).
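A minimal regression-style check for (2), with illustrative data: zero-weight samples should neither produce nan nor change the score relative to simply dropping them:

    import numpy as np
    from sklearn.metrics import average_precision_score

    y_true = np.array([0, 1, 1, 0, 1])
    y_score = np.array([0.1, 0.8, 0.6, 0.4, 0.9])
    sw = np.array([1.0, 1.0, 0.0, 1.0, 1.0])  # one sample weighted zero

    ap = average_precision_score(y_true, y_score, sample_weight=sw)
    assert not np.isnan(ap)

    # A zero-weight sample should behave as if it were absent:
    mask = sw > 0
    assert np.isclose(ap, average_precision_score(y_true[mask],
                                                  y_score[mask]))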
Some comments:
(1) For the underlying function (precision_recall_curve), the default value of pos_label is None, but I chose to set the default of pos_label to 1 because this is what P/R/F does; see the snippet after these comments. What's more, the meaning of pos_label=None is not clear even within scikit-learn itself (see #10010).
(2) I slightly modified the common test. Currently, the part I modified only applies to brier_score_loss (I'm doing the same thing in #9562). I think this is right because, as a common test, it seems wrong to force metrics to accept str y_true without pos_label.
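To illustrate comment (1), a small sketch of the chosen default (illustrative data):

    from sklearn.metrics import average_precision_score, precision_recall_curve

    y_true = [0, 1, 1, 0]
    y_score = [0.2, 0.9, 0.7, 0.4]

    # The default pos_label=1 matches passing it explicitly, as in P/R/F:
    assert (average_precision_score(y_true, y_score) ==
            average_precision_score(y_true, y_score, pos_label=1))

    # The underlying curve defaults to pos_label=None instead, which it
    # resolves to the greater label for {0, 1} or {-1, 1} inputs:
    precision, recall, thresholds = precision_recall_curve(y_true, y_score)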
Any other comments?
cc @jnothman Could you please take some time to review or at least judge whether this is the right way to go? Thanks a lot :)