[MRG] Add calibration loss metric in classification#12479
aishgrt1 wants to merge 14 commits into scikit-learn:master from
Conversation
sklearn/metrics/classification.py
Outdated
    y_prob : array, shape (n_samples,)
        Probabilities of the positive class.
    bin_size : int
        Size of the bin (samples) analysed in one iteration
In my opinion it is better to say: "Size of the bin (samples) analysed in each iteration".
sklearn/metrics/classification.py
Outdated
    """
    pos_loss = 0.0
    neg_loss = 0.0
    for bin_start in range(0, len(y_true) - bin_size + 1):
Why not just use range(len(y_true) - bin_size + 1)?
    pos_loss /= (len(y_true) - bin_size + 1)
    neg_loss /= (len(y_true) - bin_size + 1)
    loss = (0.5) * (pos_loss + neg_loss)
IMO it would be better to save len(y_true) in a variable, to avoid calling len() several times.
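Pulling these review comments together, a minimal sketch of the sliding-window loss under discussion might look like this. The function name and the exact per-window error are assumptions inferred from the diff fragments, not the final implementation:

```python
import numpy as np

def calibration_loss(y_true, y_prob, bin_size=2):
    # Sketch only: names and error definition inferred from the PR diff.
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    n_samples = len(y_true)          # cached once, as the reviewer suggests
    n_bins = n_samples - bin_size + 1
    pos_loss = 0.0
    neg_loss = 0.0
    for bin_start in range(n_bins):  # no explicit 0 start, as suggested
        bin_end = bin_start + bin_size
        # observed positive/negative frequency in the window
        actual_pos = y_true[bin_start:bin_end].sum() / bin_size
        actual_neg = (bin_size - y_true[bin_start:bin_end].sum()) / bin_size
        # mean predicted probability for each class in the window
        pred_pos = y_prob[bin_start:bin_end].mean()
        pred_neg = (1.0 - y_prob[bin_start:bin_end]).mean()
        pos_loss += abs(pred_pos - actual_pos)
        neg_loss += abs(pred_neg - actual_neg)
    pos_loss /= n_bins
    neg_loss /= n_bins
    return 0.5 * (pos_loss + neg_loss)
```

A perfectly calibrated set of hard predictions gives a loss of 0 under this sketch.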
                            - actual_per_pos_class).sum()
    pos_loss += bin_error_pos
    actual_per_neg_class = (bin_size - y_true[bin_start:bin_end]
                            .sum()) / bin_size
If bin_size is negative this will raise an exception. Maybe add a check before.
I am getting this error:
AssertionError: Docstring Error: sklearn.metrics.classification.calibration_loss arg mismatch:
sklearn/metrics/classification.py
Outdated
        Probabilities of the positive class.
    bin_size : int
        Size of the bin (samples) analysed in each iteration
    Returns
Try to leave a newline below Returns, and make the --- underline match the length of "Returns".
Suggested change:

    Returns
    -------
sklearn/metrics/classification.py
Outdated
    -------
    score : float
        Calibration loss
    Examples
The same here.
Suggested change:

    Examples
    --------
sklearn/metrics/classification.py
Outdated
    ----------
    y_true : array, shape (n_samples,)
        True targets.
    y_prob : array, shape (n_samples,)
Suggested change (whitespace-only fix):

    y_prob : array, shape (n_samples,)
sklearn/metrics/classification.py
Outdated
        True targets.
    y_prob : array, shape (n_samples,)
        Probabilities of the positive class.
    bin_size : int
Suggested change (whitespace-only fix):

    bin_size : int
If you apply my comments, this will solve the Travis error.
@eamanu Done!!
eamanu
left a comment
LGTM. Please edit the PR title to add the [MRG] tag.
Do I need to add my name to the contributors list?
The changelog entry is sufficient, but it should be in v0.21.rst, not v0.20.rst |
jnothman
left a comment
I'm not doing a full review yet. This needs to be in doc/modules/{model_evaluation,classes}.rst
this needs to add
There's an interesting discussion of debiasing the calibration error in https://arxiv.org/pdf/1909.10155.pdf. That's a current NeurIPS paper, but the method they discuss is already established, so it might be a good candidate. cc @thomasjpfan, who has shown interest.
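For comparison, the quantity that paper debiases is the usual fixed-width-bin expected calibration error. A plain (biased) sketch follows; the debiasing variance correction from the paper is deliberately omitted, and the function name is illustrative:

```python
import numpy as np

def expected_calibration_error(y_true, y_prob, n_bins=10):
    # Fixed-width bins over [0, 1]; each bin contributes its sample
    # fraction times |observed frequency - mean predicted probability|.
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for i, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
        if i == n_bins - 1:
            mask = (y_prob >= lo) & (y_prob <= hi)  # last bin is inclusive
        else:
            mask = (y_prob >= lo) & (y_prob < hi)
        if mask.any():
            ece += mask.mean() * abs(y_true[mask].mean() - y_prob[mask].mean())
    return ece
```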
Reference Issues/PRs
#10971
What does this implement/fix? Explain your changes.
Added calibration loss metric for classification
Any other comments?