[MRG+1] Isotonic calibration #1176
examples/plot_calibration.py
You should use sklearn.cross_validation.train_test_split to turn this block into two one-liners:
http://scikit-learn.org/dev/modules/generated/sklearn.cross_validation.train_test_split.html
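(Note: the link above points at the old sklearn.cross_validation module; in current scikit-learn the same helper lives in sklearn.model_selection. A minimal sketch of the suggested one-liner split, with purely illustrative toy data:)

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 100 samples, 4 features (shapes are illustrative only)
X = np.arange(400, dtype=float).reshape(100, 4)
y = np.arange(100) % 2

# The block under review collapses to a single call: hold out 25% of the
# samples (here they would serve as the calibration/evaluation set)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)
```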
"Brier score" seems to be the accepted name, so I'm inclined to keep it that way. IIRC, in "Transforming Classifier Scores into Accurate Multiclass Probability Estimates" (KDD 2002), they study several methods and conclude that one-vs-rest is the most practical solution.
hum. I am sure it will pass the consistency brigade :)
OK, so IsotonicCalibrator should use OneVsRestClassifier internally and
I guess
About the Brier score not being a sklearn-consistent score (higher == worse in this case): I don't really know what would be best, as the sklearn naming convention indeed conflicts with the official name: http://en.wikipedia.org/wiki/Brier_score I would be +0 for keeping the
sklearn/metrics/metrics.py
Indeed, in their experimental results they use OvR. I think it's a good idea to fit an IR (isotonic regression) to each decision score.
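The one-vs-rest idea discussed here can be sketched as follows: fit one IsotonicRegression per class on that class's scores against binary (class vs. rest) targets, then renormalize the calibrated columns row-wise. The function names are illustrative, not the PR's actual API:

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def ovr_isotonic_calibrate(scores, y, n_classes):
    """Fit one isotonic regressor per class, one-vs-rest style."""
    calibrators = []
    for k in range(n_classes):
        ir = IsotonicRegression(out_of_bounds="clip")
        # Binary target: 1 for "class k", 0 for "rest"
        ir.fit(scores[:, k], (y == k).astype(float))
        calibrators.append(ir)
    return calibrators

def ovr_predict_proba(calibrators, scores):
    """Calibrate each score column, then renormalize rows to sum to one."""
    probs = np.column_stack([ir.predict(scores[:, k])
                             for k, ir in enumerate(calibrators)])
    # Assumes at least one nonzero calibrated probability per row
    return probs / probs.sum(axis=1, keepdims=True)
```

A real implementation would need to guard against rows where every per-class calibrated probability is zero before renormalizing.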
Can we call it brier? Or brier_error? As it's a mean squared error.
One solution would be to introduce decorators to label some functions as scores and others as losses. Of course, this is out of the scope of this PR... (This idea would also be useful to define whether a metric accepts predicted labels or predicted scores, c.f. the AUC issue.) Another solution is to introduce a function So adding a note to the documentation as @ogrisel suggested seems like a good temporary solution to me.
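For reference, the Brier score under discussion is just the mean squared error between predicted probabilities and the binary outcomes, which is why lower is better (it is a loss despite the "score" name). A numpy-only sketch:

```python
import numpy as np

def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes.

    Lower is better: 0.0 for perfect probabilities, 1.0 at worst.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    return np.mean((y_prob - y_true) ** 2)

# A confident, correct forecaster scores better (lower) than a hedging one
brier_score([1, 0], [0.9, 0.1])  # 0.01
brier_score([1, 0], [0.6, 0.4])  # 0.16
```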
Some quick comments ...
Hi Paolo, thanks for this valuable feedback. I won't work on this in the next few
+1
If you use more than one fold, how do you combine the results? |
sklearn/calibration.py
The above is not consistent with our API. So, a CV object passed to the constructor is a good idea in any case.
Rebased on master + addressed some comments. I needed a coding break... ;) If you feel like playing with it, please do...
You should provide a default value for the
I've rebased and cleaned up the example. I'm open to discussion regarding the API and multiclass handling. I don't know how to take care of the OOB (out-of-bag) samples to estimate the probas.
@mblondel any chance you can provide feedback on this? I'll get back to it tomorrow morning.
sklearn/calibration.py
How about generating the calibration data with a cv object (parameter of the constructor)?
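A rough sketch of the cv-object idea, which also suggests one answer to the "how do you combine the results?" question above: fit the classifier on each train split, fit the calibrator on the held-out split, and average the calibrated probabilities over folds. This is an illustration under assumed names (cv_calibrated_proba is a hypothetical helper), not the PR's actual implementation:

```python
import numpy as np
from sklearn.base import clone
from sklearn.isotonic import IsotonicRegression
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

def cv_calibrated_proba(clf, X, y, X_test, cv=None):
    """Cross-fitted isotonic calibration sketch for binary problems.

    For each fold: fit the classifier on the train split and an isotonic
    calibrator on the held-out split; combine by averaging the calibrated
    probabilities for X_test over the folds.
    """
    cv = cv if cv is not None else KFold(n_splits=3)
    fold_probs = []
    for train_idx, cal_idx in cv.split(X):
        model = clone(clf).fit(X[train_idx], y[train_idx])
        ir = IsotonicRegression(out_of_bounds="clip")
        ir.fit(model.predict_proba(X[cal_idx])[:, 1], y[cal_idx])
        fold_probs.append(ir.predict(model.predict_proba(X_test)[:, 1]))
    return np.mean(fold_probs, axis=0)

# Toy demo: binary labels driven by the first feature
rng = np.random.RandomState(0)
X = rng.randn(60, 2)
y = (X[:, 0] > 0).astype(int)
probs = cv_calibrated_proba(LogisticRegression(), X, y, X[:10])
```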
sklearn/metrics/metrics.py
Alright, I squashed it now into 5 generic commits (calibration module, brier score, tests, examples, narrative doc). I've also added @mblondel to the list of authors in calibration.py.
I pushed a couple of commits (cosmit + coverage improvement). The coverage could be slightly improved. Currently we have 95% coverage.
Great, thank you very much!
Indeed, there are a couple of exceptions that should be covered by additional tests: https://coveralls.io/builds/1947924/source?filename=sklearn%2Fcalibration.py It should be easy to raise the coverage close to 99% on that file.
I've added some assert_raises; coverage of calibration.py should now be effectively 100%.
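For context, assert_raises here refers to the nose-style test helper used at the time; a minimal stand-in showing the pattern used to cover error-raising branches (the helper below is illustrative, not scikit-learn's actual implementation):

```python
def assert_raises(exc_type, func, *args, **kwargs):
    # Minimal nose-style helper: pass iff calling func raises exc_type.
    try:
        func(*args, **kwargs)
    except exc_type:
        return
    raise AssertionError("%s not raised" % exc_type.__name__)

# Covers an error branch: int("nope") raises ValueError, so this passes
assert_raises(ValueError, int, "nope")
```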
Thanks. I think this is ok for merge. @agramfort @mblondel any more comments?
+1 for merge on my side.
Just to confirm: do we really want sigmoid_calibration to be public? Other than that, +1 as well!
Let's make sigmoid_calibration private indeed.
+1 for a private sigmoid_calibration.
Can you test these functions using the common tests in https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/metrics/tests/test_common.py ?
Done (these tests are pretty nice; I didn't know they existed). I had to modify test_invariance_string_vs_numbers_labels() slightly so that pos_label is also set for THRESHOLDED_METRICS.
Oops, I had already merged; I will cherry-pick this.
Done! 🍻 Thank you very much everyone!
yeah 🍻 !!!
🍻 indeed people. Congratulations!
Thanks for merging! 🍻
Congrats! When was this effort started again? Two? Three years ago? :)
This raises a warning, as GaussianNB doesn't support sample_weight. If this is expected behavior, I think the warning should be caught.
Do we really want examples to raise warnings? ;) I thought the use of a classifier that doesn't have sample_weight was intentional.
Is it me, or is there no use of sample_weight in this example?
I thought it was used internally in the calibration. But you are right, it shouldn't warn if it is not used. Will investigate!
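Catching (or asserting on) the warning discussed above can be sketched with the standard library's warnings machinery; record=True also lets a test verify that the warning does, or does not, fire. fit_like is a hypothetical stand-in for an estimator that ignores sample_weight:

```python
import warnings

def fit_like(sample_weight=None):
    # Hypothetical stand-in: warns when a sample_weight it cannot honor
    # is passed, mimicking the behavior discussed in the review.
    if sample_weight is not None:
        warnings.warn("sample_weight is ignored by this estimator",
                      UserWarning)

# Capture warnings instead of letting them print, keeping example output clean
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    fit_like(sample_weight=[1.0, 2.0])

assert len(caught) == 1 and issubclass(caught[0].category, UserWarning)
```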
Calibration module with Platt and isotonic calibration.
A few issues: