[MRG+2] Nonrepeating ROC thresholds#3268
Conversation
Could you please use explicit train test splits instead of slices with steps?
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.5, random_state=0)

I find the code more readable with explicit variable names instead of slices.
Using train_test_split would necessitate adding a new import to the test file, which I'd rather not do. What if I were to switch to named slice objects? That would provide the same readability.
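For illustration, named slice objects could look like the following sketch (hypothetical data and variable names, not the actual test code):

```python
import numpy as np

# Hypothetical data for illustration; not the actual test code.
X = np.arange(20).reshape(10, 2)
y = np.array([0, 1, 1, 0, 1, 0, 0, 1, 1, 0])

# Named slice objects label the two halves without a new import.
train = slice(0, None, 2)  # even-indexed samples
test = slice(1, None, 2)   # odd-indexed samples

X_train, y_train = X[train], y[train]
X_test, y_test = X[test], y[test]

print(X_train.shape, X_test.shape)  # (5, 2) (5, 2)
```

This keeps the readability of named variables while avoiding the extra import.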
…lds test. Added named slice variables to same, for readability.
… ROC thresholds. Added a comment explaining why the treatment is needed.
sklearn/metrics/metrics.py
Hm, np.isclose was added in NumPy 1.7, so this will not work with earlier versions. I will look into fixing this tomorrow.
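One possible fallback for older NumPy versions is a small helper with the same absolute-plus-relative tolerance test (a sketch under that assumption; the real np.isclose also handles NaN and inf, which this does not):

```python
import numpy as np

def isclose(a, b, rtol=1e-5, atol=1e-8):
    # Minimal stand-in for np.isclose (NumPy >= 1.7); sketch only,
    # without np.isclose's NaN/inf handling.
    a, b = np.asarray(a), np.asarray(b)
    return np.abs(a - b) <= atol + rtol * np.abs(b)

print(bool(isclose(0.98, 0.98 + 1e-12)))  # True
print(bool(isclose(0.98, 0.99)))          # False
```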
I have reproduced your testing case. From my understanding, the issue comes from floating point rounding in […]. Furthermore, the AUC doesn't change much even with 2-digit rounding.
Arnaud, there is no issue in […]. Tonight I'll push a changeset that fixes the double use of […].
Still, the issue is not in roc_curve, which worked as expected (168 unique values in, 168 thresholds out), but in the summation […]. On a side note, if this really matters, you could do that sort of summation using a Kahan summation algorithm, or even better a Kahan-Babuska summation algorithm.
I see your point: 168 unique values in, 168 thresholds out, indeed. But in my opinion, a better algorithm for selecting ROC curve thresholds would be to use only the […]. However, if the prevailing opinion is not to modify the threshold selection algorithm, I will withdraw this pull request.
I also find duplicated thresholds at machine-precision levels curious, and they might cause artifacts in a matplotlib display of the curve. On the other hand, I don't think it is worth slowing down the computation of […].
Are there any further concerns or obstacles to merging this one?
This looks good to me. +1 on my side. @arjoly, have you changed your mind or not? Others' opinions? @GaelVaroquaux @larsmans @jnothman @mblondel @agramfort?
I haven't changed my mind, but it's ok if you find other people supporting this pull request.
I am fine with this. +1
(although I do wonder whether the test is robust to changes in the RandomForestClassifier implementation)
Merging, then. If the random forest implementation changes in a way that makes this test fail, we can always rewrite it to use fixed values. However, I think it's best to show a real case that stems from the library.
Added a unit test to ensure that there are no spurious repeating values in the thresholds returned by roc_curve because of machine precision, and a quick stab at a fix.
Without the fix, the thresholds array produced in the unit test starts with the values [1, 0.99, 0.98, 0.98, ...]. There should be only one 0.98. The undesirable behavior stems from machine epsilon differences between predicted probabilities that are mathematically equivalent.
I am not sure that my suggested fix is the best approach, so I welcome comments.
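A toy illustration of the effect, using the classic 0.1 + 0.2 vs 0.3 example as a stand-in for the mathematically equivalent predicted probabilities, together with one possible tolerance-based deduplication (an assumption for illustration, not necessarily the fix adopted in this PR):

```python
import numpy as np

# 0.1 + 0.2 and 0.3 are mathematically equal but differ by one ulp,
# mimicking the machine-epsilon duplicates seen in the thresholds array.
a = 0.1 + 0.2
b = 0.3
print(a == b)  # False

thresholds = np.array([1.0, 0.99, a, b])
print(np.unique(thresholds).size)  # 4 values, though only 3 are intended

# One possible tolerance-based deduplication (illustrative only): keep a
# threshold when it differs from its predecessor by more than a tolerance.
t = np.sort(thresholds)[::-1]
keep = np.r_[True, np.abs(np.diff(t)) > 1e-7]
print(t[keep].size)  # 3: the near-duplicate of 0.3 is dropped
```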