
DOC: sklearn.metrics.auc_score should mention that using probabilities will give better scores #1393

@tjanez

Description


The documentation at http://scikit-learn.org/dev/modules/generated/sklearn.metrics.auc_score.html#sklearn.metrics.auc_score
says that y_score can be either probability estimates of the positive class or binary decisions.

It should warn the reader that with binary decisions, the AUC is computed as if the classifier only ever returned the scores 0 and 1, so it does not reflect the "real" AUC obtained from the full ranking given by the probability estimates.

Here is an example:

from sklearn.linear_model import LogisticRegression
from sklearn import metrics
from sklearn import cross_validation
from sklearn import datasets

data = datasets.load_digits()
X, y = data.data, data.target
# make the classification problem binary
X = X[(y == 8) | (y == 6)]
y = y[(y == 8) | (y == 6)]

clf = LogisticRegression(C=0.001)

k_fold = cross_validation.KFold(len(y), k=10, indices=True, shuffle=True, random_state=18)

AUCs = []
AUCs_proba = []
for train, test in k_fold:
    clf.fit(X[train], y[train])
    # AUC from hard 0/1 predictions
    AUCs.append(metrics.auc_score(y[test], clf.predict(X[test])))
    # AUC from probability estimates of the positive class
    AUCs_proba.append(metrics.auc_score(y[test], clf.predict_proba(X[test])[:, 1]))

print "AUCs: "
print AUCs
print "AUCs (with probabilities): "
print AUCs_proba

This is the output:

AUCs: 
[1.0, 0.97222222222222221, 1.0, 0.97058823529411764, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
AUCs (with probabilities): 
[1.0, 1.0, 1.0, 0.99673202614379086, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

I admit this is not a very good example, as the difference between AUCs and AUCs_proba can be much larger in practice, but I wanted to use a built-in data set.

Note that, for the same classifier, the AUC computed from binary decisions never exceeds the AUC computed from the probability estimates those decisions were thresholded from.
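The effect can be reproduced without any training, as a minimal sketch. The toy labels and scores below are made up for illustration; the example uses sklearn.metrics.roc_auc_score (the current name of this metric in scikit-learn) and assumes the binary decisions come from thresholding the probabilities at 0.5:

```python
from sklearn.metrics import roc_auc_score

# Toy data (assumed for illustration): one positive example is ranked
# below one negative example, so the score-based AUC is below 1.
y_true = [0, 0, 0, 1, 1, 1]
y_proba = [0.1, 0.2, 0.6, 0.4, 0.8, 0.9]

# Binary decisions obtained by thresholding the probabilities at 0.5.
y_pred = [1 if p >= 0.5 else 0 for p in y_proba]

auc_proba = roc_auc_score(y_true, y_proba)   # uses the full ranking
auc_binary = roc_auc_score(y_true, y_pred)   # scores collapsed to {0, 1}

print(auc_proba, auc_binary)
```

Here auc_proba is 8/9 while auc_binary drops to 2/3: collapsing the scores to 0/1 turns distinct rankings into ties, and the ROC curve degenerates to a single operating point.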
