BUG: check_scoring fails for GridSearchCV #2853

@schwarty

check_scoring returns a scorer depending on an estimator and a scoring argument. Since a GridSearchCV does not have a predict attribute when it is not fitted, check_scoring falls back to returning a passthrough_scorer (basically an accuracy scorer) regardless of the scoring argument. In the following example, I'm trying to get f1 scores from a GridSearchCV(LinearSVC()): I get accuracy scores instead of f1 scores if I don't fit my classifier first, or if I clone it after fitting it.

from sklearn.datasets import make_blobs
from sklearn.grid_search import GridSearchCV
from sklearn.svm import LinearSVC
from sklearn.cross_validation import ShuffleSplit, cross_val_score
from sklearn import clone

from sklearn.metrics.scorer import check_scoring
from sklearn.metrics.scorer import _PredictScorer

X, y = make_blobs(centers=2, cluster_std=10., random_state=1)

svm_cv = GridSearchCV(LinearSVC(),
                      param_grid={'C': [.1, .5, 1., 5., 10., 50., 100.]},
                      scoring='f1')

cv = ShuffleSplit(y.size, random_state=1)

# unfitted: check_scoring does not return a _PredictScorer (it returns the passthrough_scorer)
assert not isinstance(check_scoring(svm_cv, 'f1'), _PredictScorer)
print 'accuracy:', cross_val_score(svm_cv, X, y, cv=cv, scoring='f1')

svm_cv.fit(X, y)
# fitted: check_scoring now returns a _PredictScorer
assert isinstance(check_scoring(svm_cv, 'f1'), _PredictScorer)

print 'f1 score:', cross_val_score(svm_cv, X, y, cv=cv, scoring='f1')

# cloned (so unfitted again): check_scoring falls back to the passthrough_scorer
assert not isinstance(check_scoring(clone(svm_cv), 'f1'), _PredictScorer)

print 'accuracy:', cross_val_score(clone(svm_cv), X, y, cv=cv, scoring='f1')
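To isolate what I think is going on, here is a minimal sketch that does not use scikit-learn at all. It assumes the choice of scorer rests on a `hasattr(estimator, 'predict')` check, and that `predict` on the search object is only reachable once it has been fitted. `LazySearch` and `pick_scorer_kind` are hypothetical names made up for this illustration, not scikit-learn API, and the real `check_scoring` may well differ in its details.

```python
class LazySearch(object):
    """Hypothetical stand-in for a meta-estimator such as GridSearchCV."""

    def __init__(self):
        self.best_estimator_ = None  # only set by fit()

    def fit(self, X, y):
        self.best_estimator_ = object()  # placeholder for the refit model
        return self

    @property
    def predict(self):
        # Assumption: predict is delegated to the fitted model, so it is
        # simply not available before fit() has been called.
        if self.best_estimator_ is None:
            raise AttributeError("predict is only available after fit")
        return lambda X: [0 for _ in X]

    def score(self, X, y):
        return 0.0  # dummy value, only here so the fallback branch exists


def pick_scorer_kind(estimator, scoring=None):
    """Simplified guess at the dispatch: which kind of scorer gets chosen."""
    if hasattr(estimator, "predict") and scoring is not None:
        return "named scorer: " + scoring
    if hasattr(estimator, "score"):
        # The requested scoring is silently ignored in this branch.
        return "passthrough"
    raise TypeError("estimator has neither predict nor score")


search = LazySearch()
before_fit = pick_scorer_kind(search, "f1")  # "passthrough"
search.fit([[0.0]], [0])
after_fit = pick_scorer_kind(search, "f1")   # "named scorer: f1"
print(before_fit)
print(after_fit)
```

If that is indeed the mechanism, it would explain why cloning brings the problem back: a clone is unfitted, so the `hasattr` check fails again and the requested scoring is dropped without any warning.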
