LogisticRegressionCV doesn't calculate score as a multiclass problem #8737

@amueller

from sklearn.linear_model import LogisticRegressionCV
from sklearn import datasets
iris = datasets.load_iris()

LogisticRegressionCV(scoring="f1").fit(iris.data, iris.target)

This runs without complaint, and I think it should raise an error.
By default, f1 uses binary averaging, which makes no sense for multiclass targets, IIRC.
I'm sure @jnothman knows more about this than I do.
I'd argue this is a bug in f1_score, which should raise an error if average='binary' but the targets are not binary.
I vaguely remember that we discussed allowing this behavior in order to compute the f1-score for a single class selected by pos_label, but that case seems to be covered by this part of the docstring:

The class to report if average='binary' and the data is binary. If the data are multiclass or multilabel, this will be ignored; setting labels=[pos_label] and average != 'binary' will report scores for that label only.

The docstring seems exactly right to me, but the actual behavior does not match it.
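To separate the metric from the estimator, it may help to call f1_score directly on multiclass targets. The sketch below does that with a small made-up label set (the arrays and the helper `try_binary_f1` are hypothetical, not taken from the report). It first tries the default average='binary', recording whether the call returns a score or raises, and then uses the labels=[pos_label] form that the docstring describes. Which branch the first call takes may depend on the installed scikit-learn version, so no outcome is assumed here.

```python
import numpy as np
from sklearn.metrics import f1_score

# Toy three-class labels (hypothetical data, not from the issue).
y_true = np.array([0, 1, 2, 0, 1, 2, 0, 1, 2])
y_pred = np.array([0, 1, 1, 0, 2, 2, 0, 1, 2])


def try_binary_f1(y_true, y_pred):
    """Call f1_score with its default average='binary' and report what happens."""
    try:
        return "score", f1_score(y_true, y_pred)
    except ValueError as exc:
        return "error", str(exc)


# Either ("score", <float>) or ("error", <message>), depending on the version.
outcome, detail = try_binary_f1(y_true, y_pred)
print(outcome, detail)

# As described in the quoted docstring: restrict to a single label and pick a
# non-binary average to get the score for that label only.
f1_label_1 = f1_score(y_true, y_pred, labels=[1], average="macro")
print(f1_label_1)
```

If the direct call raises while the LogisticRegressionCV example above still runs, that would suggest the scorer is being applied to something other than the full multiclass targets inside the estimator, rather than the metric itself being too permissive.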
