Redundant execution of check_class_weight_balanced_linear_classifier #33154

@atheendre130505

Describe the bug and give evidence about its user-facing impact

In sklearn/utils/estimator_checks.py, the function _yield_classifier_checks yields check_class_weight_balanced_linear_classifier twice for linear classifiers that support the class_weight parameter.

This bug was discovered during a manual code review of estimator_checks.py. It affects any scikit-learn user or developer who uses check_estimator (or the underlying check generators) to validate estimators.

The impact is redundant test execution. For example, when running the common tests for LogisticRegression or for a custom linear classifier that inherits from LinearClassifierMixin, this check (which fits the estimator and verifies class weight handling) is executed twice. This wastes CI time and can produce duplicated error messages or log entries, which makes debugging more tedious.

Steps/Code to Reproduce

from sklearn.linear_model import LogisticRegression
from sklearn.utils.estimator_checks import _yield_classifier_checks


def get_name(check):
    # Checks may be plain functions or functools.partial objects;
    # a partial has no __name__, so fall back to the wrapped function.
    return getattr(check, "__name__", getattr(check, "func", check).__name__)


clf = LogisticRegression()
checks = list(_yield_classifier_checks(clf))
check_names = [get_name(c) for c in checks]
duplicate_count = check_names.count(
    "check_class_weight_balanced_linear_classifier"
)
print(f"Yield count: {duplicate_count}")

Expected Results

The check should be yielded exactly once.

Yield count: 1

Actual Results

The check is currently yielded twice.

Yield count: 2
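To look for other repeated checks beyond this one, a small helper built on collections.Counter could list every check name that is yielded more than once. This is only a sketch: check_name, duplicated_checks, check_a and check_b are hypothetical names introduced for illustration and are not part of scikit-learn. The stand-in checks keep the example runnable without the library.

```python
from collections import Counter
from functools import partial


def check_name(check):
    """Return the function name of a check, unwrapping functools.partial."""
    func = check.func if isinstance(check, partial) else check
    return func.__name__


def duplicated_checks(checks):
    """Map each check name that appears more than once to its count."""
    counts = Counter(check_name(c) for c in checks)
    return {name: n for name, n in counts.items() if n > 1}


# Hypothetical stand-ins for real checks, used only to exercise the helper.
def check_a(name, estimator):
    pass


def check_b(name, estimator):
    pass


demo = [check_a, partial(check_b, "x"), check_a]
print(duplicated_checks(demo))
```

Under the same assumptions, passing the list built from _yield_classifier_checks(clf) in the reproduction script to duplicated_checks should report any check that the generator yields repeatedly.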

Versions

python: 3.14.0
numpy: 2.4.1
scipy: 1.17.0
joblib: 1.5.3
scikit-learn: 1.7.dev0 (from source)

Interest in fixing the bug

I am interested in fixing this bug. I have already identified the root cause and implemented a fix.

Analysis of root cause: In _yield_classifier_checks, there are two consecutive if blocks that both check for isinstance(classifier, LinearClassifierMixin) and the presence of class_weight in parameters. Both blocks yield the same check.
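The pattern described above can be illustrated with a toy generator. This is a simplified sketch, not the actual scikit-learn source: LinearMixin, ToyClassifier, check_class_weight_balanced and both generator functions are hypothetical stand-ins, and the fix shown (dropping the second block) is one plausible way to resolve the duplication.

```python
class LinearMixin:
    """Hypothetical stand-in for LinearClassifierMixin."""


class ToyClassifier(LinearMixin):
    def get_params(self):
        return {"class_weight": None}


def check_class_weight_balanced(name, estimator):
    """Placeholder check; the real one fits the estimator."""


def yield_checks_buggy(clf):
    if isinstance(clf, LinearMixin) and "class_weight" in clf.get_params():
        yield check_class_weight_balanced
    # The second block repeats the same condition and yields the same check.
    if isinstance(clf, LinearMixin) and "class_weight" in clf.get_params():
        yield check_class_weight_balanced


def yield_checks_fixed(clf):
    # A single block is enough: the check is yielded once.
    if isinstance(clf, LinearMixin) and "class_weight" in clf.get_params():
        yield check_class_weight_balanced


clf = ToyClassifier()
print(len(list(yield_checks_buggy(clf))))
print(len(list(yield_checks_fixed(clf))))
```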
