FIX make sure OutputCodeClassifier rejects complex labels #20219
ogrisel wants to merge 1 commit into scikit-learn:main from
Conversation
BTW, I could not reproduce the original failure on my machine. I am not sure why the nested call to
    """
    y = column_or_1d(y, warn=True)
    _assert_all_finite(y)
    _ensure_no_complex_data(y)
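For context, the added `_ensure_no_complex_data(y)` call is what rejects complex-valued labels. Below is a minimal standalone sketch of such a check; the function name and error message here are illustrative, while the real helper lives in `sklearn.utils.validation`:

```python
import numpy as np

def ensure_no_complex_data(array):
    # Hypothetical standalone version of the complex-data check:
    # raise as soon as the dtype is complex-valued.
    array = np.asarray(array)
    if np.issubdtype(array.dtype, np.complexfloating):
        raise ValueError("Complex data not supported\n{!r}\n".format(array))

# Real-valued labels pass through, complex labels are rejected.
ensure_no_complex_data(np.array([0.0, 1.0]))  # no error
try:
    ensure_no_complex_data(np.array([1 + 2j, 3j]))
except ValueError as exc:
    print(type(exc).__name__)  # prints ValueError
```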
Is check_X_y doing it? We might have forgotten to add this check in some other PR of @jeremiedbb.
No, it's not. It's just that _ConstantPredictor does no check at all. There's a discussion in #20205.
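To illustrate that failure mode, here is a hedged sketch of a constant predictor that skips input validation entirely, so complex labels slip through without an error. The class below is made up for illustration; the actual implementation is `_ConstantPredictor` in `sklearn.multiclass`:

```python
import numpy as np

class ConstantPredictor:
    """Illustrative predictor that memorizes a single label value."""

    def fit(self, X, y):
        # No input validation at all: a complex-valued y is accepted
        # silently instead of raising a ValueError.
        self.y_ = np.asarray(y)[0]
        return self

    def predict(self, X):
        # Always predict the memorized constant.
        return np.repeat(self.y_, len(X))

pred = ConstantPredictor().fit(None, np.array([1 + 2j]))
print(pred.predict(np.zeros((3, 1))))  # complex output, no error raised
```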
Actually we don't, but we probably should:
https://github.com/scikit-learn/scikit-learn/blob/main/sklearn/utils/validation.py#L883-L890
But why does this test pass most of the time and fail randomly? I could not reproduce the failure on my machine. Also, I don't understand why the constant predictor would kick in in this case. The RNG seems properly seeded, so one should always get the validation of X by LogisticRegression._validate_data.
Actually, no, we do not set the random_state in check_complex_data on the estimator clone. That does not explain why this test does not fail for me, though, but it is a bug on its own. I will submit another PR.
Closing in favor of #20221 (hopefully).
This is a quick fix for the random failure reported in #20218 after the merge of #20192. I am not sure about the root cause of the randomness, but we indeed need to factorize y validation when X is not validated in a meta-estimator, as @jeremiedbb suggested elsewhere. In the meantime, let's make sure that the CI can be green in main to avoid disturbing concurrent PRs.
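The pattern described above, validating y in the meta-estimator while delegating X validation to the wrapped estimator, can be sketched as follows. All names here are illustrative assumptions, not the actual scikit-learn API:

```python
import numpy as np

class MetaEstimator:
    """Illustrative meta-estimator: validates y itself, leaves X to the
    wrapped estimator (which is assumed to validate X in its own fit)."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y):
        # Validate y here because X is passed through unvalidated.
        y = np.ravel(np.asarray(y))
        if np.issubdtype(y.dtype, np.complexfloating):
            raise ValueError("Complex data not supported")
        # X validation is delegated to the inner estimator.
        self.estimator_ = self.estimator.fit(X, y)
        return self
```

With this factorization, complex labels are rejected at the meta-estimator level even when the inner estimator (e.g. a constant predictor that skips checks) would have accepted them silently.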