FIX ensure to return 1-D array with single target in PLS* and CCA · Pull Request #20355 · scikit-learn/scikit-learn

ghost · 2021-06-25T01:59:47Z

Reference Issues/PRs

Supersedes #19409
Fixes #19352
Fixes #19409

What does this implement/fix? Explain your changes.

Fixes #19352 so IterativeImputer works with PLSRegression as estimator.

Any other comments?

Added a test case to the existing shape test.
test_impute.py coverage is 97%.
All imputation tests pass.
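
For readers who have not seen #19352, here is a minimal NumPy-only sketch of the kind of shape mismatch at stake. It assumes the imputer writes predictions into one column of a 2-D array through a boolean row mask; the names missing_rows, pred_2d, and X_filled are illustrative and are not taken from the scikit-learn source.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))

# Rows whose first feature is treated as missing (illustrative mask).
missing_rows = np.array([True, False, True, False, True, False])

# Stand-in for a predict() that returns a column vector of shape (3, 1).
pred_2d = np.zeros((int(missing_rows.sum()), 1))

X_filled = X.copy()
try:
    # The indexed target has shape (3,), so a (3, 1) value cannot be broadcast.
    X_filled[missing_rows, 0] = pred_2d
    assignment_failed = False
except ValueError:
    assignment_failed = True

# Flattening the prediction to shape (3,) makes the assignment valid.
X_filled[missing_rows, 0] = pred_2d.ravel()
```

The sketch only shows why a (n_samples, 1) prediction is a problem for a consumer expecting (n_samples,); whether the fix belongs in the imputer or in the estimator is what the rest of this thread discusses.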

NicolasHug

Thanks for the PR @polocorona, this looks good, but instead of changing the imputer I think we should update the PLSRegression estimator, as suggested in a comment on #19352.

ghost · 2021-06-25T20:04:22Z

@NicolasHug Done. Let me know if I understood your request correctly. 😄

NicolasHug

Thanks @polocorona, this looks great!

I made some minor comments, and we could also add a test in test_pls.py to make sure that ndim == 1 for single-target regression.

Pinging @ogrisel @glemaitre @jjerphan as you reviewed #19409, I think this one is more correct.

doc/whats_new/v1.0.rst

sklearn/impute/tests/test_impute.py

NicolasHug · 2021-06-28T08:45:53Z

BTW, this doesn't affect just PLSRegression but all the estimators that inherit from _PLS, so we should test and mention those as well.

NicolasHug

Thanks a lot @polocorona! This LGTM with some minor comments.

doc/whats_new/v1.0.rst

sklearn/cross_decomposition/tests/test_pls.py

jjerphan

Thanks @polocorona for your contribution!

I just have one comment before approving.

sklearn/cross_decomposition/_pls.py

jjerphan

LGTM, thanks @polocorona!

ghost · 2021-06-30T17:22:28Z

@NicolasHug should I wait for another review for the merge?

jjerphan · 2021-06-30T18:49:45Z

@polocorona: yes, two core-devs must give their approvals.

ogrisel

Although I am not sure many people use PLS and CCA with 1d targets, I am worried about backward compat. See below:

ogrisel · 2021-07-06T07:42:45Z

sklearn/cross_decomposition/tests/test_pls.py

+    X_test = np.random.randn(10, 3)
+    Y_pred = pls.predict(X_test)
+
+    assert Y_pred.shape == (10,)


I am worried this is a breaking change for users who already have code running with a single-target Y shaped as (n_samples, 1). Would it work for the imputation meta-estimator if we made the estimator record the shape of Y at fit time and reuse it at predict time?

    X = np.random.randn(10, 3)
    Y = np.random.randn(10, 1)
    pls = Estimator(n_components=Y.shape[1])
    pls.fit(X, Y)
    X_test = np.random.randn(10, 3)
    Y_pred = pls.predict(X_test)
    assert Y_pred.shape == (10, 1)

    pls = Estimator(n_components=Y.shape[1])
    pls.fit(X, np.ravel(Y))
    X_test = np.random.randn(10, 3)
    Y_pred = pls.predict(X_test)
    assert Y_pred.shape == (10,)
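
The idea of remembering the target's dimensionality at fit time can be sketched with a toy estimator. This is only an illustration under stated assumptions: ShapePreservingRegressor and its _y_ndim attribute are hypothetical, and a plain least-squares solve stands in for the PLS fit, so none of this is the _PLS implementation.

```python
import numpy as np


class ShapePreservingRegressor:
    """Toy least-squares regressor (hypothetical, not scikit-learn code)."""

    def fit(self, X, y):
        y = np.asarray(y)
        # Remember whether the caller passed a 1-D or a 2-D target.
        self._y_ndim = y.ndim
        Y = y.reshape(len(y), -1)  # always work with a 2-D target internally
        self.coef_, *_ = np.linalg.lstsq(X, Y, rcond=None)
        return self

    def predict(self, X):
        Y_pred = X @ self.coef_  # shape (n_samples, n_targets)
        # Mirror the dimensionality seen at fit time.
        if self._y_ndim == 1:
            return Y_pred.ravel()
        return Y_pred


rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))
Y = rng.normal(size=(10, 1))

pred_column = ShapePreservingRegressor().fit(X, Y).predict(X)
pred_flat = ShapePreservingRegressor().fit(X, np.ravel(Y)).predict(X)
```

With this approach existing callers that pass a (n_samples, 1) target keep getting a (n_samples, 1) prediction, while callers that pass a 1-D target get a 1-D prediction back.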

NicolasHug · 2021-07-06

I guess that'd be fine.

We might not need to explicitly record y.shape as it should be embedded already in self.y_weights_ and self.y_loadings_

But for what it's worth, we're not consistent at all w.r.t. the output shape when passing y with y.shape == (n_samples, 1). Here are a few estimators:

Estimator                      , output shape , 'supports multioutput'
HistGradientBoostingClassifier , (100,)       , False
HistGradientBoostingRegressor  , (100,)       , False
RandomForestClassifier         , (100,)       , True
RandomForestRegressor          , (100,)       , True
LogisticRegression             , (100,)       , False
LinearRegression               , (100, 1)     , True
SGDClassifier                  , (100,)       , False
SGDRegressor                   , (100,)       , False
SVC                            , (100,)       , False
PLSRegression                  , (100, 1)     , True

Those that don't support multioutput will warn with "DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel()."
It's expected that these ones will output (n_samples,), but for example the RandomForest estimators support multioutput and still ravel the predictions, unlike PLSRegression and LinearRegression.

For reference, our regressor checks ensure that y_pred.shape == y.shape and our classifier checks ensure that y_pred.shape == (n_samples,), but because of the data these checks use, neither of them can detect such discrepancies.
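
A check that would catch this kind of discrepancy needs to fit on both a 1-D target and a (n_samples, 1) target. The following is a rough sketch of such a check, not the scikit-learn common test: single_target_shape_report and the toy AlwaysColumn estimator are hypothetical names introduced here for illustration.

```python
import numpy as np


def single_target_shape_report(make_estimator, n_samples=20, n_features=3, seed=0):
    """Fit on a 1-D and on a column target; return {name: (y.shape, y_pred.shape)}."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_features))
    y_column = rng.normal(size=(n_samples, 1))
    report = {}
    for name, y in (("1d", y_column.ravel()), ("column", y_column)):
        y_pred = make_estimator().fit(X, y).predict(X)
        report[name] = (y.shape, y_pred.shape)
    return report


class AlwaysColumn:
    """Toy estimator that always predicts shape (n_samples, 1)."""

    def fit(self, X, y):
        self.mean_ = float(np.mean(y))
        return self

    def predict(self, X):
        return np.full((X.shape[0], 1), self.mean_)


report = single_target_shape_report(AlwaysColumn)
# Names of the cases where the prediction shape differs from the target shape.
mismatches = [
    name for name, (y_shape, pred_shape) in report.items() if y_shape != pred_shape
]
```

Any object with fit and predict can be passed as make_estimator, so the same helper could be pointed at real estimators to reproduce a table like the one above.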

glemaitre

I think @NicolasHug is right. We should however open a new issue and make sure to correct the common test as well.

sklearn/cross_decomposition/_pls.py

sklearn/cross_decomposition/tests/test_pls.py

sklearn/impute/tests/test_impute.py

doc/whats_new/v1.0.rst

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>

…er_plsregression

sklearn/cross_decomposition/_pls.py

…er_plsregression

glemaitre · 2021-07-23T22:17:48Z

Uhm, so apparently we have some 1D arrays coming in before the squeeze. So it seems that we will need an if Ypred.ndim == 2 condition.
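
A guarded flatten along the lines described above could look like this. It is a sketch, not the code in _pls.py: ravel_single_target is a hypothetical helper name, and the extra shape[1] == 1 test is an assumption added so that multi-target predictions pass through unchanged.

```python
import numpy as np


def ravel_single_target(y_pred):
    """Drop the trailing axis only for a 2-D, single-column prediction."""
    y_pred = np.asarray(y_pred)
    if y_pred.ndim == 2 and y_pred.shape[1] == 1:
        return y_pred[:, 0]
    # 1-D inputs and multi-target 2-D inputs are returned unchanged.
    return y_pred


single_row = ravel_single_target(np.zeros((1, 1)))
# An unguarded squeeze collapses a single-sample prediction to a 0-d array.
unguarded = np.squeeze(np.zeros((1, 1)))
```

Indexing the column explicitly keeps the sample axis for a single observation, whereas np.squeeze without an axis argument removes every length-1 axis.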

cmarmo · 2021-11-18T21:28:52Z

Hi @polocorona, thanks for your patience! Do you mind synchronizing with upstream to rerun the checks? Also please, could you move the changelog entry to 1.0.2?
In fact you already have two approvals now. @glemaitre is this mergeable once synchronized? Thanks!

glemaitre · 2021-11-22T13:41:29Z

It is not straightforward indeed. It all depends on whether we agree to introduce a backwards-incompatible change.
I opened #20603 to check how many of these regressors we have issues with. Since there are a couple of them, it might be advisable to have a deprecation cycle and make the IterativeImputer work with both cases in the meantime. I think that we should add this item to the agenda of the next developer meeting.
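
One possible shape for such a deprecation cycle, purely as an illustration: keep the legacy 2-D output for now and emit a FutureWarning announcing the change. The helper name predict_during_deprecation, its fitted_on_1d_y flag, and the warning text are all hypothetical; nothing here was decided in this thread.

```python
import warnings

import numpy as np


def predict_during_deprecation(Y_pred, fitted_on_1d_y):
    """Hypothetical helper: keep the legacy 2-D output but announce the change."""
    Y_pred = np.asarray(Y_pred)
    if fitted_on_1d_y and Y_pred.ndim == 2 and Y_pred.shape[1] == 1:
        warnings.warn(
            "predict currently returns shape (n_samples, 1) for a 1-D y; "
            "a future release would return shape (n_samples,).",
            FutureWarning,
            stacklevel=2,
        )
    return Y_pred


# Capture the warning to show that the legacy shape is still returned.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = predict_during_deprecation(np.zeros((4, 1)), fitted_on_1d_y=True)
```

During such a cycle a consumer like the imputer would still need to accept both shapes, for example by flattening a single-column prediction before using it.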

Fix IterativeImputer to work with PLSRegression as estimator

ffe7639

github-actions bot added the module:impute label Jun 25, 2021

Add entry in changelog for PR 20355

f2ec9b2

ghost mentioned this pull request Jun 25, 2021

Interactive Imputer cannot accept PLSRegression() as an estimator due to "shape mismatch" #19352

Open

Modify n_components in PLSRegression for test_imputation_shape

7a2d5f3

NicolasHug reviewed Jun 25, 2021

Apply PR review changes

350cbd3

ghost requested a review from NicolasHug June 25, 2021 20:04

Fix title underlining error

72c168c

NicolasHug reviewed Jun 28, 2021

doc/whats_new/v1.0.rst (outdated, resolved)

sklearn/impute/tests/test_impute.py (resolved)

ghost changed the title ~~[MRG] Fix IterativeImputer to work with PLSRegression as estimator~~ [MRG] Fix _PLS class to properly return 1-d prediction array for single-target prediction Jun 28, 2021

leopoloc0 added 3 commits June 28, 2021 12:26

Add code review suggestions on PR #20355

06a8b3e

Add PLS estimators missing tests for single-target predictions shapes

aad441a

Fix n_components used in test

1709208

ghost requested a review from NicolasHug June 29, 2021 03:53

NicolasHug approved these changes Jun 29, 2021

doc/whats_new/v1.0.rst (outdated, resolved)

sklearn/cross_decomposition/tests/test_pls.py (resolved)

Add minor changes requested to improve doc wording

078825b

jjerphan reviewed Jun 29, 2021

sklearn/cross_decomposition/_pls.py (outdated, resolved)

Add test to verify single observation prediction behavior

d3f1d8f

ghost requested a review from jjerphan June 29, 2021 16:19

jjerphan approved these changes Jun 29, 2021

ogrisel reviewed Jul 6, 2021

glemaitre changed the title ~~[MRG] Fix _PLS class to properly return 1-d prediction array for single-target prediction~~ FIX ensure to return 1-D array with single target in PLS* and CCA Jul 22, 2021

glemaitre reviewed Jul 22, 2021

sklearn/cross_decomposition/_pls.py (outdated, resolved)

sklearn/cross_decomposition/tests/test_pls.py (resolved)

sklearn/impute/tests/test_impute.py (resolved)

doc/whats_new/v1.0.rst (outdated, resolved)

leopoloc0 and others added 2 commits July 22, 2021 19:45

Update sklearn/cross_decomposition/_pls.py

380f8bb

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>

Merge remote-tracking branch 'upstream/main' into fix_iterative_imput…

6f57313

…er_plsregression

ghost commented Jul 23, 2021

sklearn/cross_decomposition/_pls.py (outdated, resolved)

leopoloc0 added 2 commits July 23, 2021 16:21

Add review suggestions

7e11f7c

Merge remote-tracking branch 'upstream/main' into fix_iterative_imput…

1fb6aab

…er_plsregression

ghost requested a review from glemaitre July 23, 2021 21:27

glemaitre mentioned this pull request Jul 25, 2021

TST common test for predictions shape consistency with single target #20603

Draft

Update _pls.py

2ae04e6

cmarmo modified the milestone: 1.0.2 Nov 18, 2021

cmarmo added the Needs Decision label Nov 22, 2021

glemaitre removed their request for review December 16, 2021 17:01

ghost closed this by deleting the head repository Aug 14, 2023

This pull request was closed.

6 participants
