
FIX DOC MNT Big revamp of cross-decomposition module #17095

Merged
NicolasHug merged 33 commits into scikit-learn:master from NicolasHug:la_PLS_qui_porte_si_bien_son_nom
Aug 31, 2020

Conversation

Member

@NicolasHug NicolasHug commented Apr 30, 2020

Closes #4122
Closes #8392
Probably Fixes #4469, though I can't reproduce.
Fixes #11645
Closes #13521
Closes #16177

This PR is a rework of the cross-decomposition module which had been left for dead for years. Mainly, docs and tests were added, and code was simplified. This hopefully comes with a few bug fixes.

This is a big PR, but it's a lot of docs. You might want to ignore the diff and just review the files from scratch, given the amount of changes. The good news is that there are docs now, so you're off to a much better start than I was.

Other stuff:

  • use 1d shapes for vectors instead of 2d shapes, which makes the outer products more obvious.

  • for PLSSVD, CCA, and PLSCanonical, n_components now raises a FutureWarning if it's not in [1, min(n_samples, n_features, n_targets)]. This is only for backward compatibility; an error will be raised in 2 versions. n_components cannot be greater than the rank of the cross-covariance matrix X.T.dot(Y), which is bounded as above. For PLSRegression, the rank is only bounded by n_features. See comments in code.

  • PLSSVD, CCA and PLSCanonical:

    • deprecated the x_scores_ and y_scores_ attributes since they're redundant: they're just the transformed matrices of X and Y. For PLSRegression, the y_scores are different, so I didn't deprecate them there (though I doubt they are useful anyway).
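The rank bound behind the new n_components check is a plain linear-algebra fact and can be verified with numpy alone. This is an illustrative sketch, not code from the PR; the variable names are made up:

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features, n_targets = 20, 5, 3
X = rng.normal(size=(n_samples, n_features))
Y = rng.normal(size=(n_samples, n_targets))

# Center the columns, as the PLS estimators do internally.
Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)

# The cross-covariance matrix has shape (n_features, n_targets), and its
# rank can never exceed min(n_samples, n_features, n_targets) -- the upper
# bound now enforced on n_components for PLSSVD, CCA and PLSCanonical.
C = Xc.T @ Yc
rank = np.linalg.matrix_rank(C)
bound = min(n_samples, n_features, n_targets)
print(rank, bound)
```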

@NicolasHug NicolasHug changed the title from [WIP] Reworking of PLS to [MRG] Reworking of PLS May 4, 2020
@NicolasHug NicolasHug changed the title from [MRG] Reworking of PLS to [MRG] Reworking and docs for cross-decomposition module May 4, 2020
@NicolasHug
Member Author

CC @thomasjpfan

@amueller you have time for reviews now, right? Welcome back!! :p

Member

@TomDLT TomDLT left a comment


Nice cleaning!

Do we need a test for #11645?

Classes included in this module are :class:`PLSRegression`
Apart from CCA, the PLS estimators are particularly suited when the matrix of
predictors has more variables than observations, and when there is
multicollinearity among the features. By contrast, standard linear regression
Member


The fact that PLS handles the multicollinearity issue, while linear regression would fail unless regularized, seems an authoritative argument here. Do we have a ref or example that demonstrates this?

Member Author


When features are collinear, the covariance matrix is singular and thus non-invertible.

I don't have a ref at hand, but I'm assuming this is common knowledge (could be wrong?)

Member

@agramfort agramfort Aug 31, 2020


Are you referring to this https://github.com/scikit-learn/scikit-learn/pull/17095/files#diff-df97917f68917d3a110df30940d771dfR176 ? Is it a statement about PLS vs CCA rather than PLS vs linear regression? Maybe I am nitpicking here.

Member Author


It's mostly a statement about PLS vs non-regularized LR: LR is unstable when there is collinearity among features, PLS is not. However, CCA is unstable too (as described in the link you mentioned).

Member

@agramfort agramfort left a comment


thx @NicolasHug for clarifying

@NicolasHug
Member Author

Thanks @TomDLT and @agramfort for the reviews!

Let me push a what's new entry, and I'll merge when green

@NicolasHug NicolasHug changed the title from [MRG] Reworking and docs for cross-decomposition module to FIX DOC MNT Big revamp of cross-decomposition module Aug 31, 2020
@NicolasHug NicolasHug merged commit 8061aac into scikit-learn:master Aug 31, 2020
@NicolasHug NicolasHug deleted the la_PLS_qui_porte_si_bien_son_nom branch August 31, 2020 16:38
@NicolasHug
Member Author

Docs look good, merging.
Thanks for the reviews!

@hendriklohse

@NicolasHug I have a question about the documentation. What exactly does the attribute x_loadings_ represent? The correlation, or the coefficients of the linear combinations used to transform X? (See also https://stackoverflow.com/questions/78725061/what-does-the-attribute-x-loadings-represent?noredirect=1#comment138800857_78725061)



6 participants