-
-
Notifications
You must be signed in to change notification settings - Fork 26.9k
cross_decomposition module needs work #4122
Copy link
Copy link
Closed
Labels
Description
Recently I've spent some time exploring cross_decomposition module (I haven't paid attention to CCA). I noticed several problems with it:
- The module needs much better narrative documentation. Now it's not clear what algorithms actually do, where they might be useful and how they are different.
- The layout of files and import approach are unusual for sklearn. The files are named with trailing underscores,
__all__is put into individual files, but not in__init__.py. - The implementation is rather obscure (for short and conventional description of PLS 2 refer here).
-
_nipals_twoblocks_inner_loopin fact computes both weights and scores, but only weights are returned, and scores are then recomputed. -
Parameternorm_y_weightscrucially determines how the algorithm works (along withdeflation_modeandmode), but does it in tricky way. It took me a lot of time to understand why implemented herePLSRegressionis equivalent to PLS 2 algorithm described in most sources.
Parameternorm_y_weightsset toFalseforPLSRegressionin order to make a regression coefficient$\hat{c}$ (see link above) between$t$ and$u$ equal to 1. It is done purely by convention and to compare with implementations in R (I suppose). - Alternative
_svd_cross_productsolver is provided, but never used in code (only in tests). Also It can't be a substitution for_nipals_twoblocks_inner_loopwithnorm_y_weights=False.
-
- There are no required early stop checks. When
XorYmatrix are deflated to zeros, iterations become invalid and assumed properties are no longer held. That causes the bug bug in PLSRegression() when one of the columns in X is constant #3932. - Computed
y_rotations_is not correct forPLSRegression, i. e.y_scores_ != np.dot(Yc, y_rotations_). I think there is no way to compute these rotations in this case, becauseYis deflated onx_scores - Several obvious bugs such as modifying non-existing variable, etc.
Reactions are currently unavailable