KernelPCA: fit_transform and transform methods are inconsistent in case of zero eigenvalues #12141
Closed
Description
In the current implementation of KernelPCA, when there are zero eigenvalues that are not removed (`remove_zero_eig=False`), `fit_transform` and `fit` + `transform` lead to inconsistent results.
- when `fit_transform` is run, there is an optimization that does not recompute the Gram matrix: the transformed X is computed directly as `X_transformed = self.alphas_ * np.sqrt(self.lambdas_)`.
- when `transform` is run, it of course cannot use the same shortcut, so the Gram matrix is recomputed and a dot product is performed. Because the eigenvectors `self.alphas_` are not saved in a scaled version (I do not know why), they are scaled at that point, by dividing by the square root of the eigenvalues, `np.sqrt(self.lambdas_)`. When eigenvalues are zero, this leads to infinite values in the scaled eigenvectors, which after the dot product may result in infinite or NaN values.
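The two code paths above can be reproduced without scikit-learn internals. The following is a minimal NumPy sketch (all names such as `Kc`, `lambdas`, `alphas` are mine, not the library's): it builds a rank-deficient centered linear Gram matrix, then compares the `fit_transform`-style shortcut against the `transform`-style "scale by `1 / sqrt(lambda)` and take a dot product" path on the training data itself.

```python
import numpy as np

rng = np.random.default_rng(0)
# Training data confined to a 2-D subspace of R^4, so the centered
# linear Gram matrix is rank-deficient and has zero eigenvalues.
X = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))

n = X.shape[0]
H = np.eye(n) - np.ones((n, n)) / n       # centering matrix
Kc = H @ (X @ X.T) @ H                    # centered linear Gram matrix

lambdas, alphas = np.linalg.eigh(Kc)
lambdas[lambdas < 1e-10] = 0              # treat round-off as exact zeros

# fit_transform-style shortcut: finite (zero columns for zero eigenvalues)
X_ft = alphas * np.sqrt(lambdas)

# transform-style path on the training data: dividing by sqrt(0) creates
# inf/nan entries in the scaled eigenvectors, which then propagate
# through the dot product.
with np.errstate(divide='ignore', invalid='ignore'):
    X_t = Kc @ (alphas / np.sqrt(lambdas))

print(np.isfinite(X_ft).all())  # True: the shortcut stays well-behaved
print(np.isnan(X_t).any())      # True: the transform path produces NaNs
```

The centered Gram matrix's rows have mixed signs, so the dot product against a column of `±inf` values mixes `+inf` and `-inf`, which yields NaN.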
To fix this issue, I guess that we should not scale the eigenvectors when the eigenvalue is zero.
There are two ways to do this:
- either do this in the `transform` method, for example:
```python
def transform(self, X):
    """Transform X.

    Parameters
    ----------
    X : array-like, shape (n_samples, n_features)

    Returns
    -------
    X_new : array-like, shape (n_samples, n_components)
    """
    check_is_fitted(self, 'X_fit_')

    # Compute centered gram matrix between X and training data X_fit_
    K = self._centerer.transform(self._get_kernel(X, self.X_fit_))

    # scale eigenvectors
    scaled_alphas = self.alphas_ / np.sqrt(self.lambdas_)
    # properly take null-space into account for the dot product
    scaled_alphas[:, self.lambdas_ == 0] = 0

    # Project by doing a scalar product between K and the scaled eigenvects
    return np.dot(K, scaled_alphas)
```

- or we could scale `self.alphas_` directly when they are created (at the end of `_fit`), in which case we would need to adapt `fit_transform` and `transform`.
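To check that the first option actually restores consistency, here is a standalone NumPy sketch of the same masking idea (again outside scikit-learn, with variable names of my own choosing): after zeroing the null-space columns of the scaled eigenvectors, the `transform`-style projection of the training data matches the `fit_transform`-style shortcut exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
# Rank-deficient training data, as in the issue description
X = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))

n = X.shape[0]
H = np.eye(n) - np.ones((n, n)) / n       # centering matrix
Kc = H @ (X @ X.T) @ H                    # centered linear Gram matrix

lambdas, alphas = np.linalg.eigh(Kc)
lambdas[lambdas < 1e-10] = 0              # treat round-off as exact zeros

# Proposed fix: scale the eigenvectors, then zero out null-space columns
with np.errstate(divide='ignore', invalid='ignore'):
    scaled_alphas = alphas / np.sqrt(lambdas)
scaled_alphas[:, lambdas == 0] = 0

X_transform = Kc @ scaled_alphas             # 'transform' path on training data
X_fit_transform = alphas * np.sqrt(lambdas)  # 'fit_transform' shortcut

print(np.allclose(X_transform, X_fit_transform))  # True: the two paths agree
```

The agreement follows from the eigenvector identity `Kc @ v = lambda * v`: for a nonzero eigenvalue, `Kc @ (v / sqrt(lambda)) = sqrt(lambda) * v`, which is exactly the shortcut's column; for a zero eigenvalue, both paths produce a zero column.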