KernelPCA: `fit_transform` and `transform` methods are inconsistent in case of zero eigenvalues #12141

In the current implementation of KernelPCA, when zero eigenvalues are kept (`remove_zero_eig=False`), `fit_transform` and `fit` followed by `transform` give inconsistent results.

 * When `fit_transform` is run, an optimization avoids recomputing the Gram matrix, so the transformed X is computed directly as `X_transformed = self.alphas_ * np.sqrt(self.lambdas_)` ([here](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/decomposition/kernel_pca.py#L278)).
 * When `transform` is run, it cannot use the same shortcut, so the Gram matrix is recomputed and a dot product is performed ([here](https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/decomposition/kernel_pca.py#L299)). Because the eigenvectors `self.alphas_` are not stored in scaled form (I do not know why), they are scaled at this point by dividing by the square roots of the eigenvalues, `np.sqrt(self.lambdas_)`. When an eigenvalue is zero, **this produces infinite values in the scaled eigenvectors**, which after the dot product may yield infinite or NaN values.
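
The mismatch between the two code paths can be reproduced with plain NumPy, without going through scikit-learn. The snippet below is only a sketch: it assumes a linear kernel on random data, centers the Gram matrix by hand, and forces numerically tiny eigenvalues to exactly zero to mimic the problematic case. The names `via_fit_transform` and `via_transform` are illustrative and do not exist in the library.

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(5, 2)

# Linear-kernel Gram matrix, centered by hand (rank at most 2 here)
n = X.shape[0]
H = np.eye(n) - np.ones((n, n)) / n
Kc = H @ (X @ X.T) @ H

# Eigendecomposition, sorted by decreasing eigenvalue
lambdas, alphas = np.linalg.eigh(Kc)
order = np.argsort(lambdas)[::-1]
lambdas, alphas = lambdas[order], alphas[:, order]

# Assumption for the demo: treat numerically tiny eigenvalues as exact zeros
lambdas[lambdas < 1e-10] = 0.0
nonzero = lambdas > 0

# Shortcut used by fit_transform: no division, so always finite
via_fit_transform = alphas * np.sqrt(lambdas)

# Path used by transform: divides by sqrt(0) for the null-space columns
with np.errstate(divide="ignore", invalid="ignore"):
    via_transform = Kc @ (alphas / np.sqrt(lambdas))

print(np.isfinite(via_fit_transform).all())
print(np.isfinite(via_transform).all())
print(np.allclose(via_fit_transform[:, nonzero], via_transform[:, nonzero]))
```

The two paths agree on the components with non-zero eigenvalues and differ only on the null-space columns, where the division by zero occurs.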

To fix this issue, I guess that we should not scale an eigenvector when its eigenvalue is zero.

There are two ways to do this:
 - either do this in the `transform` method, for example:

```python
def transform(self, X):
    """Transform X.

    Parameters
    ----------
    X : array-like, shape (n_samples, n_features)

    Returns
    -------
    X_new : array-like, shape (n_samples, n_components)
    """
    check_is_fitted(self, 'X_fit_')

    # Compute the centered Gram matrix between X and the training data X_fit_
    K = self._centerer.transform(self._get_kernel(X, self.X_fit_))

    # Scale the eigenvectors only where the eigenvalue is non-zero;
    # null-space columns stay at zero, so nothing is ever divided by zero
    non_zeros = np.flatnonzero(self.lambdas_)
    scaled_alphas = np.zeros_like(self.alphas_)
    scaled_alphas[:, non_zeros] = (self.alphas_[:, non_zeros]
                                   / np.sqrt(self.lambdas_[non_zeros]))

    # Project by taking the dot product of K with the scaled eigenvectors
    return np.dot(K, scaled_alphas)
```

 - or scale `self.alphas_` directly when they are created (at the end of `_fit`), in which case we would need to adapt `fit_transform` and `transform` accordingly.
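
The second option could look roughly like the sketch below. `scale_eigenvectors` is a hypothetical helper, not an existing scikit-learn function, and the eigenvectors and eigenvalues are built by hand so that the Gram matrix has a known null space. The idea is that the scaling is done once, with null-space columns set to zero, and both methods would then project with the same scaled matrix.

```python
import numpy as np


def scale_eigenvectors(alphas, lambdas):
    """Hypothetical helper: divide each eigenvector by sqrt(eigenvalue),
    leaving the columns with a zero eigenvalue at zero."""
    scaled = np.zeros_like(alphas)
    nonzero = lambdas > 0
    scaled[:, nonzero] = alphas[:, nonzero] / np.sqrt(lambdas[nonzero])
    return scaled


# Toy setup (assumption): orthonormal eigenvectors and two zero eigenvalues
rng = np.random.RandomState(0)
alphas, _ = np.linalg.qr(rng.randn(4, 4))
lambdas = np.array([3.0, 1.0, 0.0, 0.0])
K = alphas @ np.diag(lambdas) @ alphas.T  # Gram matrix with these eigenpairs

# Done once, at the end of fitting
scaled_alphas = scale_eigenvectors(alphas, lambdas)

# What transform would compute on the training data
projected = K @ scaled_alphas

# What the fit_transform shortcut computes today
shortcut = alphas * np.sqrt(lambdas)

print(np.allclose(projected, shortcut))
```

With the scaling stored at fit time, the projection of the training data matches the existing `fit_transform` shortcut, including on the null-space components, which are zero in both cases.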



