In the current KernelPCA _fit_transform implementation, nothing prevents the eigenvalue decomposition from running into numerical or conditioning issues, and nothing alerts users when it does. We could check the following (thanks https://github.com/GabrielRilling for the suggestion!):
- significant imaginary parts in eigenvalues (raise ValueError)
- significant negative eigenvalues (issue a KernelWarning if there is at least one positive eigenvalue, otherwise raise ValueError)
- significant conditioning issues, i.e. a huge ratio (> 1e12) between the largest and smallest eigenvalues (issue a KernelWarning)
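To make the proposal concrete, here is a rough sketch of what the three checks could look like. It is not the actual scikit-learn code: the function name `check_kernel_eigenvalues`, the local `KernelWarning` class, and the relative tolerances `imag_rtol` and `neg_rtol` are all assumptions made for illustration. Only the 1e12 ratio comes from the list above. The sketch assumes a non-empty array of eigenvalues.

```python
import warnings

import numpy as np


class KernelWarning(UserWarning):
    """Hypothetical warning category standing in for the proposed one."""


def check_kernel_eigenvalues(lambdas, imag_rtol=1e-5, neg_rtol=1e-5, max_cond=1e12):
    """Raise or warn on significant issues in kernel eigenvalues.

    All names and the two *_rtol thresholds are illustrative assumptions.
    """
    lambdas = np.asarray(lambdas)

    # 1. Significant imaginary parts: error.
    if np.iscomplexobj(lambdas):
        max_imag = np.abs(lambdas.imag).max()
        max_real = np.abs(lambdas.real).max()
        if max_imag > imag_rtol * max_real:
            raise ValueError("Eigenvalues have significant imaginary parts.")

    real = lambdas.real
    max_eig, min_eig = real.max(), real.min()

    # 2. Negative eigenvalues: error if nothing is positive, else warn
    #    when the most negative one is significant relative to the largest.
    if min_eig < 0:
        if max_eig <= 0:
            raise ValueError("No positive eigenvalue found.")
        if -min_eig > neg_rtol * max_eig:
            warnings.warn("Significant negative eigenvalues.", KernelWarning)

    # 3. Conditioning: ratio between largest and smallest positive eigenvalue.
    positive = real[real > 0]
    if positive.size and positive.max() > max_cond * positive.min():
        warnings.warn("Badly conditioned kernel eigenvalues.", KernelWarning)
```

Relative thresholds are used here so the checks do not depend on the scale of the kernel matrix; whether relative or absolute tolerances are preferable is an open design question.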
We should also perform some cleaning for non-significant issues (due to numerical approximation) and for the above when no error is raised:
- remove insignificant imaginary parts
- set negative eigenvalues to zero
- set extremely small eigenvalues (with respect to the largest ones) to zero
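The cleaning steps could be sketched roughly as follows. Again this is only an illustration: `clean_kernel_eigenvalues` and the `small_rtol` parameter are hypothetical names, and the 1e-12 default is an assumption chosen to mirror the 1e12 ratio mentioned earlier. The function assumes any error-worthy issues have already been ruled out and that the input is non-empty.

```python
import numpy as np


def clean_kernel_eigenvalues(lambdas, small_rtol=1e-12):
    """Return a cleaned, real-valued copy of the eigenvalues.

    Hypothetical helper; small_rtol is an assumed relative threshold.
    """
    # Drop imaginary parts (assumed insignificant at this point) and copy.
    cleaned = np.real(np.asarray(lambdas)).astype(float)

    # Set negative eigenvalues to zero.
    cleaned[cleaned < 0] = 0.0

    # Set eigenvalues that are tiny relative to the largest one to zero.
    max_eig = cleaned.max()
    cleaned[cleaned < small_rtol * max_eig] = 0.0
    return cleaned


# Example: tiny imaginary part, small negative value, and a negligible
# positive value are all cleaned away.
print(clean_kernel_eigenvalues(np.array([4.0 + 1e-18j, -1e-9, 1e-15, 2.0])))
```

The input array is left unmodified, so the caller can still inspect the raw eigenvalues if needed.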
This will provide more robust and stable numerical computation across runs/platforms/noise.