new feature: add LOBPCG solver to Truncated PCA #12080
Description
TruncatedSVD (https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/decomposition/truncated_svd.py) currently supports only two solvers, as its docstring states:
algorithm : string, default = "randomized"
SVD solver to use. Either "arpack" for the ARPACK wrapper in SciPy
(scipy.sparse.linalg.svds), or "randomized" for the randomized
algorithm due to Halko (2009).
LOBPCG (https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.lobpcg.html) is not among them, even though scikit-learn already uses it in SpectralEmbedding (http://scikit-learn.org/stable/modules/generated/sklearn.manifold.SpectralEmbedding.html) and spectral_clustering (http://scikit-learn.org/stable/modules/generated/sklearn.cluster.spectral_clustering.html).
A lobpcg_svd solver has been added in #12079.
The LOBPCG solver is added to Truncated PCA in
https://github.com/lobpcg/scikit-learn/commit/ec61eef358cc8b9b6553cdeedd6b809246e9c716
Steps/Code to Reproduce
from sklearn.decomposition import TruncatedSVD
from sklearn.random_projection import sparse_random_matrix

# Small sparse test matrix
X = sparse_random_matrix(100, 100, density=0.01, random_state=42)

# Baseline: default "randomized" solver
svd = TruncatedSVD(n_components=5, n_iter=7, random_state=42)
svd.fit(X)
print(svd.explained_variance_ratio_)
print(svd.explained_variance_ratio_.sum())
print(svd.singular_values_)

# Proposed: algorithm='lobpcg' (not supported upstream yet)
svd = TruncatedSVD(algorithm='lobpcg', n_components=5, n_iter=7, random_state=42)
svd.fit(X)
print(svd.explained_variance_ratio_)
print(svd.explained_variance_ratio_.sum())
print(svd.singular_values_)
Expected Results
LOBPCG is expected to outperform both the ARPACK and "randomized" solvers for Truncated PCA on large problems; see, e.g., the comments at https://www.mathworks.com/matlabcentral/fileexchange/48-lobpcg-m
Actual Results
algorithm='lobpcg' is not supported.
Versions
All