new feature: add LOBPCG solver to Truncated PCA #12080

@lobpcg

Description

TruncatedSVD in https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/decomposition/truncated_svd.py currently supports only two solvers:

algorithm : string, default = "randomized"
    SVD solver to use. Either "arpack" for the ARPACK wrapper in SciPy
    (scipy.sparse.linalg.svds), or "randomized" for the randomized
    algorithm due to Halko (2009).

while LOBPCG (https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.lobpcg.html) is already used elsewhere in scikit-learn, namely in SpectralEmbedding (http://scikit-learn.org/stable/modules/generated/sklearn.manifold.SpectralEmbedding.html) and spectral_clustering (http://scikit-learn.org/stable/modules/generated/sklearn.cluster.spectral_clustering.html).

A lobpcg_svd solver has been added in #12079.

The LOBPCG solver has been added to Truncated PCA in
https://github.com/lobpcg/scikit-learn/commit/ec61eef358cc8b9b6553cdeedd6b809246e9c716
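
For readers unfamiliar with the idea, here is a minimal sketch of how a truncated SVD can be obtained from scipy.sparse.linalg.lobpcg. It is not the lobpcg_svd code from #12079 or the linked commit; it only assumes the standard approach of running LOBPCG on the Gram matrix X^T X and taking square roots of the eigenvalues. The matrix sizes, tolerance, and variable names are illustrative choices, and the result is checked against the existing ARPACK path (scipy.sparse.linalg.svds).

```python
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import lobpcg, svds

rng = np.random.RandomState(42)
n_samples, n_features, k = 200, 50, 5  # illustrative sizes, not from the issue
X = sparse_random(n_samples, n_features, density=0.1,
                  format="csr", random_state=rng)

# Gram matrix X^T X: its eigenvalues are the squared singular values of X
# and its eigenvectors are the right singular vectors of X.
gram = X.T @ X

# LOBPCG iterates on a block of k vectors; start from a random block.
V0 = rng.standard_normal((n_features, k))
eigvals, V = lobpcg(gram, V0, largest=True, tol=1e-6, maxiter=200)

# Sort in descending order and convert eigenvalues to singular values.
# Clipping guards against tiny negative values caused by round-off.
order = np.argsort(eigvals)[::-1]
sigma = np.sqrt(np.clip(eigvals[order], 0.0, None))
components = V[:, order].T  # shape (k, n_features), like components_

# Reference: the ARPACK-based solver already used by TruncatedSVD.
sigma_ref = np.sort(svds(X, k=k, return_singular_vectors=False))[::-1]

print(sigma)
print(sigma_ref)
```

Forming X^T X explicitly is only reasonable when n_features is small; for large problems one would presumably apply X and X^T as operators instead, which this sketch does not attempt.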

Steps/Code to Reproduce

from sklearn.decomposition import TruncatedSVD
from sklearn.random_projection import sparse_random_matrix

X = sparse_random_matrix(100, 100, density=0.01, random_state=42)

# Baseline: the default "randomized" solver.
svd = TruncatedSVD(n_components=5, n_iter=7, random_state=42)
svd.fit(X)
print(svd.explained_variance_ratio_)
print(svd.explained_variance_ratio_.sum())
print(svd.singular_values_)

# Proposed: the "lobpcg" solver (requires the commit linked above).
svd = TruncatedSVD(algorithm='lobpcg', n_components=5, n_iter=7, random_state=42)
svd.fit(X)
print(svd.explained_variance_ratio_)
print(svd.explained_variance_ratio_.sum())
print(svd.singular_values_)

Expected Results

For large problems, LOBPCG is expected to outperform both the ARPACK and "randomized" solvers for Truncated PCA; see, e.g., the comments at https://www.mathworks.com/matlabcentral/fileexchange/48-lobpcg-m

Actual Results

algorithm='lobpcg' is not supported by TruncatedSVD.

Versions

All
