Amg arpack workaround fix by amueller · Pull Request #14647 · scikit-learn/scikit-learn

amueller · 2019-08-13T21:29:26Z

Fixes #10715.
See discussion here:
#10720 (comment)

Further discussion by @glemaitre and @lesteve and @sky88088 and @lobpcg here:
#10715 (comment)

This logic can probably be simplified further but this is a fix for an obvious logic bug.

Right now, in the first case of the "try", if it succeeds and eigen_solver=="amg", the result is discarded, a different (apparently less suitable, according to the heuristic) solver is used, and the sign of the laplacian has already been flipped.
This is clearly not the intention of the code. If the try succeeds, these results should be used.

FYI, the tolerances in the checks were way larger than any of the values.

amueller · 2019-08-13T21:36:26Z

Lol after debugging and fixing this and then reading #10715 it's obvious that @sky88088 diagnosed the problem very accurately in his initial post.

lobpcg · 2019-08-13T22:25:21Z

You may also want to completely remove the sign flip in the Laplacian

jnothman

The preceding if condition also looks dodgy. It is

(eigen_solver == 'arpack' or eigen_solver != 'lobpcg' and
       (not sparse.isspmatrix(laplacian) or n_nodes < 5 * n_components))

I think this is interpreted as

(
   (eigen_solver == 'arpack' or eigen_solver != 'lobpcg')
 and
   (not sparse.isspmatrix(laplacian) or n_nodes < 5 * n_components)):

If this is the correct interpretation then the first clause can be simplified to eigen_solver != 'lobpcg'

lobpcg · 2019-08-14T02:20:00Z

For speed and simplicity, you may want just to call a dense solver if n_nodes < 5 * n_components no matter what the other conditions are.

glemaitre · 2019-08-14T09:16:42Z

The conditions are really complicated to follow. Recalling and trusting my old self (probably I should not do that) #10715 (comment), we could simplify with something like:

if not sparse.issparse(laplacian) and eigen_solver == 'amg':
    # warn that we switch to arpack
    eigen_solver = 'arpack'

if eigen_solver == 'arpack':
    # solve the problem
    try:
        ...
    except RuntimeError:
        eigen_solver = 'lobpcg'

# from that point eigen_solver will be either amg or lobpcg
# check if we have enough n_nodes
if n_nodes < 5 * n_components:
    # use eigh
    ...
else:
    # check whether we should precondition
    if eigen_solver == 'amg':
        # build the preconditioner and get M
        ...
    # call lobpcg with M if available, otherwise use the default
    ...

Anyway, we should probably merge the regression test and later refactor the code to be readable.

glemaitre · 2019-08-14T09:21:32Z

sklearn/manifold/spectral_embedding_.py

            raise ValueError

-    elif eigen_solver == "lobpcg":
+    if eigen_solver == "lobpcg":


As a side note, codecov is reporting that lobpcg is never used in our tests:

https://codecov.io/gh/scikit-learn/scikit-learn/compare/92af3dabbb5f3381a656f7727171f332b8928e05...247e9c6029c37a68b279c91908f650a9c0173a39/src/sklearn/manifold/spectral_embedding_.py#L318

amueller · 2019-08-14T14:56:18Z

@lobpcg dense solver meaning arpack?

amueller · 2019-08-14T14:59:52Z

@glemaitre your code has the same issue I'm fixing here. There is no return anywhere, so "# from that point eigen_solver will be either amg or lobpcg" is not true?
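A minimal trace of the objection above, assuming the sketched control flow is taken literally. `run_sketch` and all of its arguments are hypothetical names used only for illustration; no real solver is called, the function just records which branches would execute.

```python
def run_sketch(eigen_solver, is_sparse, arpack_succeeds, n_nodes, n_components):
    """Record which solver steps the sketched control flow would run."""
    steps = []
    if not is_sparse and eigen_solver == "amg":
        eigen_solver = "arpack"

    if eigen_solver == "arpack":
        if arpack_succeeds:
            steps.append("arpack")
        else:
            # Stand-in for the `except RuntimeError` branch of the sketch.
            eigen_solver = "lobpcg"

    # There is no early return above, so this block also runs
    # after a successful arpack solve.
    if n_nodes < 5 * n_components:
        steps.append("eigh")
    else:
        steps.append("lobpcg+amg" if eigen_solver == "amg" else "lobpcg")
    return steps


print(run_sketch("arpack", True, True, n_nodes=100, n_components=2))
# ['arpack', 'lobpcg']
```

In this trace a successful arpack solve is still followed by a second solve, which is the fall-through being pointed out.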

amueller · 2019-08-14T15:00:49Z

Ok I also really don't understand the sign flip tbh.

lobpcg · 2019-08-14T15:05:52Z

> @lobpcg dense solver meaning arpack?

dense solver meaning scipy.linalg.eigh or numpy.linalg.eigh (unsure about the difference), also turning a sparse Laplacian into dense if needed. The point being that if n_nodes < 5 * n_components it is likely cheaper to compute all eigenvectors and drop some.
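A small sketch of the dense fallback described above, under the assumption that the Laplacian is symmetric. The helper name `smallest_eigenpairs_dense` and the path-graph test matrix are illustrative only and are not part of scikit-learn.

```python
import numpy as np
from scipy import sparse
from scipy.linalg import eigh


def smallest_eigenpairs_dense(laplacian, n_components):
    """Compute all eigenpairs densely, then keep the n_components smallest."""
    if sparse.issparse(laplacian):
        laplacian = laplacian.toarray()
    # eigh returns eigenvalues in ascending order for symmetric input.
    eigenvalues, eigenvectors = eigh(laplacian)
    return eigenvalues[:n_components], eigenvectors[:, :n_components]


# Unnormalized Laplacian of a path graph on 6 nodes (illustrative data).
n = 6
main = np.full(n, 2.0)
main[[0, -1]] = 1.0
off = -np.ones(n - 1)
L = sparse.diags([off, main, off], [-1, 0, 1], format="csr")

vals, vecs = smallest_eigenpairs_dense(L, n_components=2)
```

Here n_nodes = 6 is below 5 * n_components = 10, so the full decomposition is tiny and the unwanted eigenvectors are simply sliced off.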

amueller · 2019-08-14T15:06:55Z

@lobpcg? Shouldn't the fallback for arpack just be the standard eigsh, not lobpcg? The benefits of shift-invert mode are not within my expertise, can you comment on the benefits of using that instead of using the standard eigsh?

lobpcg · 2019-08-14T15:11:38Z

> Ok I also really don't understand the sign flip tbh.

laplacian *= -1 in several places makes no sense whatsoever to me.
There used to be a bug in lobpcg, now fixed, for small matrices requiring this, but not any more.

amueller · 2019-08-14T15:14:05Z

@lobpcg did you see the comment above the arpack magic? That explains what's happening. I don't think it's related to lobpcg.

amueller · 2019-08-14T15:19:16Z

There's a bunch of other issues here, but I think this PR fixes one obvious bug and adds one clear regression test and we should merge it. Then we can triage the other bugs.

amueller · 2019-08-14T15:24:56Z

@glemaitre would love to increase coverage but calling lobpcg actually fails, see #13393 (comment)

lobpcg · 2019-08-14T15:25:21Z

> @lobpcg? Shouldn't the fallback for arpack just be the standard eigsh, not lobpcg? The benefits of shift-invert mode are not within my expertise, can you comment on the benefits of using that instead of using the standard eigsh?

For "small" matrices, scipy.sparse.linalg.eigsh is typically the best in all respects. For large matrices:

1. scipy.sparse.linalg.eigsh is the most expensive, scaling cubically, but never fails.
2. arpack shift-and-invert typically scales quadratically, and may fail and miss eigenvalues.
3. lobpcg without amg scales linearly (with the number of nnz in the sparse matrix), but the convergence may be slow. The latest version, "[MRG] multiple stability updates in lobpcg" scipy/scipy#10621, fixes stability issues.
4. lobpcg with amg should scale linearly and converge very fast in most cases, but needs "Fix for spectral clustering error when using 'amg' solver" #13707 to be merged and also may need "[MRG] multiple stability updates in lobpcg" scipy/scipy#10621.

amueller · 2019-08-14T15:28:15Z

(4 should be with amg I think ;)

amueller · 2019-08-14T15:39:18Z

@jnothman `and` binds more tightly than `or`, so it's parsed as

    if (eigen_solver == 'arpack' or (eigen_solver != 'lobpcg' and
       ((not sparse.isspmatrix(laplacian)) or n_nodes < 5 * n_components))):
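The precedence claim can be checked directly. The sketch below uses the standard-library `ast` module to inspect how Python parses `a or b and c`, then evaluates the two readings for the case that separates them. The boolean names are placeholders standing in for the three sub-conditions.

```python
import ast

# Parse the expression and inspect the tree: the outer node is the `or`,
# and the `and` appears as its second operand.
tree = ast.parse("a or b and c", mode="eval").body
top_is_or = isinstance(tree.op, ast.Or)
nested_is_and = isinstance(tree.values[1].op, ast.And)

# Case that separates the two readings: eigen_solver == 'arpack',
# with a sparse laplacian and n_nodes >= 5 * n_components.
is_arpack = True
not_lobpcg = True
dense_or_small = False

as_python_parses = is_arpack or not_lobpcg and dense_or_small    # True
as_other_reading = (is_arpack or not_lobpcg) and dense_or_small  # False
```

So with eigen_solver == 'arpack' the condition holds regardless of sparsity or size, which the other grouping would not give.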

lobpcg · 2019-08-14T15:49:59Z

> (4 should be with amg I think ;)

thanks, fixed

> @lobpcg did you see the comment above the arpack magic? That explains what's happening. I don't think it's related to lobpcg.

Do you mean

            laplacian *= -1
            v0 = random_state.uniform(-1, 1, laplacian.shape[0])
            _, diffusion_map = eigsh(
                laplacian, k=n_components, sigma=1.0, which='LM',
                tol=eigen_tol, v0=v0)

It's just a bad way to call arpack shift-and-invert in this case. A good way is just to change sigma, replacing the above with something like

            v0 = random_state.uniform(-1, 1, laplacian.shape[0])
            _, diffusion_map = eigsh(
                laplacian, k=n_components, sigma=-1e-5, which='LM',
                tol=eigen_tol, v0=v0)

to find the eigenvalues closest to zero. You want sigma to be a bit negative, to avoid LU factorization failures, so that arpack never fails. Above I use a safe choice sigma=-1e-5 that should work in single precision as well (if arpack supports it). If you make sigma even more negative, it should slow down the convergence a bit.
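A self-contained sketch of the slightly negative shift suggested above, checked against a dense solve. The path-graph Laplacian, the sizes `n` and `k`, and the fixed seed are assumptions chosen for illustration; this is not the scikit-learn code path.

```python
import numpy as np
from scipy import sparse
from scipy.linalg import eigh
from scipy.sparse.linalg import eigsh

# Unnormalized Laplacian of a path graph on n nodes (illustrative data).
n, k = 50, 4
main = np.full(n, 2.0)
main[[0, -1]] = 1.0
off = -np.ones(n - 1)
laplacian = sparse.diags([off, main, off], [-1, 0, 1], format="csc")

# Shift-invert around a slightly negative sigma: laplacian - sigma * I is
# positive definite, so the factorization does not hit the zero eigenvalue.
v0 = np.random.RandomState(0).uniform(-1, 1, n)
vals, vecs = eigsh(laplacian, k=k, sigma=-1e-5, which="LM", v0=v0)
vals = np.sort(vals)

# Reference: the k smallest eigenvalues from a dense solver.
dense_vals = eigh(laplacian.toarray(), eigvals_only=True)[:k]
```

With `sigma` set and `which="LM"`, `eigsh` returns the eigenvalues nearest to `sigma`, which here are the ones closest to zero, without negating the matrix first.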

amueller · 2019-08-16T16:09:08Z

I'd still like to see this merged before we go into rewriting the logic.

glemaitre

+1 for introducing the regression test first

jnothman

Please:

- add what's new
- open an issue to finish this clean-up

amueller · 2019-08-21T18:55:56Z

added whatsnew, opened #14713

ogrisel · 2019-08-29T08:54:46Z

sklearn/manifold/tests/test_spectral_embedding.py

    embed_amg = se_amg.fit_transform(S)
    embed_arpack = se_arpack.fit_transform(S)
-    assert _check_with_col_sign_flipping(embed_amg, embed_arpack, 0.05)
+    assert _check_with_col_sign_flipping(embed_amg, embed_arpack, 0.1e-4)


Suggested change

-    assert _check_with_col_sign_flipping(embed_amg, embed_arpack, 0.1e-4)
+    assert _check_with_col_sign_flipping(embed_amg, embed_arpack, 1e-5)
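For context on why the comparison allows sign flips: eigenvectors are only defined up to sign, so two solvers can return columns that differ by a factor of -1. The helper below is a hypothetical stand-in written for illustration, not the test suite's `_check_with_col_sign_flipping`, whose exact implementation is not shown in this thread.

```python
import numpy as np


def equal_up_to_column_sign(A, B, tol):
    """Return True if each column of A matches the same column of B or its negation."""
    for a_col, b_col in zip(A.T, B.T):
        same = np.max(np.abs(a_col - b_col)) <= tol
        flipped = np.max(np.abs(a_col + b_col)) <= tol
        if not (same or flipped):
            return False
    return True


A = np.array([[1.0, 2.0], [3.0, -4.0]])
B = A * np.array([1.0, -1.0])  # negate the second column only

matches = equal_up_to_column_sign(A, B, tol=1e-5)
```

With a tolerance of this size, the check only passes when the embeddings agree closely, which a tolerance larger than the values themselves would not guarantee.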

ogrisel · 2019-08-29T08:59:13Z

sklearn/manifold/tests/test_spectral_embedding.py

+    se_arpack.affinity = "precomputed"
+    embed_amg = se_amg.fit_transform(affinity)
+    embed_arpack = se_arpack.fit_transform(affinity)
+    assert _check_with_col_sign_flipping(embed_amg, embed_arpack, 0.1e-4)


Suggested change

-    assert _check_with_col_sign_flipping(embed_amg, embed_arpack, 0.1e-4)
+    assert _check_with_col_sign_flipping(embed_amg, embed_arpack, 1e-5)

ogrisel

LGTM. Will merge.

amueller added 2 commits August 13, 2019 17:21

this is evil

5efe727

use a meaningful tolerance

f021283

amueller added the Bug label Aug 13, 2019

amueller added 2 commits August 13, 2019 17:29

remove print statement

ee2057a

pep8

247e9c6

amueller mentioned this pull request Aug 13, 2019

A mistake about spectral clustering with amg solver #10715

Closed

jnothman reviewed Aug 13, 2019

glemaitre reviewed Aug 14, 2019

remove unused lambdas

b35085a

whoops

320506d

glemaitre approved these changes Aug 19, 2019

jnothman approved these changes Aug 19, 2019

whatsnew

486b3c4

amueller mentioned this pull request Aug 21, 2019

Simplify spectral clustering solver logic #14713

Open

amueller added 2 commits August 21, 2019 14:56

Merge branch 'master' into amg_arpack_workaround_fix

d228d8d

# Conflicts: # doc/whats_new/v0.22.rst

Merge branch 'master' into amg_arpack_workaround_fix

807a744

ogrisel reviewed Aug 29, 2019

ogrisel approved these changes Aug 29, 2019

ogrisel merged commit c52b6e1 into scikit-learn:master Aug 29, 2019

ogrisel mentioned this pull request Aug 29, 2019

[MAINT] Post #14647 cleanups #14840

Merged

ogrisel added a commit that referenced this pull request Aug 29, 2019

[MAINT] Post #14647 cleanups (#14840)

00ce061

ogrisel mentioned this pull request Jan 3, 2020

test_spectral_embedding_amg_solver_failure random failure #16011

Closed

cmarmo mentioned this pull request Dec 15, 2020

[MRG] change spectral embedding eigen solver from amg to arpack #10720

Closed

This was referenced Feb 12, 2026

The eigsh call in spectral embedding is done wrong and differently from what is described in the comments #33242

Open

ENH Fix the eigsh call in spectral embedding #33262

Open


5 participants
