Add svd_flip (#6599) #6613

Merged
TomAugspurger merged 1 commit into dask:master from eric-czech:svd_flip on Sep 10, 2020

Conversation

@eric-czech (Contributor) commented Sep 9, 2020

  • Tests added / passed
  • Passes black dask / flake8 dask

This adds the function mentioned in #6599.

Notes on it:

  • Sign correction in SVD is a heuristic process, though there seems to be some consensus that an idealized correction identifies eigenvector directions that point in the direction of the data vectors [1] (this gets cited in a lot of Stack Exchange threads / GitHub issues)
  • The "use the sign of the largest absolute value in a singular vector" approach taken in scikit-learn seems to be fairly common and is convenient when you want deterministic but arbitrary behavior (i.e. a data-independent heuristic)
  • I don't think this needs to be a meaningful correction, just a deterministic one
  • The scikit-learn svd_flip function allows the correction to be based on U or V, and as best I can tell this is only relevant when you're running SVD on transposed inputs in the first place. I don't see any reason to need that, and it appears to be used rarely, so I simply left out the option to correct based on V instead of U.
  • The trivial dask implementation of sklearn's svd_flip replaces np.sign(u[max_abs_cols, range(u.shape[1])]) with np.sign(u.vindex[max_abs_cols, range(u.shape[1])]), where vindex is used for fancy indexing. I found this to be unacceptably slow, though. Instead, a similarly arbitrary but more efficient correction is to make sure all singular vectors fall in the same half-space as any arbitrarily chosen vector (see the sketch after this list). I can't find an authoritative citation for this, but it's a common-sense approach. Here is a notebook that compares the two approaches on equally sized short-fat or tall-skinny arrays and contains benchmark results: [benchmark screenshot, Sep 9, 2020]
    • svd_flip1 = scikit-learn svd_flip adapted to dask arrays
    • svd_flip2 = the implementation in this PR
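For reference, here is a minimal sketch of that half-space correction (the name svd_flip_sketch and the choice of the all-ones reference vector are illustrative assumptions, not necessarily the exact code in this PR):

import dask.array as da

def svd_flip_sketch(u, v):
    # Dot each left singular vector (column of u) with the all-ones
    # vector; a sum across rows is that matrix-vector multiplication.
    signs = da.sign(u.sum(axis=0))
    signs = da.where(signs == 0, 1, signs)  # guard against an exactly-zero sum
    # Flip each column of u together with the matching row of v, so the
    # reconstruction u @ diag(s) @ v is unchanged.
    return u * signs, v * signs[:, None]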

Interestingly, the transposition needed to make SVD work for short-fat arrays (#6591) has almost no effect at all. The scikit-learn correction using vindex on dask arrays nearly doubles the time the whole SVD + correction routine takes to run, while the approach here, which uses a matrix-vector multiplication instead (as a sum across rows), adds a fairly negligible amount of time.

tl;dr This method is efficient, but I still don't know quite where to put it. I would prefer for it to be controlled by an option in the linalg.svd signature that is set to True by default, though that isn't going to make sense until #3576 is done. It will also mean that the combination full_matrices=True, correct_signs=True isn't a valid one. For now, I put the function in array.utils and the test in test_linalg.py, which is a little weird, but it seemed to make the most sense there alongside the other SVD tests.

@eric-czech force-pushed the svd_flip branch 2 times, most recently from 1f3e58c to d6a0b4a on September 9, 2020 19:07
@eric-czech (Contributor, Author) commented Sep 10, 2020

Another reason this could be useful is that singular vectors currently have different signs depending on the chunking:

import numpy as np
import dask.array as da
rs = np.random.RandomState(1)
x = rs.random(size=(8, 4))

u, s, v = da.linalg.svd(da.asarray(x, chunks=(4, 4)))  # two 4-row chunks
print(u.compute().round(1))
[[ 0.3 -0.1  0.2 -0.4]
 [ 0.1  0.3 -0.3 -0.2]
 [ 0.3  0.4 -0.5  0.3]
 [ 0.4  0.4  0.5 -0.4]
 [ 0.3 -0.2 -0.1  0.3]
 [ 0.5  0.  -0.3 -0.3]
 [ 0.4 -0.7 -0.1  0.1]
 [ 0.3  0.2  0.5  0.6]]

u, s, v = da.linalg.svd(da.asarray(x, chunks=(1, 4)))  # eight 1-row chunks
print(u.compute().round(1))
[[-0.3  0.1 -0.2 -0.4]
 [-0.1 -0.3  0.3 -0.2]
 [-0.3 -0.4  0.5  0.3]
 [-0.4 -0.4 -0.5 -0.4]
 [-0.3  0.2  0.1  0.3]
 [-0.5 -0.   0.3 -0.3]
 [-0.4  0.7  0.1  0.1]
 [-0.3 -0.2 -0.5  0.6]]
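
Reusing the svd_flip_sketch helper sketched earlier in the thread (hypothetical usage), the two chunkings become comparable after correction:

u1, s1, v1 = da.linalg.svd(da.asarray(x, chunks=(4, 4)))
u2, s2, v2 = da.linalg.svd(da.asarray(x, chunks=(1, 4)))
u1, v1 = svd_flip_sketch(u1, v1)
u2, v2 = svd_flip_sketch(u2, v2)
# True unless some column of u sums to ~0:
print(np.allclose(u1.compute(), u2.compute()))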

This would also come up in #6616 when comparing chunked vs. un-chunked results.

In any case, if/when this goes through, I would happily integrate it into dask's SVD as part of #6616. Let me know if you have any concerns or objections @mrocklin / @TomAugspurger.

Update

FYI, svd_compressed also has this same behavior:

import numpy as np
import dask.array as da
rs = np.random.RandomState(1)
x = rs.random(size=(8, 4))

u, s, v = da.linalg.svd_compressed(da.asarray(x, chunks=(6, 3)), k=4, n_power_iter=10, compute=True, seed=1)  # chunked in both dimensions
print(u.compute().round(1))
[[ 0.3  0.1 -0.3  0.5]
 [ 0.1 -0.3  0.4 -0.4]
 [ 0.3 -0.4  0.5 -0.1]
 [ 0.4 -0.4 -0.5 -0.1]
 [ 0.3  0.2  0.   0.3]
 [ 0.5 -0.   0.3  0.4]
 [ 0.4  0.7  0.1 -0.5]
 [ 0.3 -0.2 -0.5 -0.4]]

u, s, v = da.linalg.svd_compressed(da.asarray(x, chunks=(6, 4)), k=4, n_power_iter=10, compute=True, seed=1)  # a single column chunk
print(u.compute().round(1))
[[-0.3 -0.1 -0.2  0.9]
 [-0.1  0.3  0.3  0.1]
 [-0.3  0.4  0.5  0.2]
 [-0.4  0.4 -0.5 -0.2]
 [-0.3 -0.2  0.1  0. ]
 [-0.5  0.   0.4 -0.3]
 [-0.4 -0.7  0.1 -0.2]
 [-0.3  0.2 -0.5 -0.2]]

@TomAugspurger (Member)

Thanks for the PR and the analysis. The changes here look great.

@eric-czech when do you think a hypothetical correct_signs=False would be useful? Just for performance?

@eric-czech (Contributor, Author) commented Sep 10, 2020

> @eric-czech when do you think a hypothetical correct_signs=False would be useful? Just for performance?

That's about the only good reason I can think of. Hypothetically, it might also be useful to turn the correction off if somebody wanted to try to support the full_matrices=True behavior of np.linalg.svd, since this sign correction wouldn't handle that. The sklearn svd_flip function breaks in that case too (as we saw in dask/dask-ml#732), but I can see how this svd_flip function could be extended to work in that scenario if it's ever necessary.

EDIT

Oh, also: there is a chance somebody might like to run SVD on transposed inputs as a minor performance improvement (to avoid a transposition, as they do in scikit-learn), but as far as I know there is no reason they couldn't simply transpose the inputs and use the right singular vectors instead of the left ones. I believe that's why the u_based_decision flag exists in sklearn's svd_flip, and the equivalent here would be to set correct_signs=False and then run svd_flip yourself with u_based_decision=False. That's one more potential reason for the option, but I would also call it a performance concern.
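
For concreteness, a sketch of that transpose trick, assuming a short-fat input x (shapes and chunking here are illustrative):

import dask.array as da

x = da.random.random((4, 100), chunks=(4, 25))  # short-fat input
# x = u @ diag(s) @ v implies x.T = v.T @ diag(s) @ u.T, so run SVD on
# the tall-and-skinny transpose and swap/transpose the factors:
ut, st, vt = da.linalg.svd(x.T)
u, s, v = vt.T, st, ut.T  # now x ≈ u @ diag(s) @ v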

@TomAugspurger (Member)

OK, thanks! I don't have a strong opinion, but right now my preferences align with yours (on by default, with a keyword to disable).

Let's revisit that once the other PRs and full_matrices issues are resolved.

@TomAugspurger TomAugspurger merged commit 3327b2e into dask:master Sep 10, 2020
kumarprabhu1988 pushed a commit to kumarprabhu1988/dask that referenced this pull request Oct 29, 2020
