The current SVD implementation has very unintuitive performance characteristics for small arrays:
```python
import dask.array as da

# Create a short-fat (small) single-chunk array
x = da.random.random((100, 100000), chunks=(-1, -1))
x.nbytes  # 80000000 = 80MB

# Despite x being only 80MB, this uses ~80 **GB** of RAM and
# I have no idea how long it would take to run (at least more than 5 minutes)
da.linalg.svd(x)[0].mean().compute()

# This, on the other hand, runs almost instantly and barely uses any memory
# (the U array is V.T when given x.T)
da.linalg.svd(x.T)[2].T.mean().compute()
```

With #6591 in place, it would make sense to identify single-chunk arrays and transpose them first to avoid this. That still wouldn't stop a user from providing an array with weird chunking, but I think a special case for single-chunk arrays is appropriate.
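The proposed special case can be sketched as a small wrapper (a minimal sketch, not dask's actual implementation; `svd_flipped` is a hypothetical name). It relies on the transpose identity used above: if `x.T = U₂ Σ V₂ᵀ`, then `x = V₂ Σ U₂ᵀ`, so the factors of the transposed problem just swap roles.

```python
import numpy as np
import dask.array as da

def svd_flipped(x):
    # Hypothetical helper (not part of dask's API): special-case
    # single-chunk short-fat arrays by transposing before the SVD.
    # If x.T = U2 @ diag(S) @ V2t, then x = V2t.T @ diag(S) @ U2.T,
    # so the left and right factors swap roles.
    if x.numblocks == (1, 1) and x.shape[0] < x.shape[1]:
        u2, s, v2t = da.linalg.svd(x.T)
        return v2t.T, s, u2.T
    return da.linalg.svd(x)

# Tiny short-fat example: the flipped path reconstructs x exactly.
x = da.random.random((10, 100), chunks=(-1, -1))
u, s, vt = svd_flipped(x)
reconstructed = (u * s).dot(vt).compute()  # u * s == u @ diag(s)
assert np.allclose(reconstructed, x.compute())
```

The chunking check is the important part: for a single-chunk array the transpose is free (a metadata-only operation), so flipping the orientation before the SVD costs nothing while avoiding the pathological path.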