Hello!
I've been kicking the tires on Dask lately, as it may be quite useful for me (thanks for the tip, T.J.), but I have a question that doesn't seem to be answered by the documentation or the code, at least as far as I can find.
Suppose I want to perform some linear algebra operation (e.g. dask.array.dot) on a very large array that is stored in many CSV files. Is this currently possible?
I know that I can use dask.dataframe to create a DAG for reading the CSV files into a dataframe, but as far as I can tell there is no way to convert the dataframe to an array on which I can perform the inner product.
I also know that I can create a DAG for reading arrays directly from (for example) an HDF5 file, but that is not how the data are stored.
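For concreteness, the computation in question can be sketched in plain Python with no Dask involved. This is only an illustration of the desired result, not a proposed implementation: the file names and the helpers `iter_rows`, `matvec_from_csv`, and `write_demo_files` are hypothetical, and the CSV files are assumed to hold purely numeric rows with no header. The sketch streams one row at a time, so the full array is never held in memory, which is the property a chunked Dask graph would need to preserve.

```python
import csv
import glob
import os
import tempfile


def iter_rows(paths):
    """Yield each CSV row as a list of floats, one file at a time."""
    for path in paths:
        with open(path, newline="") as handle:
            for row in csv.reader(handle):
                if row:  # skip blank lines
                    yield [float(value) for value in row]


def matvec_from_csv(paths, vector):
    """Compute A @ vector where the rows of A are spread over several CSV files."""
    result = []
    for row in iter_rows(paths):
        if len(row) != len(vector):
            raise ValueError("row length does not match vector length")
        # Inner product of one row of A with the vector.
        result.append(sum(a * b for a, b in zip(row, vector)))
    return result


def write_demo_files(directory):
    """Write two small hypothetical CSV parts and return their sorted paths."""
    parts = {"part-0.csv": [[1, 2], [3, 4]], "part-1.csv": [[5, 6]]}
    for name, rows in parts.items():
        with open(os.path.join(directory, name), "w", newline="") as handle:
            csv.writer(handle).writerows(rows)
    return sorted(glob.glob(os.path.join(directory, "part-*.csv")))


with tempfile.TemporaryDirectory() as tmp:
    demo_paths = write_demo_files(tmp)
    demo_result = matvec_from_csv(demo_paths, [1.0, 10.0])
```

Each file here plays the role of one block of rows. The question is whether Dask can express the same thing as a graph over those blocks, rather than as a sequential loop.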
If this isn't currently possible, I'd be happy to work on solving the problem, but I wanted to make sure I wasn't reinventing the wheel.
Thanks!
Scott