Feature Inquiry/Request #887

@sscondie

Hello!

I've been kicking the tires on Dask lately, as it may be quite useful for me (thanks for the tip, T.J.), but I have a question that, as far as I can find, isn't answered by the documentation or the code.

Suppose I want to perform some linear algebra operation (e.g. dask.array.dot) on a very large array that is stored in many CSV files. Is this currently possible?

I know that I can use dask.dataframe to create a DAG for reading the CSV files into a dataframe, but as far as I can tell there is no way to convert that dataframe to an array on which I can perform the inner product.

I also know that I can create a DAG for reading arrays directly from (for example) an HDF5 file, but that is not how the data are stored.
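
For concreteness, the kind of computation in question can be sketched block by block in plain NumPy. This is only an illustration under stated assumptions: the in-memory strings stand in for the CSV files, the names `csv_parts`, `load_block`, and `blockwise_dot` are hypothetical, and nothing here uses a Dask API (whether one exists for this is exactly the question).

```python
import io

import numpy as np

# Hypothetical stand-ins for the CSV files on disk: each part holds a block
# of rows of one large matrix X, preceded by a header line.
csv_parts = [
    "a,b\n1,2\n3,4\n",
    "a,b\n5,6\n",
]


def load_block(text):
    """Parse one CSV part into a 2-D float array, skipping the header."""
    return np.loadtxt(io.StringIO(text), delimiter=",", skiprows=1, ndmin=2)


def blockwise_dot(parts, v):
    """Compute X @ v one row block at a time, so X is never fully in memory."""
    return np.concatenate([load_block(p) @ v for p in parts])


v = np.array([1.0, -1.0])
result = blockwise_dot(csv_parts, v)
print(result)
```

Each file contributes an independent piece of the result, which is the structure a task graph over many CSV files would need to capture.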

If this isn't currently possible, I'd be happy to work on solving the problem, but I wanted to make sure I wasn't reinventing the wheel.

Thanks!
Scott
