Test that dask collections can hold scipy.sparse arrays#3738

Merged
mrocklin merged 1 commit into dask:master from mrocklin:scipy-sparse
Jul 11, 2018
mrocklin commented Jul 8, 2018

Dask collections can't do much with scipy.sparse arrays because those arrays
don't support the ndarray interface, but it's useful for dask-ml work that we
can produce them and then hand them off to dask.delayed or da.map_blocks workflows.

  • Tests added / passed
  • Passes flake8 dask
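
For illustration, here is a minimal sketch of the workflow described above. It is not the test added in this PR; the array shape, chunking, and sparsity threshold are made-up values. It assumes scipy and dask are installed, builds a Dask array whose blocks are scipy.sparse CSR matrices, and hands those blocks off through dask.delayed.

```python
import numpy as np
import scipy.sparse
import dask
import dask.array as da

# Build a mostly-zero dense array (illustrative values only) and wrap it
# in a Dask array made of four 5x10 row blocks.
rng = np.random.RandomState(0)
dense = rng.random_sample((20, 10))
dense[dense < 0.8] = 0
x = da.from_array(dense, chunks=(5, 10))

# Convert every block to a scipy.sparse CSR matrix. The result is still a
# Dask array object, but its blocks no longer behave like ndarrays.
s = x.map_blocks(scipy.sparse.csr_matrix, dtype=x.dtype)

# Hand the blocks off to dask.delayed: one Delayed object per block.
delayed_blocks = s.to_delayed().ravel()

# Compute the blocks individually, so no sparse concatenation is needed.
sparse_blocks = dask.compute(*delayed_blocks)
```

Computing the blocks one by one sidesteps the step where Dask would stitch block results back into a single array.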

mrocklin commented Jul 8, 2018

cc @ogrisel @TomAugspurger

@mrocklin
Member Author

Merging tomorrow morning if there are no comments.

@mrocklin mrocklin merged commit 02f864b into dask:master Jul 11, 2018
@mrocklin mrocklin deleted the scipy-sparse branch July 11, 2018 12:25

ogrisel commented Jul 11, 2018

Thanks, that looks good. Do you think it will help with #2842? I will try to have a look.

ogrisel commented Jul 11, 2018

Sorry, I was confused: #2842 was about the sparse.COO array (not the scipy sparse matrix). I will have a deeper look at the scipy.sparse usage patterns.

@mrocklin
Member Author

No. Most Dask array operations still won't work. However, I think that this will be useful for dask-ml workloads, where we mostly call map_partitions or map_blocks with scikit-learn functions on dask arrays and dask dataframes.

@mrocklin
Member Author

So dask array and dask dataframe can contain scipy sparse arrays just fine, and we can apply functions across them, but the actual dask array and dask dataframe operations, like da.tensordot, won't work.
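
As a rough sketch of "applying functions across them" (the helper name block_nnz and the array sizes are hypothetical, not from this PR), a function can be mapped over sparse blocks as long as it accepts a scipy.sparse matrix and returns an ordinary ndarray:

```python
import numpy as np
import scipy.sparse
import dask.array as da

# Illustrative data: a mostly-zero array split into four 5x10 row blocks.
rng = np.random.RandomState(0)
dense = rng.random_sample((20, 10))
dense[dense < 0.8] = 0
x = da.from_array(dense, chunks=(5, 10))

# Each block becomes a scipy.sparse CSR matrix.
s = x.map_blocks(scipy.sparse.csr_matrix, dtype=x.dtype)


def block_nnz(block):
    """Hypothetical helper: count stored non-zeros in one sparse block.

    Returns a dense 1x1 ndarray so Dask can assemble the per-block results.
    """
    return np.array([[block.nnz]])


# Apply the helper block by block; every output block has shape (1, 1).
nnz_per_block = s.map_blocks(block_nnz, dtype=np.int64, chunks=(1, 1))
counts = nnz_per_block.compute()
total = int(counts.sum())
```

Because block_nnz returns dense arrays, the final compute only has to concatenate ndarrays, which is the kind of per-block pattern described here.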

ogrisel commented Jul 11, 2018

Yes, I just tried playing with it a bit interactively, and indeed it's already useful as it is and probably enough for the dask-ml use cases.

2 participants