Open
Labels
highlevelgraph: Issues relating to HighLevelGraphs.
Description
Now that we have high level graphs in our collections, we can do some more complex optimizations. We've already done a little of this with blockwise fusion, column selection in read_parquet, and root fusion, all of which have had really positive effects.
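For readers less familiar with the idea, here is a minimal sketch of what a layer-level rewrite such as column selection looks like. `ReadLayer`, `SelectLayer`, and `push_down_columns` are hypothetical names made up for illustration; they are not Dask classes or functions, and the real optimization works on HighLevelGraph layers rather than these toy objects.

```python
from dataclasses import dataclass, replace
from typing import Optional


# Hypothetical stand-ins for high-level graph layers (not Dask classes).
@dataclass(frozen=True)
class ReadLayer:
    path: str
    columns: Optional[tuple] = None  # None means "read every column"


@dataclass(frozen=True)
class SelectLayer:
    source: object
    columns: tuple


def push_down_columns(layer):
    """Rewrite Select(Read) into a single Read that loads only the selected columns."""
    if isinstance(layer, SelectLayer) and isinstance(layer.source, ReadLayer):
        # The selection is absorbed into the read, so unused columns are never loaded.
        return replace(layer.source, columns=layer.columns)
    return layer  # nothing to rewrite


graph = SelectLayer(ReadLayer("data.parquet"), ("x", "y"))
optimized = push_down_columns(graph)
print(optimized)
```

The point is that the rewrite only needs to know what each layer means, which is the information a low-level task graph does not carry.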
There is likely more that we can do here. Some thoughts ...
- Fuse column selection with other kinds of data access, like read_csv or ORC
- Pass slicing through some blockwise operations in Dask array (a long-held request by @shoyer)
- Optimize the subgraph callables in blockwise with Numba (or something else) for Dask array, to avoid memory copies and maybe reduce serialization time
- Swap the order of joins, filters, and column access
- ...
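To make the slicing item concrete, here is a rough pure-Python sketch of pushing a slice through an elementwise operation. It assumes a 1-D array stored as a list of chunks, a non-negative `start <= stop`, and a step of 1. The function names are hypothetical and this is not how Dask array implements anything; it only shows why the reordering saves work.

```python
def chunk_bounds(chunks):
    """Return the (start, stop) offsets of each chunk in the flat array."""
    bounds, offset = [], 0
    for chunk in chunks:
        bounds.append((offset, offset + len(chunk)))
        offset += len(chunk)
    return bounds


def map_then_slice(func, chunks, start, stop):
    """Baseline: apply func to every element, then slice the result."""
    out = [func(x) for chunk in chunks for x in chunk]
    return out[start:stop]


def slice_then_map(func, chunks, start, stop):
    """Pushed down: slice the inputs first, touching only overlapping chunks."""
    out = []
    for chunk, (lo, hi) in zip(chunks, chunk_bounds(chunks)):
        if hi <= start or lo >= stop:
            continue  # chunk lies entirely outside the slice
        piece = chunk[max(start, lo) - lo : min(stop, hi) - lo]
        out.extend(func(x) for x in piece)
    return out


calls = []


def double(x):
    calls.append(x)  # record each call so the two strategies can be compared
    return 2 * x


chunks = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

baseline = map_then_slice(double, chunks, 4, 7)
n_baseline = len(calls)

calls.clear()
pushed = slice_then_map(double, chunks, 4, 7)
n_pushed = len(calls)

print(baseline == pushed, n_baseline, n_pushed)
```

Both strategies return the same values, but the pushed-down version skips the first chunk entirely and calls the function only on the elements that survive the slice.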
It would be nice to have a current maintainer consider these options, estimate how long each would take, and see what makes sense to do in the short term.