Remove source and target optimization in array.store #9732
Conversation
Ah I should have added, since

I think conda-forge is having a difficult day today.

This PR slows down my processing compared to dask 2022.12.0. With dask 2022.12.0, my script takes 09:18.68. With this PR it takes 10:14.39. I will try to come up with an MCVE that reproduces the slowdown.
@gerritholl I'll check out your satpy-related issues and hopefully do some profiling today. It may be something satpy is doing against "best practice", but I'm surprised that this PR would have worsened performance unless my assumptions about where graph optimizations are done (when compute is called) are wrong.

Question for @mrocklin and @jakirkham (or others): if I give a dask array to a Delayed function and then compute the result, is the graph optimized any differently than if I had computed the dask array directly? Put another way, does optimizing a dask array graph mean additional optimizations are considered that aren't considered for a Delayed function? For Gerrit's issue, he's having trouble reproducing the slow results he was getting before. I'm going to try some other use cases with Satpy to see if I can reproduce weird results between 2022.12.0 and this PR.
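To make the question concrete, here is a minimal sketch of the two compute paths being compared (a hypothetical example, not code from the PR): computing a dask array directly goes through the Array optimizer, while wrapping it in a delayed call routes the combined graph through the Delayed optimizer instead.

```python
import dask
import dask.array as da

x = da.ones((10, 10), chunks=(5, 5)) + 1

# Path 1: computing the array directly -- the graph is optimized with
# Array.__dask_optimize__ (e.g. task fusion, slicing optimizations).
direct = float(x.sum().compute())

# Path 2: the same array fed into a delayed function -- the combined
# graph is optimized with Delayed.__dask_optimize__ instead, so the
# embedded array layers don't get array-specific optimizations.
wrapped = dask.delayed(float)(x.sum()).compute()

assert direct == wrapped  # same result, different optimization path
```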
```python
sources_layer = Array.__dask_optimize__(
    sources_hlg, list(core.flatten([e.__dask_keys__() for e in sources]))
)
```
In PR #2980, Matt suggested we add this to ensure Array optimizations are applied before computation, since they wouldn't be for a Delayed object ( #2980 (comment) ). So presumably, if this is dropped, these optimizations would need to be handled another way.
That said, I have not closely followed the changes to graphs, optimizations, object specializations, etc. in Dask, so maybe best practices have changed. Maybe @rjzamora would know 🙂
I think that's what I've been seeing, but results still aren't as expected. I tried doing `dask.config.set(delayed_optimize=da.optimization.optimize)` in hopes of making the Delayed objects get optimized as if they were Arrays (which they are below the top two levels or so of the graph). But even with this I still saw worse performance than dask main with my own Satpy-based example.
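For reference, that configuration attempt looks roughly like this (a sketch; whether the array optimizer is actually appropriate for a mixed Delayed/Array graph is exactly the open question here):

```python
import dask
import dask.array as da
from dask.array.optimization import optimize as array_optimize  # same as da.optimization.optimize

x = da.ones((10, 10), chunks=(5, 5))
d = dask.delayed(float)(x.sum())

# Override the optimizer used for Delayed objects so the embedded
# array layers are optimized as if the whole graph were an Array graph.
with dask.config.set(delayed_optimize=array_optimize):
    result = d.compute()
```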
Overall, from a high level, my hope is that there would be something we could do so that no matter where these graphs appear inside other graphs (as inputs to delayed objects, etc.), they all optimize the same way when computed/optimized with other graphs. 🤞
Admittedly, the PR linked above and the associated discussion are four years old. It wouldn't surprise me if best practices had changed (potentially a lot) in the intervening time. So I would defer to others more involved with HLG (like Rick) on what those are today.
@jakirkham to get this PR moving forward, or at least the discussion, who are the specific people you think could give insight on this problem (can you mention them here)? The problem being: we want to optimize a dask graph consisting of Delayed objects that may have dask arrays embedded in them, with the hope that those arrays could be optimized in the same way regardless of whether the final task is a Delayed object or not.
Sorry not to be very helpful here. Will defer to @mrocklin on next steps
As discussed in #8380, there are cases where a user may want to store multiple sources or store to multiple targets that share tasks somewhere in their history. In current `main`, calling `da.store` with `compute=False` results in the individual dask graphs being optimized (fused tasks, renamed tasks, etc.), which means future computations won't be able to combine/share tasks any more. In simplified code this looks like:

As I was working on a fix for this, I realized that I don't think there is any need for `da.store` to optimize anything before final computation. All computation will optimize the graphs anyway, so the only benefit to early optimization is reducing the size of the graphs/dictionaries that are passed around internally in `da.store` (a small benefit, if I'm not mistaken).

Lastly, a question on dask PR best practice: what are the feelings about including refactorings in a PR alongside bug fixes? For example, the `da.store` function could be split into a number of sub-functions, which would reduce its cyclomatic complexity and make it easier to read, but it could make the PR as a whole harder to review.

`pre-commit run --all-files`
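The "simplified code" snippet referenced in the description above was lost in extraction; the behavior it described might be sketched something like this (a hypothetical reconstruction, using plain NumPy arrays as store targets):

```python
import numpy as np
import dask
import dask.array as da

# Two computations that share tasks in their history (both depend on x + 1):
x = da.ones((10, 10), chunks=(5, 5))
shared = x + 1
a = shared * 2
b = shared * 3

target_a = np.empty((10, 10))
target_b = np.empty((10, 10))

# With compute=False, da.store on current main optimizes (fuses/renames)
# each graph independently, so the shared `x + 1` tasks can no longer be
# recognized and combined when the two stores are computed together:
store_a = da.store(a, target_a, compute=False)
store_b = da.store(b, target_b, compute=False)
dask.compute(store_a, store_b)
```

The results are identical either way; the difference is only in whether the shared tasks are deduplicated in the combined graph.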