Handle dask deprecation in cycle_spin #3205
Conversation
Codecov Report
@@ Coverage Diff @@
## master #3205 +/- ##
=========================================
+ Coverage 86.08% 86.8% +0.71%
=========================================
Files 338 339 +1
Lines 27238 27446 +208
=========================================
+ Hits 23448 23824 +376
+ Misses 3790 3622 -168
Continue to review full report at Codecov.
Tested to also work on our minimum requirement for dask, 0.9.0.
skimage/restoration/_cycle_spin.py (outdated)
  futures = [dask.delayed(_run_one_shift)(s) for s in all_shifts]
  mean = sum(futures) / len(futures)
- mean = mean.compute(get=dask.threaded.get, num_workers=num_workers)
+ mean = mean.compute(scheduler='processes',
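For context, a minimal runnable sketch of the pattern under discussion. `_run_one_shift` and `all_shifts` are names from the PR, but the body here is a stand-in; the sketch uses the `'threads'` scheduler to keep it self-contained, whereas the PR settles on `'processes'`:

```python
import dask

def _run_one_shift(s):
    # placeholder for the real per-shift computation in cycle_spin
    return s * 2.0

all_shifts = [0, 1, 2, 3]

# build a lazy task graph: one delayed call per shift, then average
futures = [dask.delayed(_run_one_shift)(s) for s in all_shifts]
mean = sum(futures) / len(futures)

# new-style call: name the scheduler instead of passing the
# deprecated get= keyword ('threads' here; the PR uses 'processes')
result = mean.compute(scheduler='threads')
```

The point of the change is only the last line: `get=dask.threaded.get` is the deprecated spelling, `scheduler='...'` the current one.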
I thought processes would have a higher overhead than threads for cases like the demo in the docs. I just timed it and it is nearly identical for either choice, though, so I think either is fine.
Do you think we should expose the scheduler argument to the user?
It does have a bit of overhead, although it seems to me to be a better choice in general.
Do you think we should expose the scheduler argument to the user?
Actually, yes! As I understand it, the dask scheduler can be configured globally via YAML files, environment variables, or dask.config.set. Given that, I'd drop this argument.
Yeah, dropping this argument would let the user settings in Dask take precedence.
cc @mrocklin (in case I've missed anything)
Yeah, I'd drop all of the arguments to this method call. If you wanted to pass things through that would be fine, but it's generally preferred not to hard-code anything.
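As a sketch of the configuration route mentioned above (the scheduler names are dask's own; the computation itself is a made-up example):

```python
import dask

# one-off: set a default scheduler for the whole session
dask.config.set(scheduler='threads')

total = dask.delayed(sum)([1, 2, 3])

# scoped: override the scheduler only within a block
with dask.config.set(scheduler='synchronous'):
    result = total.compute()  # runs single-threaded here
```

Leaving the `compute()` call argument-free, as suggested, means either of these user-side settings takes effect.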
Even better.
@grlee77 does the change look good to you?
Yes. Thanks @soupault
@jni this needs backporting to fix a warning.
@meeseeksdev backport to v0.14.x
Awww, sorry hmaarrfk you do not seem to be allowed to do that, please ask a repository maintainer.
I'm not so sure. Image analysis is often embarrassingly parallel. Dask multiprocessing sometimes makes my code N-cores times faster!!!
@hmaarrfk I mean the change from
By the way, it would be fantastic if you could provide some good dask usage examples for the gallery. For most of the stuff I've tried, I get a speedup much smaller than
It's mostly for batch processing, really. I have super large sets of images; slicing them à la numpy makes things really easy. Then, when I'm ready, I launch the computation.
You say that like it's not useful. =P A batch processing example in the gallery would be awesome. =)
I'll try to write one for dask showing some batch processing.
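A rough sketch of the kind of gallery example being discussed, batch-processing an image stack with dask.array. The "images" here are random placeholders; only the chunking-then-compute pattern is the point:

```python
import dask.array as da

# placeholder image stack: 8 "frames", chunked one frame per chunk
# so each frame can be processed independently (and in parallel)
stack = da.random.random((8, 64, 64), chunks=(1, 64, 64))

# per-frame mean intensity, built lazily frame by frame
frame_means = stack.mean(axis=(1, 2))

# nothing runs until compute() is called
result = frame_means.compute()
```

The same numpy-style slicing and reductions work on much larger stacks that don't fit in memory, which is where the scheduler choice discussed above starts to matter.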
Description
See http://dask.pydata.org/en/latest/scheduling.html#local-threads and http://dask.pydata.org/en/latest/scheduling.html#local-processes .