Skip to content

Handle dask deprecation in cycle_spin#3205

Merged
grlee77 merged 2 commits intoscikit-image:masterfrom
soupault:dask_warning
Jul 7, 2018
Merged

Handle dask deprecation in cycle_spin#3205
grlee77 merged 2 commits intoscikit-image:masterfrom
soupault:dask_warning

Conversation

@soupault
Copy link
Copy Markdown
Member

@soupault soupault commented Jun 16, 2018

Description

========================================================== warnings summary ===========================================================
tests/test_denoise.py::test_cycle_spinning_multichannel
  /venv/lib/python3.5/site-packages/dask/base.py:833: UserWarning: The get= keyword has been deprecated. Please use the scheduler= keyword instead with the name of the desired scheduler like 'threads' or 'processes'
    warnings.warn("The get= keyword has been deprecated. "

See http://dask.pydata.org/en/latest/scheduling.html#local-threads and http://dask.pydata.org/en/latest/scheduling.html#local-processes .

For reviewers

(Don't remove the checklist below.)

  • Check that the PR title is short, concise, and will make sense 1 year
    later.
  • Check that new functions are imported in corresponding __init__.py.
  • Check that new features, API changes, and deprecations are mentioned in
    doc/release/release_dev.rst.
  • Consider backporting the PR with @meeseeksdev backport to v0.14.x

@soupault soupault added the 🔧 type: Maintenance Refactoring and maintenance of internals label Jun 16, 2018
@soupault soupault requested a review from grlee77 June 16, 2018 08:11
@codecov-io
Copy link
Copy Markdown

codecov-io commented Jun 16, 2018

Codecov Report

Merging #3205 into master will increase coverage by 0.71%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #3205      +/-   ##
=========================================
+ Coverage   86.08%   86.8%   +0.71%     
=========================================
  Files         338     339       +1     
  Lines       27238   27446     +208     
=========================================
+ Hits        23448   23824     +376     
+ Misses       3790    3622     -168
Impacted Files Coverage Δ
skimage/restoration/_cycle_spin.py 100% <100%> (ø) ⬆️
skimage/morphology/tests/test_extrema.py 99.14% <0%> (-0.04%) ⬇️
skimage/color/colorconv.py 99.09% <0%> (-0.01%) ⬇️
skimage/util/lookfor.py 100% <0%> (ø)
skimage/external/tifffile/tifffile.py 43.15% <0%> (+0.03%) ⬆️
skimage/viewer/canvastools/painttool.py 90.47% <0%> (+0.09%) ⬆️
skimage/draw/_random_shapes.py 96.84% <0%> (+1.05%) ⬆️
skimage/morphology/extrema.py 96.26% <0%> (+2.23%) ⬆️
skimage/morphology/watershed.py 93.93% <0%> (+2.45%) ⬆️
skimage/io/_plugins/util.py 58.92% <0%> (+4.76%) ⬆️
... and 23 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 18f97d8...6845636. Read the comment docs.

@soupault soupault added this to the 0.15 milestone Jun 18, 2018
@stefanv
Copy link
Copy Markdown
Member

stefanv commented Jun 19, 2018

Tested to also work on our minimum requirement for dask, 0.9.0.

futures = [dask.delayed(_run_one_shift)(s) for s in all_shifts]
mean = sum(futures) / len(futures)
mean = mean.compute(get=dask.threaded.get, num_workers=num_workers)
mean = mean.compute(scheduler='processes',
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought processes would have a higher overhead than threads for cases like the demo in the docs. I just timed it and it is nearly identical for either choice, though, so I think either is fine.

Do you think we should expose the scheduler argument to the user?

Copy link
Copy Markdown
Member Author

@soupault soupault Jun 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does have a bit of overhead, although, it seems to me to be a better choice in general.

Do you think we should expose the scheduler argument to the user?

Actually, yes! As I understood, dask scheduler can be configured globally via YAML files, env.variables, or dask.config.set. That said, I'd drop this argument.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, dropping this argument would let the user settings in Dask take precedence.

cc @mrocklin (in case I've missed anything)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'd drop all of the arguments to this method call. If you wanted to pass things through that would be fine, but it's generally preferred not to hard-code anything.

@stefanv
Copy link
Copy Markdown
Member

stefanv commented Jun 19, 2018

Even better.

@soupault
Copy link
Copy Markdown
Member Author

soupault commented Jul 7, 2018

@grlee77 does the change look good to you?

@grlee77 grlee77 merged commit 75f74d5 into scikit-image:master Jul 7, 2018
@grlee77
Copy link
Copy Markdown
Contributor

grlee77 commented Jul 7, 2018

Yes. Thanks @soupault

@hmaarrfk
Copy link
Copy Markdown
Member

@jni this needs backporting to fix a warning.

@hmaarrfk
Copy link
Copy Markdown
Member

@meeseeksdev backport to v0.14.x

@lumberbot-app
Copy link
Copy Markdown

lumberbot-app bot commented Aug 15, 2018

Awww, sorry hmaarrfk you do not seem to be allowed to do that, please ask a repository maintainer.

@jni
Copy link
Copy Markdown
Member

jni commented Aug 15, 2018

Man I hate this change in dask. =(

Anyway, this'll get backported in #3304 thanks to @hmaarrfk

@hmaarrfk
Copy link
Copy Markdown
Member

I'm not so sure. Image analysis is often embarressingly parallel. Dask multiprocessing is making my code N-Cores times faster sometimes!!!

@jni
Copy link
Copy Markdown
Member

jni commented Aug 15, 2018

@hmaarrfk I mean the change from get : callable to scheduler : str

@jni
Copy link
Copy Markdown
Member

jni commented Aug 15, 2018

By the way, it would be fantastic if you could provide some good dask usage examples for the gallery. For most of the stuff I've tried, I get a speedup much smaller than n_cores.

@hmaarrfk
Copy link
Copy Markdown
Member

Its mostly for batch processing really.

I have super large sets of images., slicing them àla numpy makes things really easy. Then when I'm ready, I launch the computation.

@jni
Copy link
Copy Markdown
Member

jni commented Aug 15, 2018

You say that like it's not useful. =P A batch processing example in the gallery would be awesome. =)

@hmaarrfk
Copy link
Copy Markdown
Member

I'll try to write one for dask showing some batch processing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

🔧 type: Maintenance Refactoring and maintenance of internals

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants