Conversation
* Stubs for classes
…distributed into ucx+data-handling
OK, for now I've turned off compression on all cuda data. That should stop us from splitting up large frames. It looks like currently there is a limit in ucx-py that keeps us under 2**31 bytes. At first this seems to be limited by the I've also added logic to not call ensure_bytes and |
|
This now works-ish (at least when combined with some of the work in rapidsai/dask-cuda#46). I would like to get this to a point where we could merge it somewhat quickly. I don't mind things being a little rough if they are well isolated into files that aren't in the mainline code path (files like ucx.py, cuda.py, cudf.py and so on). There are a few TODO's left in the actual logic that I suspect are left by @TomAugspurger . Is this something that you can look into this week Tom to see if they are still necessary? |
| # Workaround for hanging test in | ||
| # pytest distributed/comm/tests/test_ucx.py::test_comm_objs -vs --count=2 | ||
| # on the second time through. | ||
| ucp._libs.ucp_py.reader_added = 0 |
There was a problem hiding this comment.
@Akshay-Venkatesh do you recall if this was resolved? Is it the same as https://github.com/Akshay-Venkatesh/ucx-py/issues/69, or different?
There was a problem hiding this comment.
@TomAugspurger I tested this yesterday and the issue hasn't been resolved. This still has to be fixed.
|
Thanks for the feedback @TomAugspurger . I think I can handle everything, I mostly wanted to get your thoughts on some of the comments. Thanks! |
|
I've removed the WIP label. Review appreciated. I think that this is safe to go in. |
|
OK, I'm merging this tomorrow if there are no further comments. |
|
I won't have time soon to look this over again, but based on my last review
I also think this is safe to go in.
…On Thu, May 30, 2019 at 9:25 AM Matthew Rocklin ***@***.***> wrote:
I've removed the WIP label. Review appreciated. I think that this is safe
to go in.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#2591?email_source=notifications&email_token=AAKAOISDTFT25GT7376J7STPX7PVBA5CNFSM4HC4HRZ2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODWSO7DA#issuecomment-497348492>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAKAOIXTD4ITBJEBKWUJ5LLPX7PVBANCNFSM4HC4HRZQ>
.
|
|
🎉 |
|
Thank you @TomAugspurger, @quasiben, and @Akshay-Venkatesh for working on this. I'm sure that there is still plenty more to do here, but it will be nice to have this in master. |
* upstream/master: (58 commits) Add unknown pytest markers (dask#2764) Delay lookup of allowed failures. (dask#2761) Change address -> worker in ColumnDataSource for nbytes plot (dask#2755) Remove module state in Prometheus Handlers (dask#2760) Add stress test for UCX (dask#2759) Add nanny logs (dask#2744) Move some of the adaptive logic into the scheduler (dask#2735) Add SpecCluster.new_worker_spec method (dask#2751) Worker dashboard fixes (dask#2747) Add async context managers to scheduler/worker classes (dask#2745) Fix the resource key representation before sending graphs (dask#2716) (dask#2733) Allow user to configure whether workers are daemon. (dask#2739) Pin pytest >=4 with pip in appveyor and python 3.5 (dask#2737) Add Experimental UCX Comm (dask#2591) Close nannies gracefully (dask#2731) add kwargs to progressbars (dask#2638) Add back LocalCluster.__repr__. (dask#2732) Move bokeh module to dashboard (dask#2724) Close clusters at exit (dask#2730) Add SchedulerPlugin TaskState example (dask#2622) ...
* upstream/master: (43 commits) Add unknown pytest markers (dask#2764) Delay lookup of allowed failures. (dask#2761) Change address -> worker in ColumnDataSource for nbytes plot (dask#2755) Remove module state in Prometheus Handlers (dask#2760) Add stress test for UCX (dask#2759) Add nanny logs (dask#2744) Move some of the adaptive logic into the scheduler (dask#2735) Add SpecCluster.new_worker_spec method (dask#2751) Worker dashboard fixes (dask#2747) Add async context managers to scheduler/worker classes (dask#2745) Fix the resource key representation before sending graphs (dask#2716) (dask#2733) Allow user to configure whether workers are daemon. (dask#2739) Pin pytest >=4 with pip in appveyor and python 3.5 (dask#2737) Add Experimental UCX Comm (dask#2591) Close nannies gracefully (dask#2731) add kwargs to progressbars (dask#2638) Add back LocalCluster.__repr__. (dask#2732) Move bokeh module to dashboard (dask#2724) Close clusters at exit (dask#2730) Add SchedulerPlugin TaskState example (dask#2622) ...
| header = bytes(header) | ||
| if header: | ||
| header = msgpack.loads(header, use_list=False, **msgpack_opts) |
There was a problem hiding this comment.
Do you recall why this was needed? Was this due to the if header line? Did msgpack.dumps need this? Or was it due to something else like potentially unusual types being passed in for header (like maybe a NumPy array)?
This PR brings in much of the work done by @TomAugspurger with ucx/ucx-py with @mrocklin's help I added a small change to cudf protocol handling and general cleanup.