Deprecate UCX-Py backends#120
Conversation
|
I think this should be good to go, and we need it before rapidsai/ucxx#504 can be merged. |
TomAugspurger
left a comment
There was a problem hiding this comment.
a warning and exception are raised telling the user to install it.
I suspect we don't need a warning and an exception. Just an exception with a clear error message would be sufficient. But since we're matching was done in distributed it's not a big deal.
I wanted to be sure this is visible in all cases (with subprocesses, without subprocesses, with/without logging visible etc.). It always confuses me what is more useful/readable in Dask, so having both seemed like a good idea. |
|
I'll go ahead and merge this so I can still test the other PR today, hopefully this won't break anything. If anybody has further comments I'd be happy to address in a follow-up PR. Thanks Tom for the review! |
|
/merge |
Rewrite the UCXX/UCX-Py proxy backends added in #116 to deprecate UCX-Py. We have already removed the UCX-Py backend from Distributed in dask/distributed#9105, but while there's no release we cannot rely on that, so the changes in the PR work in the same way we expect when the migration is completed and Distributed has a new release. If distributed-ucxx is not installed a warning and exception are raised telling the user to install it.
The state after this PR is as follows:
protocol="ucx"/protocol="ucxx"should continue working as before, both were already using distributed-ucxxprotocol="ucx-old"(only exists in rapids-dask-dependency added in Add UCXX proxy backend #116) will raise the sameFutureWarningasprotocol="ucx"without RAPIDSprotocol="ucx"/protocol="ucxx"/protocol="ucx-old"all raise aFutureWarningand don’t let users create processes.The final state is expected to be achieved when a new Distributed release is cut and the
__rdd_patch_ucx.pyfile is removed from rapids-dask-dependency, which will be:protocol="ucx"(now transparently uses distributed-ucxx) /protocol="ucxx"(always used distributed-ucxx) should continue working as beforeprotocol="ucx"raises aFutureWarningand doesn’t let users create processesprotocol="ucxx"never existed before distributed-ucxx so it will just fail with a wrong protocol nameSpecifically I've tested the following to confirm what's behavior matches what is described above as the state of this PR (using all
ucx/ucxx/ucx-oldcombinations):dask scheduler --protocol=ucxdask worker ucx://127.0.0.1:8786LocalCluster(protocol="ucx", silence_logs=False)LocalCUDACluster(protocol="ucx", silence_logs=False)