dispatch.registers to their own file#7503
Conversation
|
Thanks @jsignell this looks great generally. This is a hugely breaking change that will likely break anyone using DataFrame dispatches outside of Dask (not sure if there's anyone other than dask-cudf). I'm going to kickoff some conversations internally on our side, but if possible it would be ideal to hold off on merging this for a bit to give us a chance to get our ducks in a row. |
We should discuss if this is something that is worth doing. To my mind, it's a good clean-up, but if we are committed to breaking people, we should make sure that it's change that we are really happy with.
Absolutely |
I'm a big +1 to this change and it's much appreciated. Regardless, after the meeting this morning and this RAPIDS release, we've decided as a team that we're going to start constraining Dask to its latest release version at the time of our release instead of letting it float. Breaking changes are going in too frequently and Dask releases go out in between our releases too frequently, where repeatedly fixing our previous release is too expensive for our team to do at this time. |
|
This PR would break a lot less if the functions that are being registered were private. Is that already the implication? If it is not then I would be happy to open a smaller PR that prefixes them all with an underscore. |
607a407 to
a7b8e8f
Compare
|
Ok I looked at dask_cudf and tried to preserve the imports that are used there. Would someone be able to run the cudf tests on this PR and let me know if there are any failures? |
|
Maybe now that GTC is over :) |
|
@charlesbluca would you have time to run these tests and report back ? |
|
Ran through the dask-cudf tests, here are the results - lots of failures, but not sure if that's because of my env: $ conda list | grep -e "dask\|cudf"
cudf 0.20.0a210419 cuda_11.2_py38_g867d6eecca_166 rapidsai-nightly
dask 2.9.0+733.g881a7474 dev_0 <develop>
dask-cudf 0.20.0a210419 py38_g867d6eecca_166 rapidsai-nightly
libcudf 0.20.0a210419 cuda11.2_g808262f49c_167 rapidsai-nightlyLet me know if I should change anything and rerun the tests. |
|
here is an example of an error: @galipremsagar @kkraus14 do you have an idea of what's going on here? |
This is a bug that has to be fixed in |
@charlesbluca can you confirm if making just this change unblocks you for validating against changes in this PR? |
|
Yes, that resolves all failures - thanks @galipremsagar! Should I open a PR with cuDF for this? |
Thanks for confirming, I'm on it 👍 |
|
Thanks @charlesbluca, @quasiben, and @galipremsagar! So seems like this is safe from the dask-cudf perspective? |
Yes. Generally we are now constraining dask / distributed versions at release time to the latest released patch version so we should be more protected against breaking changes being introduced. |
|
Ok @jrbourbeau should we merge this and #7505? |
|
Can we get this PR to be merged? This will help unblock other PRs around having major changes in dispatch: #7586 |
|
+1 from me @jrbourbeau do you have any thoughts? Should we merge this and the array PR ( #7505 )? |
|
James is fixing up #7505 to match the pattern in this PR and then we'll merge both. |
| """ | ||
| Dispatch in dask.dataframe. | ||
|
|
||
| Also see extension.py |
There was a problem hiding this comment.
Nitpick: Should this be backends.py?
There was a problem hiding this comment.
I think it this is right. There are some Dispatch objects initialized in there as well. I thought they were different in some way and should stay in extension.py, but I can't remember why I thought that. Do you think they should move into this file?
There was a problem hiding this comment.
Ah, I see. Since those are specifically for registering extension array, let's keep those where they are
Requires PR ( rapidsai/cudf#8342 ) As these functions recently got relocated, handle `import`ing from the new location with a fallback to the old location. xref: dask/dask#7503 xref: dask/dask#7505 Authors: - https://github.com/jakirkham Approvers: - GALI PREM SAGAR (https://github.com/galipremsagar) - Benjamin Zaitlen (https://github.com/quasiben) URL: #623
|
I think this may have broken the use of make_meta. (I also see @GenevieveBuckley noted this). |
In #7586, there was a new function introduced that has to be invoked to make meta i.e., |
|
@galipremsagar |
|
@chrisroat could you please file this in an issue? This discussion is a bit hard to track in a merged PR, which seems unlikely to be related to the issue at hand |
black dask/flake8 dask/isort daskThese could also go into backends.py if people prefer. I have no strong opinion.