Skip to content

UCX: use nbytes instead of len#4621

Merged
quasiben merged 1 commit intodask:mainfrom
madsbk:ucx_use_nbytes
Mar 24, 2021
Merged

UCX: use nbytes instead of len#4621
quasiben merged 1 commit intodask:mainfrom
madsbk:ucx_use_nbytes

Conversation

@madsbk
Copy link
Contributor

@madsbk madsbk commented Mar 23, 2021

Use nbytes() to check for empty buffer objects since nbytes() != len().
This fixes the first part of rapidsai/dask-cuda#549 where the send and recv got out of sync because they didn't agree on which empty frames to skip.

10:58:40   File "/opt/conda/envs/rapids/lib/python3.7/site-packages/ucp/core.py", line 628, in recv
10:58:40     ret = await comm.tag_recv(self._ep, buffer, nbytes, tag, name=log)
10:58:40 ucp.exceptions.UCXMsgTruncated: <[Recv #112] ep: 0x7f0e25cde0d8, tag: 0xfa85496d273cdec2, nbytes: 260, type: <class 'numpy.ndarray'>>: length mismatch: 16 (got) != 260 (expected)
  • Tests added / passed
  • Passes black distributed / flake8 distributed

cc. @jakirkham

@jakirkham
Copy link
Member

Thanks Mads! 😀

So I thought frames would be flattened by this point. Do you know why that is no longer the case? Where does this come up?

@madsbk
Copy link
Contributor Author

madsbk commented Mar 23, 2021

So I thought frames would be flattened by this point. Do you know why that is no longer the case? Where does this come up?

The explicit-comms tests in Dask-CUDA triggers the bug. I think it is because of empty NumPy arrays:

len(a = np.empty((2,0)) == 2

@quasiben
Copy link
Member

I believe the failure here test_steal_resource_restrictions is unrelated to this PR. Merging in. Thanks @madsbk for the work and @jakirkham for the review

@quasiben quasiben merged commit df8f892 into dask:main Mar 24, 2021
rapids-bot bot pushed a commit to rapidsai/dask-cuda that referenced this pull request Mar 24, 2021
Fixes #549 by converting received tuples to lists.

Depend on dask/distributed#4621, which fixes an unrelated bug also triggered by our explicit-comms tests.

Authors:
  - Mads R. B. Kristensen (@madsbk)

Approvers:
  - Peter Andreas Entschev (@pentschev)

URL: #555
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants