Closed
dask/distributed
#4575
Description
In PR #546, we noticed some errors cropping up recently. I'm not sure of the exact cause, but they may be related to PR dask/distributed#4531. Copying some more details from the log below.
Seeing this in the log:
10:58:40 File "/opt/conda/envs/rapids/lib/python3.7/site-packages/ucp/core.py", line 628, in recv
10:58:40 ret = await comm.tag_recv(self._ep, buffer, nbytes, tag, name=log)
10:58:40 ucp.exceptions.UCXMsgTruncated: <[Recv #112] ep: 0x7f0e25cde0d8, tag: 0xfa85496d273cdec2, nbytes: 260, type: <class 'numpy.ndarray'>>: length mismatch: 16 (got) != 260 (expected)
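For readers unfamiliar with this kind of failure, here is a minimal sketch of what a "length mismatch" on receive means: the receiver posted a buffer of one size (260 bytes in the log) and the message that arrived was a different size (16 bytes). This is purely illustrative and does not use `ucp`; `MsgTruncated`, `recv_into`, and `try_recv` are hypothetical names, and the sketch says nothing about why the sizes disagreed in the actual run.

```python
class MsgTruncated(Exception):
    """Hypothetical stand-in for a length-mismatch error (not ucp's class)."""


def recv_into(buffer: bytearray, message: bytes) -> None:
    """Copy `message` into `buffer`, insisting that the sizes match exactly."""
    expected = len(buffer)
    got = len(message)
    if got != expected:
        raise MsgTruncated(f"length mismatch: {got} (got) != {expected} (expected)")
    buffer[:] = message


def try_recv(expected_nbytes: int, message: bytes) -> str:
    """Post a receive buffer of `expected_nbytes` and report the outcome."""
    buffer = bytearray(expected_nbytes)
    try:
        recv_into(buffer, message)
    except MsgTruncated as exc:
        return str(exc)
    return "ok"


# Receiver expects 260 bytes, but a 16-byte message arrives.
print(try_recv(260, bytes(16)))
```

A mismatch like this usually means the two sides disagree about which message is being exchanged at that point, though the log alone does not show where that disagreement comes from.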
Also seeing this:
11:05:40 File "/var/lib/jenkins/workspace/rapidsai/gpuci/dask-cuda/prb/dask-cuda-gpu-test/CUDA/10.1/GPU_LABEL/gpu-t4||gpu/OS/ubuntu16.04/PYTHON/3.7/dask_cuda/explicit_comms/dataframe/shuffle.py", line 196, in local_shuffle
11:05:40 out_parts[i] = None
11:05:40 TypeError: 'tuple' object does not support item assignment
11:05:40 FAILED
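The second error can be reproduced in isolation. The sketch below assumes only what the traceback shows: `out_parts[i] = None` was executed while `out_parts` was a tuple. The helper name `clear_parts_in_place` and the sample values are hypothetical, and whether converting to a list is the appropriate fix in `local_shuffle` is an assumption, not something the log confirms.

```python
def clear_parts_in_place(out_parts):
    """Set every slot to None, mirroring `out_parts[i] = None` from the traceback."""
    for i in range(len(out_parts)):
        out_parts[i] = None
    return out_parts


parts = ("a", "b")  # hypothetical stand-in for the partitions

# Tuples are immutable, so item assignment raises TypeError.
try:
    clear_parts_in_place(parts)
except TypeError as exc:
    message = str(exc)
    print(message)

# A list copy supports item assignment, so the same loop succeeds.
cleared = clear_parts_in_place(list(parts))
print(cleared)
```

If an upstream change started handing over a tuple where a list used to arrive, a defensive `list(...)` conversion at the call site is one possible workaround.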