Flaky Incremental Model Selection tests #673

@TomAugspurger

tests/model_selection/test_incremental.py::test_gridsearch and tests/model_selection/test_incremental.py::test_transform occasionally fail on CI.

https://dev.azure.com/dask-dev/dask/_build/results?buildId=1154&view=logs&j=2eab4704-62a3-55d0-a524-46b534a6e811&t=24a2918a-3d47-54f3-09aa-2db1b58035d7&l=549

================================== FAILURES ===================================
_______________________________ test_gridsearch _______________________________

    def test_func():
---------------------------- Captured stderr call -----------------------------
distributed.scheduler - ERROR - Couldn't gather keys {'_score-4d2165fa-45ff-4705-90a5-5c464fb49b2a': [], '_score-d139b189-4b59-4823-a75e-76613308362a': [], '_score-2de29df9-0e9f-4d81-8032-4b41d918fb18': [], '_score-35f12ba1-3806-4b51-8307-a42da6d043ef': [], '_score-2dc00b23-6a1b-4312-8b97-a05a9d1c391c': [], '_score-fb91dd67-dee3-4e81-9bf6-1384f42bd993': []} state: [None, None, None, None, None, None] workers: []
NoneType: None
distributed.scheduler - ERROR - Workers don't have promised key: [], _score-4d2165fa-45ff-4705-90a5-5c464fb49b2a
NoneType: None
distributed.scheduler - ERROR - Workers don't have promised key: [], _score-d139b189-4b59-4823-a75e-76613308362a
NoneType: None
distributed.scheduler - ERROR - Workers don't have promised key: [], _score-2de29df9-0e9f-4d81-8032-4b41d918fb18
NoneType: None
distributed.scheduler - ERROR - Workers don't have promised key: [], _score-35f12ba1-3806-4b51-8307-a42da6d043ef
NoneType: None
distributed.scheduler - ERROR - Workers don't have promised key: [], _score-2dc00b23-6a1b-4312-8b97-a05a9d1c391c
NoneType: None
distributed.scheduler - ERROR - Workers don't have promised key: [], _score-fb91dd67-dee3-4e81-9bf6-1384f42bd993
NoneType: None
distributed.client - WARNING - Couldn't gather 6 keys, rescheduling {'_score-4d2165fa-45ff-4705-90a5-5c464fb49b2a': (), '_score-d139b189-4b59-4823-a75e-76613308362a': (), '_score-2de29df9-0e9f-4d81-8032-4b41d918fb18': (), '_score-35f12ba1-3806-4b51-8307-a42da6d043ef': (), '_score-2dc00b23-6a1b-4312-8b97-a05a9d1c391c': (), '_score-fb91dd67-dee3-4e81-9bf6-1384f42bd993': ()}
tornado.application - ERROR - Exception after Future was cancelled
Traceback (most recent call last):
  File "C:\Miniconda\envs\dask-ml-test\lib\site-packages\tornado\gen.py", line 742, in run
    yielded = self.gen.throw(*exc_info)  # type: ignore
  File "D:\a\1\s\tests\model_selection\test_incremental.py", line 405, in test_gridsearch
    yield search.fit(X, y, classes=[0, 1])
  File "C:\Miniconda\envs\dask-ml-test\lib\site-packages\tornado\gen.py", line 735, in run
    value = future.result()
  File "C:\Miniconda\envs\dask-ml-test\lib\site-packages\tornado\gen.py", line 742, in run
    yielded = self.gen.throw(*exc_info)  # type: ignore
  File "D:\a\1\s\dask_ml\model_selection\_incremental.py", line 625, in _fit
    prefix=self.prefix,
  File "C:\Miniconda\envs\dask-ml-test\lib\site-packages\tornado\gen.py", line 735, in run
    value = future.result()
  File "C:\Miniconda\envs\dask-ml-test\lib\site-packages\tornado\gen.py", line 742, in run
    yielded = self.gen.throw(*exc_info)  # type: ignore
  File "D:\a\1\s\dask_ml\model_selection\_incremental.py", line 297, in _fit
    scores = yield client.gather(scores)
  File "C:\Miniconda\envs\dask-ml-test\lib\site-packages\tornado\gen.py", line 735, in run
    value = future.result()
  File "C:\Miniconda\envs\dask-ml-test\lib\site-packages\distributed\client.py", line 1865, in _gather
    self._send_to_scheduler({"op": "report-key", "key": key})
  File "C:\Miniconda\envs\dask-ml-test\lib\site-packages\distributed\client.py", line 960, in _send_to_scheduler
    "Message: %s" % (self.status, msg)
Exception: Tried sending message after closing.  Status: closed
Message: {'op': 'report-key', 'key': '_score-4d2165fa-45ff-4705-90a5-5c464fb49b2a'}

The scheduler errors above seem related.

distributed.scheduler - ERROR - Workers don't have promised key: [], _score-4d2165fa-45ff-4705-90a5-5c464fb49b2a

But I can't reproduce this locally.
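
One way to try to surface an intermittent failure like this locally is to rerun the two tests in a loop until one of them fails. Below is a minimal sketch, assuming pytest is installed and the command is run from the repository root. `run_until_failure` is a hypothetical helper written for illustration, not part of dask-ml or distributed, and there is no guarantee the failure shows up outside the Windows CI environment.

```python
import subprocess
import sys

# The two test IDs reported as flaky in this issue.
TESTS = [
    "tests/model_selection/test_incremental.py::test_gridsearch",
    "tests/model_selection/test_incremental.py::test_transform",
]


def run_until_failure(cmd, max_runs=50):
    """Run `cmd` repeatedly (hypothetical helper).

    Returns the 1-based index of the first run that exits non-zero,
    or None if every run up to `max_runs` succeeds.
    """
    for attempt in range(1, max_runs + 1):
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            # Show the output of the failing run so the traceback is visible.
            print(result.stdout)
            print(result.stderr, file=sys.stderr)
            return attempt
    return None
```

It could be called as `run_until_failure([sys.executable, "-m", "pytest", "-x", *TESTS])`. A return value of `None` only means no failure was observed in that many runs, not that the race is absent.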
