
new_dashboard metrics agent crashed in Windows CI #13199

@simon-mo

What is the problem?

Ray version and other system information (Python version, TensorFlow version, OS):

2021-01-05T07:27:22.1039161Z 2021-01-05 07:26:40,486	WARNING worker.py:1044 -- The agent on node fv-az68-689 failed with the following error:
2021-01-05T07:27:22.1039976Z Traceback (most recent call last):
2021-01-05T07:27:22.1040684Z   File "d:\a\ray\ray\python\ray\new_dashboard/agent.py", line 311, in <module>
2021-01-05T07:27:22.1041457Z     loop.run_until_complete(agent.run())
2021-01-05T07:27:22.1043175Z   File "C:\hostedtoolcache\windows\Python\3.7.9\x64\lib\asyncio\base_events.py", line 587, in run_until_complete
2021-01-05T07:27:22.1044233Z     return future.result()
2021-01-05T07:27:22.1044986Z   File "d:\a\ray\ray\python\ray\new_dashboard/agent.py", line 187, in run
2021-01-05T07:27:22.1045759Z     await asyncio.gather(check_parent_task,
2021-01-05T07:27:22.1046632Z UnboundLocalError: local variable 'check_parent_task' referenced before assignment
2021-01-05T07:27:22.1047247Z 
2021-01-05T07:27:22.1047797Z (pid=None) Traceback (most recent call last):
2021-01-05T07:27:22.1049013Z (pid=None)   File "d:\a\ray\ray\python\ray\new_dashboard/agent.py", line 322, in <module>
2021-01-05T07:27:22.1050018Z (pid=None)     raise e
2021-01-05T07:27:22.1050980Z (pid=None)   File "d:\a\ray\ray\python\ray\new_dashboard/agent.py", line 311, in <module>
2021-01-05T07:27:22.1052129Z (pid=None)     loop.run_until_complete(agent.run())
2021-01-05T07:27:22.1053390Z (pid=None)   File "C:\hostedtoolcache\windows\Python\3.7.9\x64\lib\asyncio\base_events.py", line 587, in run_until_complete
2021-01-05T07:27:22.1054625Z (pid=None)     return future.result()
2021-01-05T07:27:22.1055681Z (pid=None)   File "d:\a\ray\ray\python\ray\new_dashboard/agent.py", line 187, in run
2021-01-05T07:27:22.1056751Z (pid=None)     await asyncio.gather(check_parent_task,
2021-01-05T07:27:22.1058008Z (pid=None) UnboundLocalError: local variable 'check_parent_task' referenced before assignment
2021-01-05T07:27:22.1059146Z (pid=None) --- Logging error ---
2021-01-05T07:27:22.1060051Z (pid=None) Traceback (most recent call last):
2021-01-05T07:27:22.1061184Z (pid=None)   File "C:\hostedtoolcache\windows\Python\3.7.9\x64\lib\logging\handlers.py", line 69, in emit
2021-01-05T07:27:22.1073081Z (pid=None)     if self.shouldRollover(record):
2021-01-05T07:27:22.1074417Z (pid=None)   File "C:\hostedtoolcache\windows\Python\3.7.9\x64\lib\logging\handlers.py", line 183, in shouldRollover
2021-01-05T07:27:22.1075651Z (pid=None)     self.stream = self._open()
2021-01-05T07:27:22.1076745Z (pid=None)   File "C:\hostedtoolcache\windows\Python\3.7.9\x64\lib\logging\__init__.py", line 1116, in _open
2021-01-05T07:27:22.1078106Z (pid=None)     return open(self.baseFilename, self.mode, encoding=self.encoding)
2021-01-05T07:27:22.1079292Z (pid=None) NameError: name 'open' is not defined
2021-01-05T07:27:22.1080172Z (pid=None) Call stack:
2021-01-05T07:27:22.1082589Z (pid=None)   File "C:\hostedtoolcache\windows\Python\3.7.9\x64\lib\site-packages\aiohttp\client.py", line 320, in __del__
2021-01-05T07:27:22.1083952Z (pid=None)     self._loop.call_exception_handler(context)
2021-01-05T07:27:22.1085281Z (pid=None)   File "C:\hostedtoolcache\windows\Python\3.7.9\x64\lib\asyncio\base_events.py", line 1645, in call_exception_handler
2021-01-05T07:27:22.1086595Z (pid=None)     self.default_exception_handler(context)
2021-01-05T07:27:22.1087942Z (pid=None)   File "C:\hostedtoolcache\windows\Python\3.7.9\x64\lib\asyncio\base_events.py", line 1619, in default_exception_handler
2021-01-05T07:27:22.1089225Z (pid=None)     logger.error('\n'.join(log_lines), exc_info=exc_info)
2021-01-05T07:27:22.1090674Z (pid=None) Message: 'Unclosed client session\nclient_session: <aiohttp.client.ClientSession object at 0x000002699DFEAE88>'
2021-01-05T07:27:22.1091965Z (pid=None) Arguments: ()
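
The primary failure in the log is the `UnboundLocalError` raised at `asyncio.gather(check_parent_task, ...)`. One plausible cause, not confirmed by the log itself, is that `check_parent_task` is only assigned on a non-Windows code path. The sketch below reproduces that pattern without Ray and shows one way to guard against it. The functions `check_parent`, `serve`, `run_buggy`, and `run_guarded` are hypothetical stand-ins, not the contents of `agent.py`.

```python
import asyncio
import sys


async def check_parent():
    """Hypothetical stand-in for a parent-process watchdog."""
    await asyncio.sleep(0)
    return "parent-ok"


async def serve():
    """Hypothetical stand-in for the agent's main work."""
    await asyncio.sleep(0)
    return "served"


async def run_buggy(platform):
    # Assumed shape of the bug: the name is bound on one branch only, so on
    # "win32" the gather() call below raises UnboundLocalError.
    if platform != "win32":
        check_parent_task = asyncio.ensure_future(check_parent())
    return await asyncio.gather(check_parent_task, serve())


async def run_guarded(platform):
    # Collect awaitables in a list so that every branch leaves valid state.
    tasks = []
    if platform != "win32":
        tasks.append(asyncio.ensure_future(check_parent()))
    tasks.append(serve())
    return await asyncio.gather(*tasks)


def demo(platform=sys.platform):
    """Run the guarded variant for the given platform string."""
    return asyncio.run(run_guarded(platform))
```

Calling `asyncio.run(run_buggy("win32"))` raises `UnboundLocalError`, while `demo("win32")` completes and simply skips the watchdog task.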

https://github.com/ray-project/ray/runs/1647962851
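
The `--- Logging error ---` section of the log looks like a secondary effect: an aiohttp `ClientSession` was still open when the process exited, and its `__del__` tried to log "Unclosed client session". `NameError: name 'open' is not defined` is typical of code that runs during interpreter shutdown, after builtins have been cleared. A minimal sketch of closing a session on the error path follows; `FakeSession`, `failing_work`, and `run_with_cleanup` are hypothetical names used only for illustration and are not part of Ray or aiohttp.

```python
import asyncio


class FakeSession:
    """Hypothetical stand-in for a client session with an async close()."""

    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True


async def failing_work():
    """Simulates the agent failing partway through startup."""
    raise RuntimeError("simulated startup failure")


async def run_with_cleanup(session, work):
    # try/finally closes the session even when work() raises, so nothing is
    # left for __del__ to report while the interpreter is shutting down.
    try:
        return await work()
    finally:
        await session.close()
```

With this shape the original exception still propagates to the caller, but the session is closed before the event loop stops.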

Reproduction (REQUIRED)

Please provide a short code snippet (less than 50 lines if possible) that can be copy-pasted to reproduce the issue. The snippet should have no external library dependencies (i.e., use fake or mock data / environments):

If the code snippet cannot be run by itself, the issue will be closed with "needs-repro-script".

  • I have verified my script runs in a clean environment and reproduces the issue.
  • I have verified the issue also occurs with the latest wheels.

Labels

P2 (Important issue, but not time-critical), bug (Something that is supposed to be working; but isn't), dashboard (Issues specific to the Ray Dashboard), windows
