[core] ray stop does not kill all processes #11436

@stephanie-wang

What is the problem?

Ray version and other system information (Python version, TensorFlow version, OS): 1.1dev, Ubuntu 18.04

The problem does not appear on Ray 1.0.

Reproduction (REQUIRED)

Please provide a script that can be run to reproduce the issue. The script should have no external library dependencies (i.e., use fake or mock data / environments):

Run the following bash script to stop and restart ray:

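#!/bin/bash

# Stop any running Ray processes, then list whatever still matches "redis".
ray stop
echo "RAY STOP $(ps -ef | grep redis)"
# Start a new head node, then list the matching processes again.
ray start --head --num-cpus 16
echo "RAY START $(ps -ef | grep redis)"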

The script fails every other time it runs. When it fails, ray start exits with the following traceback:

Traceback (most recent call last):
  File "/home/swang/anaconda3/envs/ray-36/bin/ray", line 8, in <module>
    sys.exit(main())
  File "/home/swang/anaconda3/envs/ray-36/lib/python3.6/site-packages/ray/scripts/scripts.py", line 1462, in main
    return cli()
  File "/home/swang/anaconda3/envs/ray-36/lib/python3.6/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/swang/anaconda3/envs/ray-36/lib/python3.6/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/swang/anaconda3/envs/ray-36/lib/python3.6/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/swang/anaconda3/envs/ray-36/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/swang/anaconda3/envs/ray-36/lib/python3.6/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/swang/anaconda3/envs/ray-36/lib/python3.6/site-packages/ray/scripts/scripts.py", line 479, in start
    f"Ray is already running at {default_address}. "
ConnectionError: Ray is already running at 192.168.1.46:6379. Please specify a different port using the `--port` command to `ray start`.
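
The error suggests that something is still bound to port 6379 when ray start runs. One way to tell a slow shutdown apart from a process that ray stop never kills is to poll for the leftover process before restarting. The helper below is only a sketch for diagnosing this: `wait_while` is a hypothetical name, not part of Ray, and the process name `redis-server` in the usage comment is an assumption based on the `grep redis` in the script above.

```shell
#!/bin/bash

# wait_while TIMEOUT_SECS COMMAND [ARGS...]
# Runs COMMAND about once per second for as long as it keeps succeeding.
# Returns 0 as soon as COMMAND fails (e.g. pgrep finds no match),
# or 1 if COMMAND is still succeeding once TIMEOUT_SECS have elapsed.
wait_while() {
    remaining=$1
    shift
    while "$@" >/dev/null 2>&1; do
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Hypothetical usage in the restart script (process name is an assumption):
#   ray stop
#   wait_while 10 pgrep -x redis-server || echo "redis-server still running" >&2
#   ray start --head --num-cpus 16
```

If the helper times out, the process was left behind rather than merely slow to exit, which would match the every-other-run failure pattern described above.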

If we cannot run your script, we cannot fix your issue.

  • I have verified my script runs in a clean environment and reproduces the issue.
  • I have verified the issue also occurs with the latest wheels.

Labels

P0 (Issues that should be fixed in short order), bug (Something that is supposed to be working; but isn't)
