Add ability to specify worker and driver ports#7833

Merged
edoakes merged 18 commits into ray-project:master from edoakes:port-range
Apr 16, 2020

@edoakes edoakes commented Mar 31, 2020

Why are these changes needed?

Now that each worker and driver has its own gRPC server, users running in environments where a firewall blocks some ports can run into connection problems. This PR adds the ability to specify a range of ports that workers will bind on, as well as the port that the driver process binds on.

Worker ports can be specified as a range via: ray start --min-worker-port=15000 --max-worker-port=16000. These options are not surfaced in ray.init().

The raylet tracks which of the ports are in use. When workers and drivers connect to it, it provides a port that they should bind on.

By default, the range is set to --min-worker-port=10000 and --max-worker-port=10999.
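The bookkeeping described above can be illustrated with a minimal sketch. This is not the raylet's actual implementation (which lives in C++); the class name `WorkerPortPool` and its `acquire`/`release` methods are hypothetical and only show the idea of handing out ports from a configured range and reclaiming them when a worker goes away. The default bounds are the ones stated in this description.

```python
from typing import List, Optional, Set


class WorkerPortPool:
    """Hypothetical stand-in for the raylet's tracking of worker ports."""

    def __init__(self, min_port: int = 10000, max_port: int = 10999) -> None:
        if not (0 < min_port <= max_port <= 65535):
            raise ValueError(f"invalid port range: {min_port}-{max_port}")
        # Stored in descending order so that pop() hands out the lowest port first.
        self._free: List[int] = list(range(max_port, min_port - 1, -1))
        self._in_use: Set[int] = set()

    def acquire(self) -> Optional[int]:
        """Return a port for a connecting worker or driver, or None if exhausted."""
        if not self._free:
            return None
        port = self._free.pop()
        self._in_use.add(port)
        return port

    def release(self, port: int) -> None:
        """Make a port available again once its worker has disconnected."""
        if port in self._in_use:
            self._in_use.remove(port)
            self._free.append(port)
```

One consequence of this design is that the size of the range caps the number of concurrent workers and drivers on a node, so a narrow firewall window would need to be sized with that in mind.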

Related issue number

Closes #7632

Checks

@AmplabJenkins

Can one of the admins verify this patch?

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24038/
Test PASSed.

@ericl ericl self-assigned this Mar 31, 2020
@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24040/
Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24041/
Test FAILed.

@ericl
Contributor

ericl commented Apr 3, 2020

Can we set this by default? Would be great to say you just need to allow this port range out of the box.

task_execution_callback_(task_execution_callback),
resource_ids_(new ResourceMappingType()),
grpc_service_(io_service_, *this) {
RAY_LOG(ERROR) << "Starting worker on port: " << worker_port;

Suggested change
- RAY_LOG(ERROR) << "Starting worker on port: " << worker_port;
+ RAY_LOG(DEBUG) << "Starting worker on port: " << worker_port;

@edoakes
Collaborator Author

edoakes commented Apr 13, 2020

@clarkzinzow this will now handle allocating driver ports as well as worker ports, hope that works for you!

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24648/
Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24647/
Test FAILed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24651/
Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24653/
Test PASSed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24650/
Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24652/
Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24702/
Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24799/
Test FAILed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/24800/
Test PASSed.

@edoakes edoakes merged commit 9f751ff into ray-project:master Apr 16, 2020
edoakes added a commit to edoakes/ray that referenced this pull request Apr 17, 2020
edoakes added a commit that referenced this pull request Apr 17, 2020
edoakes added a commit to edoakes/ray that referenced this pull request Apr 17, 2020
@arsedler9
Contributor

Hey @edoakes, sorry to bother you again but is there a plan for merging this back into master? I couldn't get things to work when I checked out the merge commit here. Would appreciate any updates. Thanks!

@edoakes
Collaborator Author

edoakes commented May 5, 2020

@arsedler9 yes, I've been working on this intermittently. Unfortunately there are some very frustrating issues in our CI that I'm wrestling against. I will do my best to have it in this week. Sorry for the delay.

Development

Successfully merging this pull request may close these issues.

Local cluster YAML no longer working in 0.9.0.dev0

5 participants