Skip to content

[Dashboard] Update dashboard port checking to prevent unexpected port change#11487

Merged
rkooo567 merged 2 commits intoray-project:masterfrom
allenyin55:update_check_dashboard_port
Oct 21, 2020
Merged

[Dashboard] Update dashboard port checking to prevent unexpected port change#11487
rkooo567 merged 2 commits intoray-project:masterfrom
allenyin55:update_check_dashboard_port

Conversation

@allenyin55
Copy link
Copy Markdown
Contributor

Why are these changes needed?

Dashboard port sometimes is incremented unexpectedly (when the default port is available) when restarting ray clusters with short delays.

The root cause seems to be the socket gets into the TIME_WAIT state and can't be immediately reused (see the last example here for more details). Setting the SO_REUSEADDR option prevents this from happening.

Related issue number

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Manually tested
    • Release tests
    • This PR is not tested :(

@allenyin55 allenyin55 changed the title Update dashboard port checking to prevent unexpected port change [Dashboard] Update dashboard port checking to prevent unexpected port change Oct 20, 2020
@allenyin55 allenyin55 requested a review from mfitton October 20, 2020 01:16
Copy link
Copy Markdown
Contributor

@mfitton mfitton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I'm glad you found the cause of this, it'd happened to me on occasion.

@allenyin55
Copy link
Copy Markdown
Contributor Author

Can we merge this? The tests failures don't seem relevant.

@mfitton
Copy link
Copy Markdown
Contributor

mfitton commented Oct 20, 2020

@allenyin55 I actually am not able to merge pull requests. @rkooo567 could you please merge?

@rkooo567
Copy link
Copy Markdown
Contributor

i guess this is not working. CI tests are all failing, and metrics agent test on Windows also fail.

@rkooo567
Copy link
Copy Markdown
Contributor

Please merge the latest master, so that I can see if the test failure is unrelated.

@allenyin55
Copy link
Copy Markdown
Contributor Author

allenyin55 commented Oct 20, 2020

Please merge the latest master, so that I can see if the test failure is unrelated.

Just did. Does it look OK now @rkooo567 ?

@rkooo567 rkooo567 merged commit 2fc3237 into ray-project:master Oct 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants