Expose internal config parameters for starting Ray #3246
richardliaw merged 16 commits into ray-project:master from
Test FAILed.
In this PR, can we bring back all of the tests that were skipped in #3217?
Test FAILed.
Test FAILed.
Test FAILed.
Test FAILed.
jenkins retest this please
FYI, @stephanie-wang this PR reintroduces the stress tests that were skipped in #3217.
Test FAILed.
Test FAILed.
Test PASSed.
Test FAILed.
Test PASSed.
    cluster.remove_node(worker)
    cluster.wait_for_nodes(retries=10)
    assert ray.global_state.cluster_resources()["CPU"] == 2
This test looks like it could be flaky because it relies on timing.
Not this line in particular, just the whole test.
Hm, the test runs on Jenkins, which I think has far fewer timing issues? Also, the timing leeway is on the order of seconds.
Is there a better way to guarantee node removal other than refreshing the client table?
Alternatively, we could wait and see whether this becomes an issue; if we ever see it again, we could use flaky (https://github.com/box/flaky).
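One way to reduce the reliance on timing discussed above is to poll for the expected state with bounded retries rather than asserting immediately. The following is only a minimal sketch of that idea: `wait_for_condition` and `FakeCluster` are hypothetical names made up for illustration, and `FakeCluster` merely imitates a resource view that lags behind node removal; neither is part of Ray.

```python
import time


def wait_for_condition(predicate, retries=10, delay_s=0.1):
    """Poll `predicate` until it returns True or the retries run out."""
    for _ in range(retries):
        if predicate():
            return True
        time.sleep(delay_s)
    # One final check after the last sleep.
    return predicate()


class FakeCluster:
    """Hypothetical stand-in whose resource view updates only after a few polls."""

    def __init__(self, cpus, polls_until_updated):
        self._cpus = cpus
        self._pending = None
        self._polls_left = 0
        self._lag = polls_until_updated

    def remove_node(self, node_cpus):
        # Record the new total, but do not expose it until the lag has elapsed.
        self._pending = self._cpus - node_cpus
        self._polls_left = self._lag

    def cluster_resources(self):
        if self._pending is not None:
            if self._polls_left == 0:
                self._cpus, self._pending = self._pending, None
            else:
                self._polls_left -= 1
        return {"CPU": self._cpus}


cluster = FakeCluster(cpus=3, polls_until_updated=2)
cluster.remove_node(node_cpus=1)
removed = wait_for_condition(
    lambda: cluster.cluster_resources()["CPU"] == 2, retries=10, delay_s=0.01
)
assert removed
```

The test then fails only if the state never converges within the retry budget, not because a single check ran too early.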
Test FAILed.
Test FAILed.
jenkins, retest this please
Test PASSed.
What do these changes do?
This PR exposes the command-line option for using a config file. This is important for certain tests (e.g., fault-tolerance tests that remove nodes) to run quickly.
Note that this is bad practice and should be replaced with GFLAGS or some equivalent as soon as possible.
#3239 depends on this.
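As a rough illustration of the general pattern (not the code in this PR), config overrides can be serialized to a string, passed on a command line, and merged into defaults on the receiving side. Everything named here is an assumption for the sketch: the `--config-json` flag, the `some_process` executable, and the two config keys are hypothetical and are not Ray's actual options.

```python
import json


def serialize_config(overrides):
    """Encode config overrides as a compact JSON string with stable key order."""
    return json.dumps(overrides, sort_keys=True, separators=(",", ":"))


def build_command(executable, overrides):
    # "--config-json" is a hypothetical flag name used only for this sketch.
    return [executable, "--config-json", serialize_config(overrides)]


def parse_config(argv, defaults):
    """Merge overrides found in argv into defaults, rejecting unknown keys."""
    config = dict(defaults)
    if "--config-json" in argv:
        raw = argv[argv.index("--config-json") + 1]
        overrides = json.loads(raw)
        unknown = set(overrides) - set(defaults)
        if unknown:
            raise ValueError(f"unknown config keys: {sorted(unknown)}")
        config.update(overrides)
    return config


# Hypothetical defaults; a test would shorten a timeout so node removal is detected quickly.
DEFAULTS = {"heartbeat_timeout_ms": 30000, "num_heartbeats_timeout": 300}
cmd = build_command("some_process", {"heartbeat_timeout_ms": 100})
config = parse_config(cmd, DEFAULTS)
```

Rejecting unknown keys keeps a typo in a test's override from being silently ignored.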
TODO:
Related issue number