[Core] Add fallback strategy scheduling logic (#56369)

edoakes merged 36 commits into ray-project:master
Conversation
The Python changes are in a separate PR, #56374, which can be merged first. This PR will then contain only the C++ scheduling logic changes. cc: @MengjinYan
@ryanaoleary could you rebase?
This PR contains only the Python changes from #56369, adding `fallback_strategy` as an option to the remote decorator for tasks and actors. A fallback strategy consists of a list of dicts of decorator options. Each dict of options is evaluated as a unit, and the first dict whose constraints can be satisfied is used for scheduling. With this PR, the only supported option is `label_selector`. Example using `fallback_strategy` to schedule on different instance types:

```
@ray.remote(
    label_selector={"instance_type": "m5.16xlarge"},
    fallback_strategy=[
        # Fall back to a selector for an "m5.large" instance type if
        # "m5.16xlarge" cannot be satisfied.
        {"label_selector": {"instance_type": "m5.large"}},
        # Finally, fall back to an empty set of labels (no constraints)
        # if neither desired m5 type can be satisfied.
        {"label_selector": {}},
    ],
)
class A:
    pass
```

In the example above, the top-level `label_selector` is tried first. The scheduler then iterates through each dict in `fallback_strategy` and attempts to schedule using the label selector specified there (first `{"instance_type": "m5.large"}`, then the empty set). The first `label_selector` that can be satisfied is used.

#51564

---------

Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
Co-authored-by: Mengjin Yan <mengjinyan3@gmail.com>
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
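The first-satisfied selection described above can be sketched in plain Python. This is a simplified simulation, not Ray's actual scheduler (the real logic lives in the C++ cluster resource scheduler); the node representation and both helper functions are hypothetical:

```python
# Minimal simulation of first-satisfied fallback selection.
# Each "node" is just a dict of labels; a selector is satisfied if
# every key/value pair in it matches some node.

def selector_satisfied(selector, nodes):
    """Return True if any node matches every label in the selector."""
    return any(
        all(node.get(k) == v for k, v in selector.items())
        for node in nodes
    )

def pick_selector(label_selector, fallback_strategy, nodes):
    """Try the primary selector, then each fallback dict in order."""
    candidates = [label_selector] + [
        opts["label_selector"] for opts in fallback_strategy
    ]
    for selector in candidates:
        if selector_satisfied(selector, nodes):
            return selector
    return None  # infeasible: no selector matched any node

nodes = [{"instance_type": "m5.large"}, {"instance_type": "c5.xlarge"}]
chosen = pick_selector(
    {"instance_type": "m5.16xlarge"},
    [{"label_selector": {"instance_type": "m5.large"}},
     {"label_selector": {}}],
    nodes,
)
print(chosen)  # the "m5.large" fallback wins
```

Note that the empty selector `{}` is satisfied by any node, which is why it works as a final "no constraints" fallback.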
cc: @MengjinYan I rebased and fixed all the merge conflicts.
Add tests and fix scheduling logic; remove cgroup change; fix merge.

Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
Oh, this is because we sort the values inside
I think we actually want the sorting, since the values are stored in an unordered set, but I fixed the tests to account for this in bdf0ef8.
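A quick illustration of why the tests should compare against sorted values (a generic Python sketch of the issue; in Ray the selector values live in a C++ unordered set, so their iteration order is not stable):

```python
# Values stored in a set have no guaranteed iteration order, so a test
# should compare against a sorted (canonical) form rather than assuming
# a fixed ordering of the raw collection.
values = {"m5.16xlarge", "m5.large", "c5.xlarge"}

# Fragile: depends on hash/iteration order.
# assert list(values) == ["m5.16xlarge", "m5.large", "c5.xlarge"]

# Robust: canonicalize before comparing.
assert sorted(values) == ["c5.xlarge", "m5.16xlarge", "m5.large"]
```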
MengjinYan left a comment:
Thanks! Just one minor followup question on the ToProto function.
Yeah, I think we should, since we're writing to the label selector field in a larger proto and this will avoid copies. Done in
edoakes left a comment:
Looks great. Only nits, which can be addressed in follow-up PRs.
```
def test_fallback_strategy(cluster_with_labeled_nodes):
    # Create a RayCluster with labelled nodes.
```

Suggested change:

```
-    # Create a RayCluster with labelled nodes.
+    # Create a RayCluster with labeled nodes.
```
```
# Assert that the actor was scheduled on the expected node.
assert ray.get(label_selector_actor.get_node_id.remote(), timeout=5) == gpu_node
```

The timeout is a little tight (CI can be slow), would loosen it.
```
# Assert that the actor was scheduled on the expected node.
assert ray.get(label_selector_actor.get_node_id.remote(), timeout=5) in {
    node_1,
    node_2,
    node_3,
}
```

Consider making the test more deterministic by scheduling 3 actors in parallel, each of which occupies all CPUs on one node, and asserting that all 3 nodes are occupied by one of the actors.
```
std::vector<std::reference_wrapper<const LabelSelector>> label_selectors;
label_selectors.push_back(std::cref(lease_spec.GetLabelSelector()));
```

I've never seen `reference_wrapper` and `std::cref` before -- is this just a special way to have a vector of const refs and avoid copying into the vector?

Yeah, exactly. `std::vector<const LabelSelector&>` wouldn't compile, so I found those helpers. I only wanted to store the references to avoid copying unnecessarily into one list when we already had the original objects.
```
// Use the label selector from the highest-priority fallback that was feasible.
// There must be at least one feasible node and selector.
```

Suggested change:

```
-// There must be at least one feasible node and selector.
+// There must be at least one feasible node and selector, else we would have returned early above.
```
Nice, thank you! I'll fix the typo/comment and update the tests in a follow-up PR.
This PR also updates the cluster resource scheduler logic to account for the list of `LabelSelector`s specified by the `fallback_strategy`, falling back to each fallback `LabelSelector` in order until one is satisfied when selecting the best node. We support fallback selectors by considering them in the cluster resource scheduler in order, using the existing label selector logic in `IsFeasible` and `IsAvailable` and returning the first valid node returned by `GetBestSchedulableNode`.

ray-project#51564

---------

Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>
Signed-off-by: Ryan O'Leary <113500783+ryanaoleary@users.noreply.github.com>
Co-authored-by: Mengjin Yan <mengjinyan3@gmail.com>
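The scheduler-side behavior described above can be approximated in plain Python. This is a hedged sketch, not Ray's implementation: the node records and the `matches`/`get_best_schedulable_node` helpers are hypothetical stand-ins for the C++ `IsFeasible`/`IsAvailable`/`GetBestSchedulableNode` logic:

```python
# Simplified model of in-order fallback in the cluster resource scheduler.
# Each node has labels plus available CPUs; a selector is feasible on a
# node if its labels match, and schedulable if CPUs are also available.

def matches(selector, node):
    """Return True if the node's labels satisfy every selector constraint."""
    return all(node["labels"].get(k) == v for k, v in selector.items())

def get_best_schedulable_node(nodes, selectors, num_cpus):
    """Try each label selector in priority order; return the first node
    that satisfies the selector and has the requested CPUs available."""
    for selector in selectors:
        for name, node in nodes.items():
            if matches(selector, node) and node["available_cpus"] >= num_cpus:
                return name, selector
    return None, None  # no feasible node for any selector

nodes = {
    "node-1": {"labels": {"instance_type": "m5.large"}, "available_cpus": 4},
    "node-2": {"labels": {"instance_type": "c5.xlarge"}, "available_cpus": 8},
}
selectors = [
    {"instance_type": "m5.16xlarge"},  # primary selector: infeasible here
    {"instance_type": "m5.large"},     # first fallback: satisfied
    {},                                # final fallback: no constraints
]
node, used = get_best_schedulable_node(nodes, selectors, num_cpus=2)
print(node, used)  # node-1 is chosen via the first fallback selector
```

The key property, matching the PR description, is that lower-priority fallbacks are consulted only after every higher-priority selector has failed.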
Why are these changes needed?

This PR also updates the cluster resource scheduler logic to account for the list of `LabelSelector`s specified by the `fallback_strategy`, falling back to each fallback `LabelSelector` in order until one is satisfied when selecting the best node. We support fallback selectors by considering them in the cluster resource scheduler in order, using the existing label selector logic in `IsFeasible` and `IsAvailable` and returning the first valid node returned by `GetBestSchedulableNode`.

Related issue number

#51564

Checks

- I've signed off every commit (`git commit -s`) in this PR.
- I've run `scripts/format.sh` to lint the changes in this PR.
- If I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.