Description
After reading the "Affinity and Anti-affinity API" doc (https://docs.google.com/document/d/1S-EXv_PX3ZMcXxPjor6Z4qMQ_3Hh785SXYlk9mWtUbg/edit#heading=h.fm692ikh5723), I'd still like to propose an alternative solution:
```python
a = actorA.remote(...)
b = actorB.remote(...)
c = actorC.remote(...)
ray.rebalance(Policy.disjoint, [a, b, c])
# ... later, at any point in time
ray.rebalance(Policy.colocate, [a, c])
```
In the example above, users create the actors as usual. Instead of giving a placement hint at creation time, e.g. `actorB.at(a).remote()`, I think it makes more sense to give users the flexibility to rebalance existing actors at runtime under a chosen policy. Here we provide two policies, `disjoint` and `colocate`, which are applied to the actor handle lists that follow.
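As a sketch of what the proposed surface could look like (note that `Policy`, `rebalance`, and the returned entry format are all assumptions made for illustration, not existing Ray APIs):

```python
from enum import Enum

class Policy(Enum):
    # Hypothetical placement policies from this proposal.
    disjoint = "disjoint"   # the listed actors must land on different nodes
    colocate = "colocate"   # the listed actors must share one node

def rebalance(policy, actor_handles):
    """Build the policy entry for a list of actor handles.

    In a real implementation this entry would be written to a policy
    table in the GCS (see the rough design later in this proposal);
    here it is simply returned for inspection.
    """
    assert isinstance(policy, Policy)
    return {
        "policy": policy.value,
        # Fall back to id() for plain objects; real handles carry actor ids.
        "actors": [getattr(a, "actor_id", id(a)) for a in actor_handles],
    }
```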
Pro:
- We remove the restriction of using `at()` in an actor-to-actor/peer-to-peer manner and instead introduce policies, which apply to a whole set of previously created actors.
- Users can call `rebalance` at any point in time when they need to adjust actor placement and performance at runtime.
Con:
- Since `ray.rebalance()` is invoked after the `createActor` tasks have been issued to the raylets, the actors may already have been deployed on nodes before we have a chance to rebalance them. The raylets would then need to kill the actors that were placed contrary to the policy and re-create/rebalance them according to it.
Once #3322 is finished, Ray will be able to move and re-create actors on different nodes after a node crash or failover; that capability could be reused to achieve policy-based load balancing at runtime. The main question, I think, is how to reduce the cost of shifting actors between nodes.
The rough design would be the following:
- Create a policy table in the GCS. `ray.rebalance(Policy.colocate, [...])` would write an entry into this table.
- Each raylet connects to the GCS and registers a callback that listens for insertions or updates on the policy table.
- Upon a new insertion, each raylet checks whether the entry involves itself. In the colocate case, a raylet would try to allocate resources for the given list of actors in an all-or-nothing manner; in the disjoint case, a raylet that hosts two 'mutual' actors would kill one of them and hand the task of re-creating the victim to another raylet with enough resources.
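A minimal sketch of the per-raylet callback logic described above. The `Raylet` class, `on_policy_entry`, and the CPU-demand map are hypothetical names invented for illustration; they are not part of Ray's codebase:

```python
from dataclasses import dataclass, field

@dataclass
class Raylet:
    node_id: str
    available_cpus: float
    local_actors: set = field(default_factory=set)

def on_policy_entry(raylet, entry, cpu_demand):
    """React to one policy-table insertion.

    Returns the list of actor ids that must be (re-)created on some
    other raylet as a result of this raylet's local action.
    """
    local = [a for a in entry["actors"] if a in raylet.local_actors]
    if not local:
        return []  # this raylet is not involved in the entry
    if entry["policy"] == "colocate":
        missing = [a for a in entry["actors"] if a not in raylet.local_actors]
        needed = sum(cpu_demand[a] for a in missing)
        # All-or-nothing: claim resources only if every missing actor fits.
        if raylet.available_cpus >= needed:
            raylet.available_cpus -= needed
            raylet.local_actors.update(missing)
        return []
    if entry["policy"] == "disjoint":
        # Keep one of the 'mutual' actors; evict the rest so they can be
        # re-created on raylets with enough spare resources.
        victims = local[1:]
        for v in victims:
            raylet.local_actors.discard(v)
            raylet.available_cpus += cpu_demand[v]
        return victims
    return []
```

For example, a raylet hosting two actors covered by a `disjoint` entry would evict one of them and report it back for re-creation elsewhere, while a `colocate` entry only takes effect on a raylet that can absorb every missing actor at once.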
The reason I think it makes more sense to apply affinity or anti-affinity at runtime is that a cluster does not always maintain good resource utilization. Consider the following scenarios:
- You may have deployed one job in a Ray cluster, with all its actors balanced at creation time, and later another job is submitted to the same cluster.
- After several rounds of creation-time balancing, some jobs terminate or are killed, leaving the cluster unbalanced. Hence it would be good to have runtime load balancing.