Conversation
|
Test FAILed. |
|
@Alisahhh when it's ready can you add a python example script to the PR description so others can easily know how to try it out? |
|
Sure~I will add some example when i finish python api :)
Eric Liang <notifications@github.com> 于 2020年6月13日周六 上午3:05写道:
… @Alisahhh <https://github.com/Alisahhh> when it's ready can you add a
python example script to the PR description so others can easily know how
to try it out?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#8914 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AHIO5OYUSVB7WU64Y5YJ7IDRWJ37PANCNFSM4N4KU6XA>
.
|
|
Sounds great!
Eric
…On Sun, Jun 14, 2020 at 7:07 PM Alisa ***@***.***> wrote:
Sure~I will add some example when i finish python api :)
Eric Liang ***@***.***> 于 2020年6月13日周六 上午3:05写道:
> @Alisahhh <https://github.com/Alisahhh> when it's ready can you add a
> python example script to the PR description so others can easily know how
> to try it out?
>
> —
> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub
> <#8914 (comment)>,
or
> unsubscribe
> <
https://github.com/notifications/unsubscribe-auth/AHIO5OYUSVB7WU64Y5YJ7IDRWJ37PANCNFSM4N4KU6XA
>
> .
>
—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
<#8914 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAADUSRGVQQXC4BVFIMZBT3RWV66DANCNFSM4N4KU6XA>
.
|
|
Test FAILed. |
|
Test FAILed. |
|
Test FAILed. |
|
Test FAILed. |
|
Test FAILed. |
ericl
left a comment
There was a problem hiding this comment.
Hey @Alisahhh one big question I have from skimming the header files is: why not represent the bundle id as std::pair<PlacementGroupId, int>? That seems simpler and ensures you can always find the placement group given the bundle identifier.
| RAY_LOG(DEBUG) << "bundle lease request " << bundle_spec.BundleId(); | ||
| auto resource_ids = ScheduleBundle(cluster_resource_map_, bundle_spec); | ||
| if (resource_ids.AvailableResources().size() == 0) { | ||
| send_reply_callback(Status::OutOfMemory("reserve resource failed"), nullptr, nullptr); |
There was a problem hiding this comment.
This should be returned as a boolean field in the proto message indicating failure, not a system-level OOM exception.
| message Bundle { | ||
| bytes bundle_id = 1; | ||
| map<string, double> unit_resources = 2; | ||
| int64 unit_count = 3; |
There was a problem hiding this comment.
Do we need unit count in the first iteration? Consider leaving it out and implementing it later if needed.
| // State of a placement group. | ||
| enum PlacementGroupState { | ||
| // Placement Group is pending or scheduling | ||
| PENDING = 0; |
There was a problem hiding this comment.
Does it means it goes back to PENDING if reconstructing?
There was a problem hiding this comment.
Yep, if a placement group creating failed, it will be added into the tail of the queue and the status will keep PENDING
| // Service for placement group info access. | ||
| service PlacementGroupInfoGcsService{ | ||
| // Create placement group via gcs service | ||
| rpc CreatePlacementGroup(CreatePlacementGroupRequest) returns (CreatePlacementGroupReply); |
There was a problem hiding this comment.
How do we check the status of a placement group? Or is that for a later PR?
There was a problem hiding this comment.
I will add the check in next pr. It looks like an actor creation in the core worker callback.
We have discussed this question last week and given multi plan. Now the bundle id will be given when the bundle creates which is earlier than the placement group creates. |
|
What if someone wants to place into any bundle in the group or some range
of bundles? That could be represented as a placement group id with no index
restriction.
I think passing around raw bundle IDs is problematic for that reason, and
also because it's lower level than placement group.
…On Sun, Jun 21, 2020, 7:55 PM Alisa ***@***.***> wrote:
Hey @Alisahhh <https://github.com/Alisahhh> one big question I have from
skimming the header files is: why not represent the bundle id as std::pair<PlacementGroupId,
int>? That seems simpler and ensures you can always find the placement
group given the bundle identifier.
We have discussed this question last week and given multi plan. Now the
bundle id will be given when the bundle creates which is earlier than the
placement group creates.
BTW, I think a placement group is a virtual concept, so the bundle should
not bound to the placement group?
—
You are receiving this because you were assigned.
Reply to this email directly, view it on GitHub
<#8914 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAADUSQQIDMJQATS2OJF6BTRX3B2LANCNFSM4N4KU6XA>
.
|
Sure,
I want to check if you want I assign the BundleID like PlacementID+index, which is like the actor id has the information about task id and worker id. |
I don't think we need a "BundleID" at all. The way I think of it there are two cases here:
@Alisahhh let me know if you see any problems. |
|
@ericl are you mainly concerned about exposing BundleId to the user API? What about just using the placement group and bundle object in the API? placement_group = ray.create_placement_group([{"CPU": 1}, {"CPU": 2}]) # Type of `placement_group` is `PlacementGroup`.
bundle0 = placement_group.get_bundle(0) # Type of `bundle` is `Bundle`.
# You can pass either a `PlacementGroup` or a `Bundle` object when creating an actor.
actor1 = MyActor.options(placement=placement_group).remote()
actor2 = MyActor.options(placement=bundle0).remote() |
|
Hao, that's right. Two concerns:
1. The user API as you mentioned. I agree with the proposal you have.
2. The internal representation. I don't see a compelling reason why bundles
should exist separately as an entity independent from placement groups. If
they must be always be referred to via index, that seems cleaner to me and
less error prone (you don't need inverse mappings of bundles to groups and
so on).
…On Tue, Jun 23, 2020, 3:33 AM Hao Chen ***@***.***> wrote:
@ericl <https://github.com/ericl> are you mainly concerned about exposing
BundleId to the user API?
I agree that it's not a good idea to expose BundleId. Ideally, we
shouldn't expose PlacementGroupId either (due to the same reason why we
change ObjectId to ObjectRef).
What about just using the placement group and bundle object in the API?
placement_group = ray.create_placement_group([{"CPU": 1}, {"CPU": 2}]) # Type of `placement_group` is `PlacementGroup`. bundle0 = placement_group.get_bundle(0) # Type of `bundle` is `Bundle`.# You can pass either a `PlacementGroup` or a `Bundle` object when creating an actor.actor1 = MyActor.options(placement=placement_group).remote()actor2 = MyActor.options(placement=bundle0).remote()
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#8914 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAADUSTDWJWLNTFCXFA43G3RYCAGFANCNFSM4N4KU6XA>
.
|
|
In the future, when we have referencing counting for PlacementGroup handles, having bundles be implemented as a placement group + index also simplifies the situation. The entire placement group is referenced if any bundle is. |
Why are these changes needed?
To alloc resources in node before creating actors and store the resources map in gcs server.
Now I almost finish the cpp part.
TODO:
NOTES:
Related issue number
Checks
scripts/format.shto lint the changes in this PR.