[EVENT] Basic Function and Definition#9657
Conversation
|
Can one of the admins verify this patch? |
|
Test FAILed. |
|
Some questions.
|
#9612 (comment) #9612 (comment) #9612 (comment) |
fa08d95 to
63e145f
Compare
|
Test FAILed. |
|
Test FAILed. |
|
@Basasuya I meant address all comments in the previous PR! (so that Edward and Lingxuan doesn't need to have the same review). I just would like to have your confirmation, so that we can resume review after that (or reply here if you have questions). Also, thanks for your responses to my msgs! |
It was more like a question (not really a review). I am not super familiar with events, so I am not sure if my concern is valid. Also, I generally agree with the whole structures. It is just about how we record events. Here was my thought process. When I see description of what events are about, it gives me impression that events should
And it makes me feel like events should be pre-defined with description/schema. For example, when actors are created, we only need these events. Adding more is probably unnecessary. ACTOR_CREATION(description="Actor creation process")
ACTOR REGISTRATION INITIATED (parent=ACTOR CREATION, sev=INFO, description="events that indicate actor registration process is initiated", actor_id="123")
DEPENDECY_RESOLVED (parent=ACTOR CREATION, sev=INFO, actor_id)
SCHEDULING REQUSET TO RAYLET (parent=ACTOR CREATION, sev=INFO, actor_id)
SCHEDULING CONFIRMED (parent=ACTOR CREATION, sev=INFO, actor_id)
CREATING ACTOR TO NODE (parent=ACTOR CREATION, sev=INFO, actor_id)
ACTOR CREATION FAILED (parent=ACTOR CREATION, sev=ERROR, actor_id)
ACTOR REGISTRATION DONE (parent=ACTOR CREATION, sev=INFO, actor_id)
// Sorry the syntax is weird, but it is how I imagine to use this.
EVENT(DEPENDECY_RESOLVED(actor_id="")) << "msg"Here are some problems I can imagine with the current APIs.
|
I don't know how to release all the reviews in the previous PR = =. Do you mean just copy these links? LingXuan's Edward's Sangbin's |
|
I think this question is about whether we need to make the event more structured. I think, first of all, the event call should be buried in the ray core carefully, not just like LOG. If you need to create a high-level API like ACTOR_CREATION, DEPENDECY_RESOLVED, and so on. it should be defined by the core_worker itself, This can be achieved by designing another layer above this event framework. Second, users may abuse this API, I think some tags can be set to distinguish these events (such as label). We can also use a method like the LOG severity level. |
63e145f to
db7e2b5
Compare
|
Test FAILed. |
|
@Basasuya I agree we can build that abstraction in the higher layer. Also, I didn't mean to port those comments to here. I just meant "please make sure to address those comments in this PR!" |
rkooo567
left a comment
There was a problem hiding this comment.
I will leave more comments later tomorrow! Also, please add newlines at the end of the file!
|
Test FAILed. |
|
@Basasuya IIRC the previous PR was just to show the concept/API for this. Should I review this one assuming we're actually going to merge it? |
Yes, we should discuss on this PR. This PR involves the basic level design of the event framework. |
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
|
Test FAILed. |
|
Test FAILed. |
|
Test FAILed. |
|
Test FAILed. |
|
Test FAILed. |
|
There's a lint failure. |
|
Test FAILed. |
|
Test FAILed. |
fixed |
This PR is moved from the https://github.com/ray-project/ray/pull/9612/commits
The previous PR is closed
Why are these changes needed?
Ray needs an event framework to report critical information, this information includes application-level data like job-condition, node adds/remove, alert, error, and so on. event messages can be sent to different destinations such as the dashboard server, the file. Users can see this information in real-time, or analyze the historical information after the program ends.
Related issue number
Checks
scripts/format.shto lint the changes in this PR.