Skip to content

[WIP] [EVENT] basic function and defination#9612

Closed
Basasuya wants to merge 1 commit intoray-project:masterfrom
Basasuya:master
Closed

[WIP] [EVENT] basic function and defination#9612
Basasuya wants to merge 1 commit intoray-project:masterfrom
Basasuya:master

Conversation

@Basasuya
Copy link
Copy Markdown
Contributor

Why are these changes needed?

Related issue number

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/latest/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failure rates at https://ray-travis-tracker.herokuapp.com/.
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested (please justify below)

@AmplabJenkins
Copy link
Copy Markdown

Can one of the admins verify this patch?

Copy link
Copy Markdown
Contributor

@rkooo567 rkooo567 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a few comments that are for discussion!

Comment thread src/ray/util/event.cc
<< Event_SourceType_Name(event.source_type()) + separator_
<< event.source_hostname() + separator_
<< std::to_string(event.source_pid()) + separator_
<< Event_Severity_Name(event.severity()) + separator_
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we make it like json format, so that it is easy to parse?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, this function is a virtual function that can be inherited. BTW, If we intend to write files, The JSON format will write more bytes than the current implementation. In our internal system, the monitor backend prefers to read the separator to analyze the event file.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should just use JSON as a standard format instead of inventing our own. This will make it easier for OSS users to integrate with and is similar to other structured logging formats (e.g., https://github.com/uber-go/zap)

Copy link
Copy Markdown
Contributor

@rkooo567 rkooo567 Jul 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not use the format that is specific for Ant's internal systems in OSS. It should be the format that is more generally used by the industry. (I guess you guys can have a separate implementation to use this separator).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@edoakes I think the example of zap is convinced for us. We plan to change to the JSON format.
I think there is a small question about If the user prints multiple lines for example:
RAY_EVENT(INFO, "label") << "process 1\nprocess 2\nprocess3"
How do we solve this situation, should we print multiple lines?

Comment thread src/ray/util/event.cc
return result.str();
}

void RayEventContext::SetEventContext(rpc::Event_SourceType source_type,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @edoakes I think it is better having more flexible map labels rather than 4 hardcoded labels. The idea is we always have global level labels per process (like these 4), and support additional global level labels or custom labels at each event.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we prepare to add a map<string, string> to store the custom labels
maybe the user can use RayEventContext::Instance()::SetCustomContext(std::string, std::string)
I will modify this week.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I would much prefer to just have a map of labels. We can autopopulate a few of them but I don't see a reason to have these hardcoded.

Comment thread src/ray/util/event.h
bool IsEmpty() { return reporter_map_.empty(); }

void Publish(rpc::Event &event) {
for (const auto &element : reporter_map_) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you define this in the header?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I will fix it this week.

Comment thread src/ray/util/event.h
return *this;
}

static void ReportEvent(std::string severity, std::string label, std::string message) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @raulchen @edoakes

Is it the best practice to just send messages for events? In this case, I really cannot find differences between logs and events except that the cardinality is lower. I had some impression that we should somehow pre-define events with some description, so that we can prevent using tons of events in the repo? Idk what's the best practice in the industry, so let me know what you think about this.

Comment thread BUILD.bazel
Comment on lines 1408 to +1411
":sha256",
"@boost//:asio",
"@boost//:filesystem",
":event_cc_proto",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

     ":event_cc_proto",
     ":sha256",
    "@boost//:asio",	        
    "@boost//:asio",
    "@boost//:filesystem",

Comment thread src/ray/util/event.cc
Comment on lines +73 to +74
RAY_CHECK(rpc::Event_SourceType_IsValid(RayEventContext::Instance().GetSourceType()));
RAY_CHECK(rpc::Event_Severity_IsValid(severity_));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lint

Comment thread src/ray/util/event.cc

std::string event_id_buffer = std::string(18, ' ');
FillRandom(&event_id_buffer);
constexpr char hex[] = "0123456789abcdef";
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could reuse util function rather than duplicated implementation.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the internal system, we use the StringToHex function. But I find there is no function in the OSS.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the internal system, we use the StringToHex function. But I find there is no function in the OSS.

You can put these utils function in util.h as well, which makes sense for ray overall.

Comment thread src/ray/util/event.h
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#ifndef RAY_EVENT_H_
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#pragma once

Comment thread src/ray/util/event.h

class EventManager {
public:
static EventManager &Instance() {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no such shutdown or deconstrutor for EventManager.
We'd better use smart pointer for singleton instance so sources or somethings like could be released after process halted, and we can reset it in-fly.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use clearReporters function, we can shut down the EventManager. Maybe I should modify the clearReporters to shutdown function.

Comment thread src/ray/util/event.h
std::string task_id_ = "";
rpc::Event_SourceType source_type_ = rpc::Event_SourceType::Event_SourceType_COMMON;
std::string source_hostname_ = boost::asio::ip::host_name();
int32_t source_pid_ = getpid();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move it to constructor

Comment thread src/ray/util/event.h
std::unordered_map<std::string, std::shared_ptr<LogBasedEventReporter>> reporter_map_;
};

class RayEventContext {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make it finalized with class RayEventContext final.

Comment thread src/ray/util/event.h

inline void SetTaskID(std::string task_id) { task_id_ = task_id; }

inline std::string GetJobID() { return job_id_; }
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inline const std::string& GetJobID() const { return job_id_; }

same issue in other Get/Set.

@AmplabJenkins
Copy link
Copy Markdown

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/28715/
Test FAILed.

@Basasuya Basasuya changed the title [EVENT] basic function and defination [WIP] [EVENT] basic function and defination Jul 22, 2020
Comment thread src/ray/util/event.cc
return result.str();
}

void RayEventContext::SetEventContext(rpc::Event_SourceType source_type,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I would much prefer to just have a map of labels. We can autopopulate a few of them but I don't see a reason to have these hardcoded.

Comment thread src/ray/util/event.cc
<< Event_SourceType_Name(event.source_type()) + separator_
<< event.source_hostname() + separator_
<< std::to_string(event.source_pid()) + separator_
<< Event_Severity_Name(event.severity()) + separator_
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should just use JSON as a standard format instead of inventing our own. This will make it easier for OSS users to integrate with and is similar to other structured logging formats (e.g., https://github.com/uber-go/zap)

Comment thread src/ray/util/event.h
Comment on lines +104 to +120
inline void SetJobID(std::string job_id) { job_id_ = job_id; }

inline void SetNodeID(std::string node_id) { node_id_ = node_id; }

inline void SetTaskID(std::string task_id) { task_id_ = task_id; }

inline std::string GetJobID() { return job_id_; }

inline std::string GetNodeID() { return node_id_; }

inline std::string GetTaskID() { return task_id_; }

inline rpc::Event_SourceType GetSourceType() { return source_type_; }

inline std::string GetSourceHostname() { return source_hostname_; }

inline int32_t GetSourcePid() { return source_pid_; }
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need all of these getters and setters instead of just a simple constructor?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants