[WIP] [EVENT] basic function and defination by Basasuya · Pull Request #9612 · ray-project/ray

Basasuya · 2020-07-21T12:56:35Z

Why are these changes needed?

Related issue number

Checks

- I've run scripts/format.sh to lint the changes in this PR.
- I've included any doc changes needed for https://docs.ray.io/en/latest/.
- I've made sure the tests are passing. Note that there might be a few flaky tests; see the recent failure rates at https://ray-travis-tracker.herokuapp.com/.
- Testing Strategy
  - Unit tests
  - Release tests
  - This PR is not tested (please justify below)

AmplabJenkins · 2020-07-21T12:57:34Z

Can one of the admins verify this patch?

rkooo567

I added a few comments that are for discussion!

rkooo567 · 2020-07-21T16:53:15Z

+         << Event_SourceType_Name(event.source_type()) + separator_
+         << event.source_hostname() + separator_
+         << std::to_string(event.source_pid()) + separator_
+         << Event_Severity_Name(event.severity()) + separator_


Why don't we use JSON here, so that the output is easy to parse?

First, this function is virtual, so subclasses can override it. Also, if we write events to files, JSON will take more bytes than the current format. In our internal system, the monitoring backend parses the event file by splitting on the separator.

We should just use JSON as a standard format instead of inventing our own. This will make it easier for OSS users to integrate with and is similar to other structured logging formats (e.g., https://github.com/uber-go/zap)

We should not use a format that is specific to Ant's internal systems in OSS. It should be a format that is more generally used in the industry. (I guess you can keep a separate implementation that uses this separator.)

@edoakes The zap example is convincing. We plan to switch to the JSON format.
One small question remains: what if the user logs a message that spans multiple lines, for example:
RAY_EVENT(INFO, "label") << "process 1\nprocess 2\nprocess3";
How should we handle this case? Should we print multiple lines?
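
One possible answer to the multi-line question, sketched under assumptions: if each event is written as a single JSON object per line, embedded newlines in the message can be escaped as `\n` inside the JSON string, so the file stays one event per line. `EscapeJson` and `EventToJsonLine` are hypothetical names for illustration and are not part of this PR; a real implementation would more likely use an existing JSON library.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not from the PR): escape a string so it can be
// embedded inside a JSON string literal. Newlines become the two
// characters '\' and 'n', so the output never contains a raw line break.
std::string EscapeJson(const std::string &in) {
  std::string out;
  out.reserve(in.size());
  for (unsigned char c : in) {
    switch (c) {
      case '"':  out += "\\\""; break;
      case '\\': out += "\\\\"; break;
      case '\n': out += "\\n";  break;
      case '\r': out += "\\r";  break;
      case '\t': out += "\\t";  break;
      default:
        if (c < 0x20) {
          // Remaining control characters use the \u00XX form.
          char buf[7];
          std::snprintf(buf, sizeof(buf), "\\u%04x", c);
          out += buf;
        } else {
          out += static_cast<char>(c);
        }
    }
  }
  return out;
}

// Hypothetical formatter: one event becomes one JSON object on one line.
// The field names here are assumptions chosen for the example.
std::string EventToJsonLine(const std::string &severity,
                            const std::string &label,
                            const std::string &message) {
  return "{\"severity\":\"" + EscapeJson(severity) +
         "\",\"label\":\"" + EscapeJson(label) +
         "\",\"message\":\"" + EscapeJson(message) + "\"}";
}
```

With this approach the message from the example above would be stored as a single line, and a reader that parses the JSON gets the original line breaks back.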

rkooo567 · 2020-07-21T16:54:15Z

+  return result.str();
+}
+
+void RayEventContext::SetEventContext(rpc::Event_SourceType source_type,


cc @edoakes I think it is better to have a flexible map of labels rather than 4 hardcoded labels. The idea is that we always have global labels per process (like these 4), and also support additional global labels or custom labels on each event.

We plan to add a map<string, string> to store the custom labels.
Maybe the user can call RayEventContext::Instance().SetCustomContext(std::string, std::string).
I will make this change this week.

Yes I would much prefer to just have a map of labels. We can autopopulate a few of them but I don't see a reason to have these hardcoded.
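
A minimal sketch of the label-map idea discussed in this thread, assuming per-process global labels that are merged with per-event labels at report time. `LabelContext`, `SetLabel`, and `Merge` are hypothetical names for illustration, not the API this PR ended up with.

```cpp
#include <map>
#include <string>

// Hypothetical sketch (not from the PR): global labels are stored in a map
// instead of fixed fields, and each event can add or override labels.
class LabelContext {
 public:
  // Set or overwrite a global (per-process) label.
  void SetLabel(const std::string &key, const std::string &value) {
    labels_[key] = value;
  }

  // Combine global labels with the labels of a single event.
  // Per-event values win when the same key appears in both maps.
  std::map<std::string, std::string> Merge(
      const std::map<std::string, std::string> &event_labels) const {
    std::map<std::string, std::string> merged = event_labels;
    // std::map::insert keeps existing keys, so event labels are preserved.
    merged.insert(labels_.begin(), labels_.end());
    return merged;
  }

 private:
  std::map<std::string, std::string> labels_;
};
```

Auto-populated labels such as the job ID or hostname would then just be entries written with `SetLabel` at startup.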

rkooo567 · 2020-07-21T16:54:48Z

+  bool IsEmpty() { return reporter_map_.empty(); }
+
+  void Publish(rpc::Event &event) {
+    for (const auto &element : reporter_map_) {


Why do you define this in the header?

OK. I will fix it this week.

rkooo567 · 2020-07-21T16:57:37Z

+    return *this;
+  }
+
+  static void ReportEvent(std::string severity, std::string label, std::string message) {


cc @raulchen @edoakes

Is it best practice to just send free-form messages for events? In that case, I can't really see a difference between logs and events, except that events have lower cardinality. My impression was that we should predefine events with a description, so that we don't end up with tons of ad hoc events in the repo. I don't know what the best practice in the industry is, so let me know what you think.
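
To make the "predefined events" option concrete, here is one possible shape, offered only as a sketch for discussion: events are declared once in an enum, and each one carries a fixed label and description. The names `EventKind`, `EventSpec`, `GetEventSpec`, and the three example events are all hypothetical and do not come from this PR.

```cpp
#include <string>

// Hypothetical catalog of events (illustrative names only).
enum class EventKind { kNodeAdded, kNodeRemoved, kTaskFailed };

// Static metadata attached to each predefined event.
struct EventSpec {
  const char *label;
  const char *description;
};

// Look up the metadata for a predefined event. Adding a new event means
// adding an enum value and a case here, which keeps the set reviewable.
inline EventSpec GetEventSpec(EventKind kind) {
  switch (kind) {
    case EventKind::kNodeAdded:
      return {"NODE_ADDED", "A node joined the cluster."};
    case EventKind::kNodeRemoved:
      return {"NODE_REMOVED", "A node left the cluster."};
    case EventKind::kTaskFailed:
      return {"TASK_FAILED", "A task finished with an error."};
  }
  return {"UNKNOWN", ""};
}
```

The trade-off is the one raised above: a fixed catalog keeps cardinality low and documents every event, while free-form labels are easier to add but harder to distinguish from logs.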

ashione · 2020-07-22T02:33:54Z

        ":sha256",
        "@boost//:asio",
+        "@boost//:filesystem",
+        ":event_cc_proto",


":event_cc_proto", ":sha256", "@boost//:asio", "@boost//:filesystem",

ashione · 2020-07-22T02:35:24Z

+  RAY_CHECK(rpc::Event_SourceType_IsValid(RayEventContext::Instance().GetSourceType()));
+  RAY_CHECK(rpc::Event_Severity_IsValid(severity_));


ashione · 2020-07-22T02:36:37Z

+
+  std::string event_id_buffer = std::string(18, ' ');
+  FillRandom(&event_id_buffer);
+  constexpr char hex[] = "0123456789abcdef";


This could reuse a util function rather than duplicating the implementation.

In our internal system we use a StringToHex function, but I can't find an equivalent in the OSS code.

You can put this util function in util.h as well; that makes sense for Ray overall.
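
A shared helper along these lines could look like the sketch below. The name `StringToHex` comes from the comment above, but its signature and behavior are assumptions here (hex-encode every byte as two lowercase digits); the internal version mentioned in the thread is not shown in this PR.

```cpp
#include <string>

// Assumed behavior: encode each input byte as two lowercase hex digits.
// An 18-byte random buffer would therefore produce a 36-character ID.
inline std::string StringToHex(const std::string &bytes) {
  constexpr char hex[] = "0123456789abcdef";
  std::string out;
  out.reserve(bytes.size() * 2);
  for (unsigned char c : bytes) {
    out.push_back(hex[c >> 4]);    // high nibble
    out.push_back(hex[c & 0x0f]);  // low nibble
  }
  return out;
}
```

The event ID code in the diff could then fill its random buffer and call this helper instead of carrying its own hex table.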

ashione · 2020-07-22T02:37:53Z

+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+#ifndef RAY_EVENT_H_


#pragma once

ashione · 2020-07-22T02:40:50Z

+
+class EventManager {
+ public:
+  static EventManager &Instance() {


There is no shutdown method or destructor for EventManager.
We'd better use a smart pointer for the singleton instance, so that resources and the like can be released when the process exits, and so that we can reset it on the fly.

We can shut down the EventManager with the clearReporters function. Maybe I should turn clearReporters into a shutdown function.
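
As a rough illustration of the smart-pointer suggestion, the sketch below keeps the singleton in a `std::unique_ptr` so it can be destroyed and recreated on demand. `ManagerSketch`, `Shutdown`, `AddReporter`, and `ReporterCount` are hypothetical names for this example; they are not the EventManager API from the PR.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for EventManager, used only to show the pattern.
class ManagerSketch {
 public:
  // Lazily create the instance on first use.
  static ManagerSketch &Instance() {
    std::lock_guard<std::mutex> lock(Mutex());
    std::unique_ptr<ManagerSketch> &ptr = Holder();
    if (!ptr) {
      ptr.reset(new ManagerSketch());
    }
    return *ptr;
  }

  // Destroy the current instance; the next Instance() call builds a new one.
  // References obtained before Shutdown() must not be used afterwards.
  static void Shutdown() {
    std::lock_guard<std::mutex> lock(Mutex());
    Holder().reset();
  }

  void AddReporter(const std::string &name) { reporters_.push_back(name); }
  std::size_t ReporterCount() const { return reporters_.size(); }

 private:
  ManagerSketch() = default;

  static std::unique_ptr<ManagerSketch> &Holder() {
    static std::unique_ptr<ManagerSketch> holder;
    return holder;
  }
  static std::mutex &Mutex() {
    static std::mutex mutex;
    return mutex;
  }

  std::vector<std::string> reporters_;
};
```

Whether a full reset is needed, or a `clearReporters`-style method is enough, depends on what else the manager owns.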

ashione · 2020-07-22T02:42:29Z

+  std::string task_id_ = "";
+  rpc::Event_SourceType source_type_ = rpc::Event_SourceType::Event_SourceType_COMMON;
+  std::string source_hostname_ = boost::asio::ip::host_name();
+  int32_t source_pid_ = getpid();


Move these initializations to the constructor.

ashione · 2020-07-22T02:44:23Z

+  std::unordered_map<std::string, std::shared_ptr<LogBasedEventReporter>> reporter_map_;
+};
+
+class RayEventContext {


Mark it final: class RayEventContext final.

ashione · 2020-07-22T02:46:04Z

+
+  inline void SetTaskID(std::string task_id) { task_id_ = task_id; }
+
+  inline std::string GetJobID() { return job_id_; }


inline const std::string& GetJobID() const { return job_id_; }

Same issue in the other getters and setters.
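
For reference, the pattern suggested here applied to one getter/setter pair might look like the following sketch. `ContextSketch` is a hypothetical class, not the real RayEventContext; the `inline` keyword is omitted because member functions defined inside a class body are implicitly inline.

```cpp
#include <string>
#include <utility>

// Hypothetical class showing the suggested accessor style.
class ContextSketch final {
 public:
  // Take by value and move, so callers passing a temporary avoid a copy.
  void SetJobID(std::string job_id) { job_id_ = std::move(job_id); }

  // Return a const reference and mark the getter const, so it does not
  // copy the string and can be called on const objects.
  const std::string &GetJobID() const { return job_id_; }

 private:
  std::string job_id_;
};
```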

AmplabJenkins · 2020-07-22T03:51:19Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/28715/
Test FAILed.

edoakes · 2020-07-22T18:51:05Z

+  return result.str();
+}
+
+void RayEventContext::SetEventContext(rpc::Event_SourceType source_type,


Yes I would much prefer to just have a map of labels. We can autopopulate a few of them but I don't see a reason to have these hardcoded.

edoakes · 2020-07-22T18:52:19Z

+         << Event_SourceType_Name(event.source_type()) + separator_
+         << event.source_hostname() + separator_
+         << std::to_string(event.source_pid()) + separator_
+         << Event_Severity_Name(event.severity()) + separator_


We should just use JSON as a standard format instead of inventing our own. This will make it easier for OSS users to integrate with and is similar to other structured logging formats (e.g., https://github.com/uber-go/zap)

edoakes · 2020-07-22T18:53:36Z

+  inline void SetJobID(std::string job_id) { job_id_ = job_id; }
+
+  inline void SetNodeID(std::string node_id) { node_id_ = node_id; }
+
+  inline void SetTaskID(std::string task_id) { task_id_ = task_id; }
+
+  inline std::string GetJobID() { return job_id_; }
+
+  inline std::string GetNodeID() { return node_id_; }
+
+  inline std::string GetTaskID() { return task_id_; }
+
+  inline rpc::Event_SourceType GetSourceType() { return source_type_; }
+
+  inline std::string GetSourceHostname() { return source_hostname_; }
+
+  inline int32_t GetSourcePid() { return source_pid_; }


Why do we need all of these getters and setters instead of just a simple constructor?
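
One way to read the "simple constructor" suggestion, sketched under assumptions: the source fields are passed in once at construction and stay immutable afterwards, so no setters are needed. `SourceContextSketch` and its field set are hypothetical and only mirror some of the members visible in the diff.

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical immutable context: every field is set exactly once.
struct SourceContextSketch {
  SourceContextSketch(std::string job_id, std::string node_id,
                      std::string hostname, int32_t pid)
      : job_id(std::move(job_id)),
        node_id(std::move(node_id)),
        source_hostname(std::move(hostname)),
        source_pid(pid) {}

  const std::string job_id;
  const std::string node_id;
  const std::string source_hostname;
  const int32_t source_pid;
};
```

This only fits fields that are fixed for the life of the process; values that change per task, such as a task ID, would still need another mechanism.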

[EVENT] basic function and defination

4c47234

rkooo567 reviewed Jul 21, 2020

ashione reviewed Jul 22, 2020

Basasuya changed the title ~~[EVENT] basic function and defination~~ [WIP] [EVENT] basic function and defination Jul 22, 2020

edoakes reviewed Jul 22, 2020

Basasuya closed this Jul 23, 2020

Basasuya mentioned this pull request Jul 25, 2020

[EVENT] Basic Function and Definition #9657

Merged

6 tasks
