Skip to content
This repository was archived by the owner on Jun 21, 2024. It is now read-only.

Ring buffer-based runtime tracing (eventring)#793

Closed
sadiqj wants to merge 46 commits intoocaml-multicore:5.00from
sadiqj:eventring_multicore
Closed

Ring buffer-based runtime tracing (eventring)#793
sadiqj wants to merge 46 commits intoocaml-multicore:5.00from
sadiqj:eventring_multicore

Conversation

@sadiqj
Copy link
Copy Markdown
Collaborator

@sadiqj sadiqj commented Dec 15, 2021

Summary

Eventring is a low-overhead runtime tracing system designed for continuous monitoring of OCaml applications. It utilises per-domain memory-mapped ring buffers that enable consumption of tracing events both in and out of process. It replaces the existing eventlog system already present in runtime.

This PR

This is a draft PR as this eventring work has been built atop of the multicore 5.00 tree. The intention here is to gather feedback on the design and implementation and then propose a rebased-PR on ocaml/ocaml once the MVP has been merged.

Ring buffer

The ring buffer is structured as a flight recorder, overwriting old data when is insufficient space to write new events. This enables users to potentially only read the ring when some anomalous event occurs or a performance target is missed. No coordination is needed with consumers who read the events and then discard events where a race with the producer is detected.

Probes

Most of the existing instrumented runtime eventlog probes are moved in to the normal runtime. Only the most frequently hit probes on very hot paths (like allocation and marking) were left in the instrumented runtime. The probes themselves are very lightweight, having almost no cost (~2ns) when eventlog has not been started or is paused. When active, a span probes takes around ~50ns to emit the begin and end events together. A benchmark to reproduce these numbers is here. Most of this cost is actually getting the current timestamp and it's possible there's some options we could add to reduce that cost for certain use cases.

API

In terms of API, there is a an OCaml API where users can create a cursor for a particular set of eventrings (the current process or an external one), register callbacks for certain types of events and then poll with the cursor.

There is also a C API which mirrors the OCaml API and enables very low impact consumption of events for those use cases that require it.

Future

This implementation is designed to be open for extension to user tracing events in the future, as well as integration with statmemprof (e.g continuous allocation profiling).

Examples

Emitting minor and major slice GC pause times to stdout

Here's a simple consumer that prints the times of minor and major slices. It omits any support for multiple domains, error handling or lost events:

let minor_started = ref Int64.zero

let runtime_begin _ ts phase =
    match phase with
    | Eventring.EV_MAJOR_SLICE ->
        begin
          major_started := Eventring.Timestamp.to_int64 ts
        end
    | Eventring.EV_MINOR ->
        begin
          minor_started := Eventring.Timestamp.to_int64 ts
        end
    | _ -> ()


let ts_to_duration_ns ts0 ts1 =
  Int64.sub (Eventring.Timestamp.to_int64 ts0) ts1

let runtime_end _ ts phase =
    match phase with
    | Eventring.EV_MAJOR_SLICE ->
        Printf.printf "Major slice: %Ldns\n"
          (ts_to_duration_ns ts !major_started)
    | Eventring.EV_MINOR ->
        Printf.printf "Minor: %Ldns\n"
          (ts_to_duration_ns ts !minor_started)
    | _ -> ()

let () =
    let list_ref = ref [] in
    let cursor = Eventring.create_cursor None in
    let callbacks = Eventring.Callbacks.create ~runtime_begin ~runtime_end ()
    in
    while true do
        for _ = 1 to 100 do
            list_ref := [];
            for _ = 1 to 10 do
                list_ref := (Sys.opaque_identity(ref 42)) :: !list_ref
            done;
            Gc.full_major ();
        done;
        ignore(Eventring.read_poll cursor callbacks None);
        Unix.sleep 1
    done

When compiled and run with this PR's branch you will get an output like:

Minor: 1496ns
Minor: 1341ns
Minor: 1308ns
Major slice: 17605ns
Minor: 1435ns
Minor: 2986ns
Minor: 2688ns
Major slice: 26577ns
Minor: 2263ns
Minor: 1098ns
Minor: 973ns
Major slice: 17719ns
...

Code such as the above could be spawned in a separate domain and send metrics data to an external service (such as graphite or prometheus) over a socket.

Chrome tracing

This is a simplified example of how Eventring can be used to log out runtime data in a format that can be simply ingested by Chrome's trace viewer. Running the above project with OCAML_EVENTRING_START=1 _build/default/src/traceevents_lib.exe and then loading the resultant json file in to Chrome's tracing viewer will give something like the following:

Example of chrome tracing outpu

Tests

The tests/eventring directory contains some other exampel of using the different APIs.

PR structure

The producer

The producer code is contained here . By default an .eventring file is created in the current directory (overridable by setting OCAML_EVENTRING_PREFIX). This file contains a ring buffer for each possible domain (Max_domains). It is laid out in a structure that enables sparsity - on-disk (or in-memory) footprint is proportional to the max number of concurrent domains the process has ever run.

The consumer internals

Consumer code is contained in eventring_consumer.c. This contains both the implementations for the OCaml API and C API (the former being a thin layer over the latter).

APIs

The OCaml API is available in Eventring.mli, the C API is available in eventring.h.

@sadiqj
Copy link
Copy Markdown
Collaborator Author

sadiqj commented Jan 11, 2022

Closing this for now, in preparation for a rebase and a PR on ocaml/ocaml.

@sadiqj sadiqj closed this Jan 11, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant