loop,threadpool: collect statistics by jasnell · Pull Request #1764 · libuv/libuv

jasnell · 2018-02-28T16:29:23Z

This is a combined reworking of #1489 (loop stats) and #1528 (threadpool stats) to use a common API pattern. Enabling stats collection is accomplished via uv_loop_configure() for both.

The loop stats impl is simplified to remove the sampling configuration. The callback is triggered at the end of every loop tick. It's up to the receiver to figure out how often they want to pay attention to it.

Stats collection for both the loop and threadpool may be enabled/disabled dynamically after the loop is started.
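As a rough illustration of "the receiver decides how often to pay attention", the sketch below simulates a per-tick stats callback and a receiver that only samples every Nth tick. Everything here is hypothetical: `demo_loop_stats_t`, `demo_sampler_t`, `demo_stats_cb` and `demo_run` are stand-ins written for this example and are not the types or functions added by this PR.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-tick stats payload (not the PR's struct). */
typedef struct {
  uint64_t tick_start;  /* nanoseconds */
  uint64_t tick_end;    /* nanoseconds */
} demo_loop_stats_t;

/* Receiver-side state: only every `interval`-th tick is inspected. */
typedef struct {
  unsigned interval;
  unsigned seen;      /* callbacks received */
  unsigned sampled;   /* callbacks actually processed */
  uint64_t total_ns;  /* summed duration of the sampled ticks */
} demo_sampler_t;

/* Callback shape modeled on the (stats, data) pair shown in the diff. */
static void demo_stats_cb(const demo_loop_stats_t* stats, void* data) {
  demo_sampler_t* s = data;
  s->seen++;
  if (s->seen % s->interval != 0)
    return;  /* the sampling policy lives in the receiver, not in the loop */
  s->sampled++;
  s->total_ns += stats->tick_end - stats->tick_start;
}

/* Simulated loop: fires the callback at the end of every tick. */
static void demo_run(unsigned ticks, uint64_t tick_ns, demo_sampler_t* s) {
  uint64_t now = 0;
  for (unsigned i = 0; i < ticks; i++) {
    demo_loop_stats_t st = { now, now + tick_ns };
    now += tick_ns;
    demo_stats_cb(&st, s);
  }
}
```

With an interval of 4 and 10 simulated ticks, the receiver is called 10 times but processes only ticks 4 and 8.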

pfreixes · 2018-02-28T16:52:15Z

    if ((mode == UV_RUN_ONCE && !ran_pending) || mode == UV_RUN_DEFAULT)
      timeout = uv_backend_timeout(loop);

+    uv__update_stats_ts(loop, poll_start);


Is there anything preventing us from adding a stat for the timeout used during the current tick?

We certainly could add that... I'm curious about the use case for it but it would be trivial to add.

If the poll time is greater than the timeout, this might indicate that the CPU was busy giving time to another process. So the timeout could be used to apply something like this:

sleep_time = min(timeout, poll_time)

So yes, it's a nice-to-have, but with it this close at hand it would be a shame not to ask for it :)
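The `min(timeout, poll_time)` idea above can be sketched as two small helpers. This assumes both values are in milliseconds and that a negative timeout means "block indefinitely", as `uv_backend_timeout()` reports with -1. The names `demo_sleep_time` and `demo_poll_overrun` are made up for this example.

```c
#include <stdint.h>

/* Estimated time the loop really slept in poll.
 * timeout_ms: timeout handed to poll; negative means no timeout.
 * poll_ms:    measured wall-clock time spent in poll. */
static int64_t demo_sleep_time(int timeout_ms, int64_t poll_ms) {
  if (timeout_ms < 0)
    return poll_ms;  /* no upper bound to clamp against */
  return poll_ms < timeout_ms ? poll_ms : timeout_ms;
}

/* Time spent in poll beyond the requested timeout. A large value may hint
 * that the process was descheduled while it was waiting. */
static int64_t demo_poll_overrun(int timeout_ms, int64_t poll_ms) {
  if (timeout_ms < 0 || poll_ms <= timeout_ms)
    return 0;
  return poll_ms - timeout_ms;
}
```

For a 10 ms timeout and a measured 25 ms poll, the estimated sleep is 10 ms and the overrun is 15 ms.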

TimothyGu · 2018-02-28T17:04:55Z


    Type definition for callback passed to :c:func:`uv_walk`.

+.. c::type:: void (*uv_stats_cb)(uv_loop_stats_data_t* stats, void* data)


One too many colons between c and type. Ditto below

jasnell · 2018-02-28T23:58:51Z

@pfreixes @TimothyGu ... both issues fixed.

pfreixes · 2018-03-01T06:38:05Z

Thanks for the work done! Good to see this feature moving forward.

cjihrig · 2018-03-03T18:53:17Z

@@ -0,0 +1,19 @@
+{


Can you remove this file please.

sigh.. oops...

jasnell · 2018-03-04T19:02:36Z

@cjihrig ... fixed...

btw, I intend for all the commits to be squashed into one when this lands.

pfreixes · 2018-04-02T08:07:05Z

Hi libuv members - @cjihrig @saghul @bnoordhuis and so on.

I would like to use the features this PR introduces, which allow developers to profile the loop and gather statistics on its performance.

So, basically my question is: which are the plans for the v2 and specifically for that PR?

Thanks,

santigimeno · 2018-04-02T10:51:52Z

So, basically my question is: which are the plans for the v2 and specifically for that PR?

I would say that, most probably, v2 is not going to happen soon. See: #1597.

jasnell · 2018-04-02T13:48:33Z

Would definitely like to know what else this needs before it can land. I'd definitely like to make sure it's part of 2.0, even if 2.0 is going to take a while.

jasnell · 2018-04-19T00:47:25Z

Ping @cjihrig @saghul @bnoordhuis ... is there anything else needed to get this landed?

cjihrig · 2018-04-19T00:58:35Z

I haven't forgotten about this. I've been removing and landing things from the v2 PR backlog. It's down to 8 (with 2 ready to land very soon) from 14 two weeks ago. I've been trying to prioritize the things that are several years old already 😄

jasnell · 2018-04-19T01:08:59Z

Awesome. I appreciate the work you are doing on getting things caught up. Just wanted to make sure there wasn't something else blocking this :)

bnoordhuis · 2018-04-19T10:06:02Z

I've been thinking: since Node.js uses Chromium's trace_events now, wouldn't it make sense to start using that in libuv too? I mean, this PR is mostly fine but ultimately it's to feed data into trace_events in a roundabout way.

jasnell · 2018-04-19T12:07:27Z

That's the plan here, to be honest. Once integrated into node.js, I'll have a PR that kicks this data out to both the trace events stream and out to the perf API.

I suppose we could integrate the trace event macros directly into libuv but that would add a dependency that's really not strictly necessary and we would need to decide if it makes sense to pull in the complete machinery for outputting the trace event file.

pfreixes · 2018-04-19T12:20:55Z

I cannot speak on their behalf, but I guess that other libuv dependents such as gevent, uvloop, and others will feel more comfortable working with the raw events provided by this MR.

bnoordhuis · 2018-04-19T15:49:09Z

need to decide if it makes sense to pull in the complete machinery for outputting the trace event file

Libuv is a library so no, that would be the responsibility of the embedder. Libuv just needs to make the events available in a way that's easily consumable.

other libuv dependents such as gevent, uvloop, and others will feel more comfortable working with the raw events provided by this MR.

MR = PR? trace_events is in many ways 'rawer' than what this PR provides.

jasnell · 2018-04-19T15:54:28Z

Well, I think either way we'd end up in roughly the same place but we could take an approach of invoking a callback at each collection point rather than setting a timestamp in a struct. Either way works for me, I'm more interested in getting the data out somehow than in any one specific way :-)

jasnell · 2018-04-19T15:55:09Z

In other words, I'm happy to rework this PR in any way you think would be better @bnoordhuis :-)

pfreixes · 2018-04-19T16:01:21Z

Isn't libuv a library used by many other technologies? Would a change like that force these technologies to take on a specific dependency in order to use the tracing system?

If this is the case, I would prefer to keep the current approach.

bnoordhuis · 2018-04-19T16:10:43Z

@cjihrig @santigimeno @saghul WDYT?

cjihrig · 2018-04-19T16:23:38Z

What exactly are the options on the table? I'm not really keen on the idea of integrating directly with trace_events or anything else from the Chromium universe if it can be avoided.

santigimeno · 2018-04-19T16:25:19Z

TBH I'm not familiar with trace_events at all, and the current proposal looks pretty generic, which I like. Are there other advantages to using trace_events apart from easier integration with node?

jasnell · 2018-04-19T16:26:53Z

If I'm understanding @bnoordhuis correctly, the options on the table are:

For the loop stats:

1. Collect timestamps in a struct as the tick proceeds, invoke a callback once at the end of the tick (which is what this PR does).
2. Invoke a callback at each collection point (rather than grab the timestamp).

The difference between the two is that 1 requires a storage struct and a single callback per tick, whereas 2 means we don't need to store timestamps but have multiple callbacks per tick.

For the threadpool stats, that would remain the same as implemented here.

santigimeno · 2018-04-19T16:31:17Z

@jasnell thanks for the explanation. Could you elaborate a little where trace_events come into play?

jasnell · 2018-04-19T16:31:18Z

The only actual trace events integration here would be on the embedder-side. E.g. Node.js would register callbacks that would emit v8 trace events... Which we can easily do with either of the two approaches. 1 records the events after they happen at the end of the tick, 2 records the events as they happen. 2 would have the advantage of at least recording partial information should the process exit in the middle of the tick.
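The partial-information point can be made concrete with a toy model of one tick that may stop halfway through. This is only a sketch: `demo_tick`, `demo_counts_t` and the three-phase layout are invented for illustration and do not correspond to libuv's actual loop phases or to code in this PR.

```c
enum { DEMO_PHASES = 3 };  /* illustrative collection points within one tick */

typedef struct {
  int end_of_tick_calls;  /* approach 1: one callback per completed tick */
  int per_point_calls;    /* approach 2: one callback per collection point */
} demo_counts_t;

/* Run one simulated tick. A stop_after value below DEMO_PHASES models the
 * process exiting in the middle of the tick. */
static void demo_tick(int stop_after, demo_counts_t* c) {
  int stamps[DEMO_PHASES];  /* approach 1 keeps data here until the tick ends */
  for (int i = 0; i < DEMO_PHASES; i++) {
    if (i == stop_after)
      return;               /* mid-tick exit: the stored stamps are never reported */
    stamps[i] = i;          /* approach 1: record and keep going */
    c->per_point_calls++;   /* approach 2: report immediately */
  }
  (void) stamps;
  c->end_of_tick_calls++;   /* approach 1: single report for the whole tick */
}
```

If the tick stops after two collection points, approach 2 has already delivered two events while approach 1 has delivered nothing.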

santigimeno · 2018-04-19T16:39:37Z

@jasnell thanks.

I think I like option 2 better.

bnoordhuis · 2018-04-19T16:42:53Z

Another advantage of trace_events is that we won't have to worry about ABI. An issue with this PR is that we can't easily add or change events without touching the struct.
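One way to read the ABI concern, sketched with made-up types: adding a field to a stats struct changes its size and layout for every embedder that was compiled against the old header, whereas adding a value to an event enum leaves the callback signature alone. The `demo_*` names below are hypothetical and are not taken from this PR or from trace_events.

```c
#include <stddef.h>
#include <stdint.h>

/* Struct-based stats: a new data point means a new field. */
typedef struct { uint64_t tick_start; uint64_t tick_end; } demo_stats_v1_t;
typedef struct { uint64_t tick_start; uint64_t tick_end; uint64_t poll_start; } demo_stats_v2_t;

/* How much the struct grew between the two versions. */
static size_t demo_struct_growth(void) {
  return sizeof(demo_stats_v2_t) - sizeof(demo_stats_v1_t);
}

/* Event-based reporting: a new data point is a new enum value and the
 * callback signature stays the same. */
typedef enum { DEMO_EV_TICK, DEMO_EV_TIMERS, DEMO_EV_POLL } demo_event_t;

/* An older handler that only knows the first two events ignores the rest. */
static void demo_on_event(demo_event_t ev, void* data) {
  int* known = data;
  switch (ev) {
    case DEMO_EV_TICK:
    case DEMO_EV_TIMERS:
      (*known)++;
      break;
    default:
      break;  /* unknown or newer event: skipped without breaking anything */
  }
}
```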

jasnell · 2018-04-19T16:45:01Z

Works for me. I'll get this reworked to drop the struct and use the 2nd approach.

pfreixes · 2018-04-19T16:59:05Z

So it's just a matter of changing the pattern, calling the callbacks at each event? If that's the case, it will suit my purposes too.

jasnell · 2018-04-20T16:10:04Z

@bnoordhuis @santigimeno @saghul @cjihrig @evanlucas ....

Ok, so what I'm going to do is open a second PR that uses a more trace events like approach as discussed above. Rather than collecting timestamps into a struct it will invoke a trace callback as the loop progresses.

Before I get too far into the implementation, I want to solicit some opinions on the API because there are a couple ways I could do it.

Option 1: One generic trace callback

Internal function:

uv__trace(uv_loop_t* loop, uv_trace_type type, const char* name, ...);

In the loop:

for (...) { // in uv_run
  // ...
  uv__trace(loop, UV_TRACE_TYPE_START, "tick");

  uv__trace(loop, UV_TRACE_TYPE_START, "timers");
  count = uv_run_timers(...);
  uv__trace(loop, UV_TRACE_TYPE_END, "timers", "count", count);

  uv__trace(loop, UV_TRACE_TYPE_END, "tick");
  // ...
}

On the embedder side:

static void trace_cb(uv_trace_type type, const char* name, void* data, va_list args) {
  // ...
}

// ...
uv_trace_t config;
config.cb = trace_cb;
config.data = &loop;

uv_loop_configure(&loop, UV_LOOP_TRACE, &config);
uv_run(...);

The key bit with this option is that a single callback is configured. Notice, however, the use of va_list in the callback signature. This mimics the V8 Trace Events macros that allow an arbitrary number of additional named arguments to be passed through. Unfortunately, doing it this way means the embedder will need to know how many named arguments and the types associated with those based on the name and type of the event being emitted. It would not be obvious in the API.
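The `va_list` drawback can be seen in a self-contained mock of option 1. The names below (`demo_trace`, `demo_trace_t`, `demo_cb`) are hypothetical and only mirror the shape of the proposal. Note how the callback has to know out of band that the "timers" end event carries a `const char*` key followed by an `int` value.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

typedef enum { DEMO_TRACE_START, DEMO_TRACE_END } demo_trace_type;

typedef void (*demo_trace_cb)(demo_trace_type type, const char* name,
                              void* data, va_list args);

typedef struct {
  demo_trace_cb cb;
  void* data;
} demo_trace_t;

/* Library side: forward the variadic arguments to the single callback. */
static void demo_trace(demo_trace_t* cfg, demo_trace_type type,
                       const char* name, ...) {
  va_list args;
  if (cfg->cb == NULL)
    return;
  va_start(args, name);
  cfg->cb(type, name, cfg->data, args);
  va_end(args);
}

/* Embedder side: nothing in the signature says which arguments follow.
 * Reading the wrong type or count here would be undefined behavior. */
static void demo_cb(demo_trace_type type, const char* name,
                    void* data, va_list args) {
  if (type == DEMO_TRACE_END && strcmp(name, "timers") == 0) {
    const char* key = va_arg(args, const char*);
    int value = va_arg(args, int);
    if (strcmp(key, "count") == 0)
      *(int*) data = value;
  }
}
```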

Therefore we have option 2: event-specific callbacks

On the libuv side:

for (...) { // in uv_run
  uv_trace_start(loop, UV_TRACE_PHASE_TICK);

  uv_trace_start(loop, UV_TRACE_PHASE_TIMERS);
  count = uv_run_timers(...);
  uv_trace_end(loop, UV_TRACE_PHASE_TIMERS, count);
  // ...
  uv_trace_end(loop, UV_TRACE_PHASE_TICK);
}

On the embedder side:

static void trace_start_cb(uv_trace_phase phase, void* data) {}
static void trace_end_cb(uv_trace_phase phase, size_t count, void* data) {}

uv_trace_t config;
config.start_cb = trace_start_cb;
config.end_cb = trace_end_cb;
config.data = &loop;

With this approach, the API is less ambiguous, more rigid, but also less flexible in terms of what data is passed through.
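For comparison, here is a self-contained mock of option 2, where every callback has a fixed, typed signature. Again, this is a sketch: `demo_phase`, `demo_trace_cfg_t`, `demo_totals_t` and `demo_run_ticks` are invented for this example, and the simulated loop stands in for `uv_run`.

```c
#include <stddef.h>

typedef enum { DEMO_PHASE_TICK, DEMO_PHASE_TIMERS, DEMO_PHASE_COUNT } demo_phase;

/* Hypothetical configuration with one callback per event kind. */
typedef struct {
  void (*start_cb)(demo_phase phase, void* data);
  void (*end_cb)(demo_phase phase, size_t count, void* data);
  void* data;
} demo_trace_cfg_t;

/* Embedder-side accumulator. */
typedef struct {
  size_t starts[DEMO_PHASE_COUNT];   /* how often each phase started */
  size_t handled[DEMO_PHASE_COUNT];  /* summed counts reported at phase end */
} demo_totals_t;

static void demo_start(demo_phase phase, void* data) {
  ((demo_totals_t*) data)->starts[phase]++;
}

static void demo_end(demo_phase phase, size_t count, void* data) {
  ((demo_totals_t*) data)->handled[phase] += count;
}

/* Simulated loop: each tick runs a timers phase handling a fixed count. */
static void demo_run_ticks(const demo_trace_cfg_t* cfg, size_t ticks,
                           size_t timers_per_tick) {
  for (size_t i = 0; i < ticks; i++) {
    cfg->start_cb(DEMO_PHASE_TICK, cfg->data);
    cfg->start_cb(DEMO_PHASE_TIMERS, cfg->data);
    cfg->end_cb(DEMO_PHASE_TIMERS, timers_per_tick, cfg->data);
    cfg->end_cb(DEMO_PHASE_TICK, 0, cfg->data);
  }
}
```

The compiler now checks every argument, but passing anything beyond a single `count` (the poll timeout, for example) would require changing a callback signature.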

Which approach would each of you prefer?

jasnell · 2018-04-20T16:58:42Z

There are other ways of doing this, of course... but the key question is how we should handle the arbitrary additional bits of information that should be included in the trace call. The poll event, for instance, should include the timeout, the various check|prepare|idle|timer end events should include the count, etc. If we go too rigid with it, then it would break ABI if we decide we need to add more info later. If we go too loose with it, the API becomes ambiguous.

jasnell · 2018-04-23T21:15:02Z

Ok, see #1815 for an alternative take based around a trace events model.

jasnell · 2018-09-19T19:45:27Z

Closing in favor of #1815

This was referenced Feb 28, 2018

loop: add uv_loop_stats #1489

Closed

[Tracking] loop/threadpool stats from libuv nodejs/node#19063

Closed

pfreixes reviewed Feb 28, 2018

TimothyGu reviewed Feb 28, 2018

jasnell force-pushed the loop-stats-redo branch from 1fe096f to fe7092b Compare February 28, 2018 23:33

cjihrig reviewed Mar 3, 2018

jasnell added 6 commits March 4, 2018 11:01

loop: add loop stats api

f724fae

threadpool: threadpool stats

897d1ea

A slight reworking of libuv#1528 to have a consistent API approach with loop stats

[Squash] fix nit

dbbdba1

[Squash] Include timeout in loop stats

fc0544a

[Squash] Fix on windows

046101c

[Squash] oopsie

7b95c7d

jasnell force-pushed the loop-stats-redo branch from ae444a4 to 7b95c7d Compare March 4, 2018 19:01

gireeshpunathil mentioned this pull request Mar 6, 2018

Get Length of callback queue nodejs/node#19158

Closed

jasnell mentioned this pull request Mar 13, 2018

[trace_event] adding trace points to core nodejs/diagnostics#82

Closed

cjihrig added the v2 label Mar 16, 2018

jasnell mentioned this pull request Apr 23, 2018

loop,threadpool: trace events #1815

Closed

davisjam mentioned this pull request Aug 27, 2018

Threadpool meta-issue #1959

Closed

jasnell closed this Sep 19, 2018

