Description
When a thread is running on the CPU, we sample it at a certain rate, and we record a timestamp and a stack for each sample.
When a thread is not running on the CPU, we still want to know where it's spending its time, i.e. what stack it is blocked in and for how long.
We currently don't have a good way to represent the latter efficiently.
A thread can be blocked for a long time. At the moment, we sample it at a regular rate during the blocked state, and we put each sample into the profile individually, including each sample's timestamp. But those timestamps take up a lot of space in the JSON and are not very interesting. We really only want to know roughly when the thread started blocking and when it stopped - we want a start and stop timestamp, and a sample count.
As for CPU deltas, the situation is the following: The first sample in a "blocked time range" usually has a non-zero CPU delta. All the following samples have a CPU delta of zero.
With today's processed profile format, we can try to make use of the weight column. We can pick the meaning of the sample weight by specifying a weightType: It can be "samples", "tracing-ms", or "bytes". So we would specify "samples", so that a single sample with weight N gets counted as N samples.
But then what do we do with the sample timestamp? And where do we put the CPU delta of the first sample?
Here's a solution that works with today's format: We can have two samples per off-CPU range. One sample with weight 1 at the start of the range, which carries the non-zero leftover CPU delta, and one sample with weight N at the end of the range, with CPU delta zero; N being the number of samples that were collapsed into this end-of-range sample.
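To make the two-sample approach concrete, here is a minimal sketch of the collapsing step. It is an illustration only, not the profiler's code: the `Sample` tuple, its field names, and the `collapse_off_cpu_run` helper are hypothetical, and the stack is a plain string standing in for a stack index.

```python
from typing import NamedTuple


class Sample(NamedTuple):
    # Hypothetical field names, chosen for this sketch only.
    time_ms: float
    stack: str          # stand-in for a stack index
    cpu_delta_us: int
    weight: int = 1


def collapse_off_cpu_run(run: list[Sample]) -> list[Sample]:
    """Collapse a run of same-stack off-CPU samples into two weighted samples."""
    if len(run) < 2:
        return list(run)
    first, last = run[0], run[-1]
    # Start-of-range sample: weight 1, keeps the leftover CPU delta.
    start = first._replace(weight=1)
    # End-of-range sample: CPU delta zero, weight N = number of remaining samples.
    end = Sample(last.time_ms, last.stack, 0, len(run) - 1)
    return [start, end]


# Six samples, 1 ms apart; only the first has a non-zero CPU delta.
run = [Sample(10.0 + i, "wait", 120 if i == 0 else 0) for i in range(6)]
collapsed = collapse_off_cpu_run(run)
```

The total weight of the two emitted samples equals the original sample count, so the call tree totals stay the same while four of the six timestamps are dropped.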
And here's an example profile with that solution.
It works ok, but there are two problems with this solution:
- The "sample timestamp" graph with the blue dots has gaps. These gaps make it look like we were not able to sample. But in fact we were able to sample just fine during that duration; we just didn't emit a sample with that timestamp. And, luckily, clicking in those gaps selects the correct stack. That's because those clicks go between the "start sample" and the "end sample" of the blocked duration, and both of those samples have the same stack (the blocked stack), and clicking in the space between these samples will target one of those samples.
- When you select a time range, the numbers in the call tree and flame graph "jump" as you drag past the weighted samples. The "blocked range" is included in the selection on an all-or-nothing basis: Either its full weight is included, or nothing. This is not good. If I select half of the blocked range, I would like the number in the call tree to be the sample count of half of the blocked range.
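The all-or-nothing behavior can be reproduced with a small sketch. It assumes that a selection sums the weights of the samples whose timestamp falls inside a half-open range; the `selected_weight` helper and that boundary rule are assumptions for illustration, not the profiler's actual implementation.

```python
def selected_weight(samples, sel_start, sel_end):
    """Sum the weights of point samples whose timestamp lies in [sel_start, sel_end)."""
    return sum(weight for (time, weight) in samples if sel_start <= time < sel_end)


# (timestamp_ms, weight): 100 blocked samples at 1 ms intervals, collapsed into
# a start sample with weight 1 and an end sample with weight 99.
collapsed = [(0.0, 1), (99.0, 99)]

half = selected_weight(collapsed, 0.0, 50.0)    # selection covers half the range
almost = selected_weight(collapsed, 0.0, 99.0)  # selection stops just before the end sample
full = selected_weight(collapsed, 0.0, 100.0)   # selection includes the end sample
```

Here `half` and `almost` both come out as 1, and `full` jumps to 100 as soon as the selection crosses the end sample.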
I'd like to find a solution which can do the following:
- In the "sample timestamp" graph, draw wide rectangles for "long samples".
- When selecting a part of the "long sample", the weight in the call tree / flame graph should be proportional to the selected fraction.
- The call tree and flame graph should still display sample counts, not precise durations. After all, we are still dealing with sampling data, not with tracing data. (This disqualifies using weightType: "tracing-ms".)
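As a rough sketch of the proportional-weight requirement: if a "long sample" carried a start time, an end time, and a sample count, its contribution to a selection could be scaled by the overlapping fraction. The `range_sample_weight` function and its parameters are hypothetical and do not describe an existing or proposed format.

```python
def range_sample_weight(start, end, count, sel_start, sel_end):
    """Sample count contributed by a long sample [start, end] to a selection.

    The result is `count` scaled by the fraction of the long sample's duration
    that overlaps the selection, so it stays in units of samples, not time.
    """
    duration = end - start
    if duration <= 0:
        # Degenerate range: treat it like an ordinary point sample.
        return float(count) if sel_start <= start < sel_end else 0.0
    overlap = max(0.0, min(end, sel_end) - max(start, sel_start))
    return count * overlap / duration


# A blocked range from 0 ms to 100 ms that stands for 100 samples.
half = range_sample_weight(0.0, 100.0, 100, 0.0, 50.0)       # 50.0
quarter = range_sample_weight(0.0, 100.0, 100, 25.0, 50.0)   # 25.0
outside = range_sample_weight(0.0, 100.0, 100, 200.0, 300.0) # 0.0
```

Selecting half of the blocked range yields half of its sample count, and the value changes smoothly as the selection edge is dragged through the range.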
I think these requirements call for a different format to represent this data.