USDT tools for higher level languages by goldshtn · Pull Request #774 · iovisor/bcc

goldshtn · 2016-10-20T12:16:28Z

This is a work-in-progress: please do not merge yet. I'm putting it out there for review and feedback at this time. Some of the tools are not done yet, and none have example and man files at this time.

Inspired by tools from the DTrace toolkit, here are a couple of tools for high-level languages (currently Java, Python, and Ruby) using their USDT probes when available.

The tools I have planned are:

ucalls (in this PR) -- summarize method call numbers and total latency
ugc (in this PR) -- monitor and trace GC events
uflow (planned) -- print method call flow
ustat (planned) -- print top-like statistics on method calls, class loads, GC events, and so on

On top of these, profile should already work with languages that can emit /tmp/perf-PID.map files by resolving stack traces, so there's probably no need for a custom uprofile.

I think it should be possible to avoid forking each tool for each language, but rather keep a single set of tools with a language parameter. Examples (some of this should already work at this patchset):

# top 10 slowest Python methods, times in milliseconds
ucalls python 148 -mLT 10

# Java GC events
ugc java 8004

# print flow graph for java.io methods
uflow java 8004 java.io.*

# print stats on method calls, object creation, GC events, etc. for Ruby process 1103
ustat ruby 1103

Any thoughts or feedback on tool naming, options, and so on would be appreciated. I am mostly trying to mimic the per-language tools from the DTrace toolkit, while recognizing that some things are not currently/yet possible (e.g. syscall tracing in addition to method tracing).

/cc @brendangregg

brendangregg · 2016-10-20T22:28:18Z

Some general comments before I get into the tools...

Not a lot was done with USDT in runtimes. Here's the three problems:

GC tracing: via USDT works great, but most runtimes already have a GC log, and those logs can have a ton more detail. There may be times when we want a bcc tool to do it, but just remember that it won't be the primary tool many people use.
Function tracing: The overhead of int 3 trapping made these fairly prohibitive (@4ast might fix it one day :). In the DTraceToolkit, I did some awesome looking tools with them, and they were useful for debugging, but high overhead for production. I should have warned about overhead more, and we can here in bcc, so people know they are a function-equivalent of running strace!
Sun was on the way out: There's more we can do with USDT probes in runtimes, like looking into memory allocation in more detail, but around the time we should have been doing those kinds of things, people were switching to Linux. So I think there's untapped opportunity here, which we'll discover now that USDT can be a thing on Linux, and we can go back to adding more runtime USDT probes.

Java had some interesting ones: like class allocations, thread creations. I think those interest me the most. Sure, I'll take GC scripts and function tracing scripts like existed in the DTraceToolkit, but the latter are more debug-only until we lower overheads.

Another interesting use is mixing USDT probes with other user and kernel events, which is something other analyzers can't do. You can imagine doing this for MySQL USDT probes: casting kernel events in the context of a query. Need to think about opportunities in runtimes too.

As for naming: I'd go with javagc, javacalls, javaclassstat, nodegc, nodecalls, pythoncalls, etc. So we can list java* for Java developers. Findability. Now if the tools are so close, then symlinks or light 2-line shell wrappers to a common multi-tool?

You know what I'd do: I'd start with separate files for each, and consider refactoring later. Think about this: if a Java developer wants to enhance separate Java tools, they can go to town. But if they are all multi-tools, then we'll be telling them: please test your changes with Node.js, Python, etc, etc. I can imagine how that'd go!

goldshtn · 2016-10-21T06:43:27Z

Thanks for the feedback! I realize the method tracing tools are going to be for debug use, mostly. Re GC stats, Java sure has a slew of logs, but for other runtimes it can be quite useful. For example, a quick search reveals that Node has a --trace-gc and no one understands its output. In Ruby, you need to add a gem that collects GC statistics. And so on.

Re kernel events in the context of a query: this can be done today with trace. The tool is harder to use, sure, but it's general-purpose. I don't know if we can anticipate exactly which kernel events people would want along with the USDT traces. Or maybe we will be able to, in the future, when BPF + USDT get more traction. What I really would like is syscall support, and as you recall (#748) we don't have that yet. If there's anything we can do to move that forward, I'd love to help.

brendangregg · 2016-10-21T07:15:22Z

Yes, good points about Node.js and Ruby.

There's a lot of languages out there. I've been thinking about it, and maybe it does make sense to begin with the multi-tools (and multiple symlinks/shell scripts to them). Because: A) accommodating Java developers with individual scripts is a nice idea, but what's the chance none contribute and we end up doing the work anyway?, and B) bcc tools are long. With DTrace, it was no big deal to have 4 tools that had 95% common code, when the tools were 20 lines each. With bcc, we have 100-300 line tools, so duplicating that is 10x more work to maintain.

goldshtn · 2016-10-26T05:28:54Z

@brendangregg So I've added a couple of tools and changed some of the tools I pushed originally. What we have right now (not 100% done, and no documentation yet, so not mergeable):

ugc for tracing GC events (Java, Node, Python, Ruby)
ucalls for tracing method calls and/or syscall stats (Java, Python, Ruby, syscalls-only)
uthreads for tracing thread start/stop events (Java, pthreads)
uobjnew for summarizing object allocations by type (Java, Ruby, malloc)

The last remaining tool I haven't touched yet is uflow, which is going to be super-crazy-high overhead but probably still useful as a light debugger replacement for tracing method flow.

I was also thinking of building systrace -- an strace equivalent that uses BPF. Don't know if that'd be interesting or useful.

brendangregg · 2016-10-26T07:18:37Z

May as well do a quick systrace. I did dtruss once, just to show DTrace could do syscalls. Its only feature over Solaris truss was being able to match multiple process names with -n, instead of one PID only. Next thing dtruss is the only syscall tracer for OS X.

I say quick, as I suspect strace has a lot of translations in that would be redundant to copy and hard to maintain. So I wouldn't invest too much time in the experiment.

What would be also useful (in addition) would be a syscount, like my https://github.com/brendangregg/perf-tools/blob/master/examples/syscount_example.txt . Yes, that's really a more featured version of strace -c. I'd be inclined not to add it to systrace, since that would be mixing a per-event output tool with a summary tool (I know strace has a precedent for that, but I'm not sure it's a good precedent). I like the argdist / trace split.

Your choice of USDT tools makes sense. I still don't know whether these should all become symlinks (java_, node_) or just keep them as u*, or both.

I took a quick look at the code -- maybe because they are new you just haven't done this yet, but I'd have put big comments above each language block. Like:

    probes.append(Probe("gc__start", "gc__done",
                        begin_save, end_save, formatter))
#
# Ruby
#
elif args.language == "ruby":
    # Ruby GC probes do not have any additional information available.
    probes.append(Probe("gc__mark__begin", "gc__mark__end",
                        "", "", lambda _: "GC mark stage"))
    probes.append(Probe("gc__sweep__begin", "gc__sweep__end",
                        "", "", lambda _: "GC sweep stage"))
#
# Node.js
#
elif args.language == "node":
    end_save = """

goldshtn · 2016-10-26T08:52:56Z

syscount is pretty much there with ucalls -S, except I built the tool to require a pid (this can be changed specifically for -S); or with funccount -r '^[Ss]y[Ss]_*'. I guess it won't hurt to have a separate tool with some better grouping/filtering/sorting options like your syscount from perf-tools.

Re systrace, I didn't mean to provide the same level of detail (with the fd parsing, string extraction and so on). Just a basic trace of each syscall with 6 arguments. More advanced parsing would be possible with trace.

goldshtn · 2016-10-27T14:22:24Z

Just forced-push after rebasing. Added a new tool, ustat (which I forgot about) that prints high-level stats on GCs, method calls, object allocations, and so on -- in all high-level processes it can put its hands on.

Example output:

# ustat -C 5 3
Tracing... Output every 5 secs. Hit Ctrl-C to end                               

 06:44:43 loadavg: 0.07 0.23 0.15 5/463 19945                                    

 PID    CMDLINE          METHOD/s   GC/s   OBJNEW/s   CLOAD/s  EXCP/s   THREAD/s 
 16109  /home/vagrant/no 0          5      0          0        0        0        
 17595  irb              0          0      0          0        0        0
 17815  ./python         17995      0      0          0        0        0       
 19928  java -XX:+Extend 0          0      0          0        0        0       

 06:44:49 loadavg: 0.15 0.24 0.15 1/464 19947                                    

 PID    CMDLINE          METHOD/s   GC/s   OBJNEW/s   CLOAD/s  EXCP/s   THREAD/s 
 16109  /home/vagrant/no 0          0      0          0        0        0        
 17595  irb              130        1      109        0        0        0
 17815  ./python         0          0      0          0        0        0       
 19928  java -XX:+Extend 0          0      0          0        0        0       

 06:44:54 loadavg: 0.28 0.27 0.16 3/465 19948                                    

 PID    CMDLINE          METHOD/s   GC/s   OBJNEW/s   CLOAD/s  EXCP/s   THREAD/s 
 16109  /home/vagrant/no 0          0      0          0        0        0        
 17595  irb              0          0      0          0        0        0
 17815  ./python         0          0      0          0        0        0       
 19928  java -XX:+Extend 762207     0      0          0        0        0

Also, added uflow for flow tracing.

goldshtn · 2016-11-07T22:42:36Z

Added man pages and example files. This is ready for review and merging. /cc @brendangregg

dark · 2016-11-08T23:17:39Z

@goldshtn I tried ustat because it looks very interesting. Thanks!
I have noticed one thing I'd like to report, though: it looks like ustat leaks file descriptors for bpf-map, and eventually errors out with error: could not open bpf map: Too many open files.
By following along with lsof I noticed a leak of around 20-30 bpf-map FDs per second.

Maybe it's a problem of my setups? My two machines have oldish kernels (4.4.26 and 4.7.4) and no packaged configured with --enable-dtrace, so there is no activity to report actually. I will try with dtrace-enabled software sometime in the future.

goldshtn · 2016-11-09T11:21:01Z

@dark Thanks, it's leaking indeed on my 4.9-rc as well. I'll track these fds down. I'm explicitly cleaning up the BPF object, but something must be sticking around. I believe this is the only tool we have that reloads the BPF program with new maps instead of just initializing everything upfront, which is probably why we have only now run into it.

goldshtn · 2016-11-09T19:25:44Z

@dark Should be fixed by #808. When it gets merged, I'll rebase.

goldshtn · 2016-11-09T20:10:25Z

Rebased. We should get this merged at some point :)

brendangregg · 2016-11-18T23:21:27Z

I'm back... I like the tools. They seem to work. :)

uobjnew.py is missing a PID filter, at least for the "c" match (the following python program is idle):

# ./uobjnew.py c `pgrep -n python` -v
#include <uapi/linux/ptrace.h>


#include <linux/ptrace.h>

struct key_t {
#if 1
    u64 size;
#else
    char name[50];
#endif
};

struct val_t {
    u64 total_size;
    u64 num_allocs;
};

BPF_HASH(allocs, struct key_t, struct val_t);

int alloc_entry(struct pt_regs *ctx, size_t size) {
    struct key_t key = {};
    struct val_t *valp, zero = {};
    key.size = size;
    valp = allocs.lookup_or_init(&key, &zero);
    valp->total_size += size;
    valp->num_allocs += 1;
    return 0;
}

Tracing allocations in process 26230 (language: c)... Ctrl-C to quit.
^C
TYPE                           # ALLOCS      # BYTES
block size 0                         32            0
block size 50                         8          400
block size 6                         76          456
block size 57                         8          456
[...]
block size 224                    15411      3452064
block size 552                     7162      3953424
block size 32816                    242      7941472
block size 4097                    4992     20452224

uobjnew examples have a "-l", but that should be omitted (will need to update the tool and example file) (update: read ahead as I have a different suggestion about -l):

examples:
    ./uobjnew -l java 145         # summarize Java allocations in process 145
    ./uobjnew -l c 2020 1         # grab malloc() sizes and print every second
    ./uobjnew -l ruby 6712 -C 10  # top 10 Ruby types by number of allocations
    ./uobjnew -l ruby 6712 -S 10  # top 10 Ruby types by total size

In the future, it would be nice if these tools could pick a default language if one wasn't specified. Eg, if the process name is python or java, then go with that. There will be times you'll want to override it: examining C calls for a Java program, for example. But most of the time I'm using these tools like this:

# ./ucalls.py -l python `pgrep -n python`

ustat.py, ucalls.py, and uthreads.py want the -l. Should that be dropped for consistency? ... It might be a bit tidier, if these tools could pick the java/python/etc default, to then have the -l as the optional override. In which case, all tools try to pick a default, and all tools have an optional -l.

uflow.py:

  -C CLAZZ, --class CLAZZ

BTW, uflow.py works well, at least for something simple:

# ./uflow.py python `pgrep -n python`
Tracing method calls in python process 26230... Ctrl-C to quit.
CPU PID    TID    TIME(us) METHOD
6   26230  26230  1.846    -> <stdin>.<module>                     
6   26230  26230  1.846      -> <stdin>.func_a                     
6   26230  26230  2.847        -> <stdin>.func_b                   
6   26230  26230  3.848          -> <stdin>.func_c                 
6   26230  26230  4.849          <- <stdin>.func_c                 
6   26230  26230  4.849        <- <stdin>.func_b                   
6   26230  26230  4.849      <- <stdin>.func_a                     
6   26230  26230  4.849    <- <stdin>.<module>

Those PID/TID columns should be handy for filtering multi-threaded apps.

4ast · 2016-11-29T00:33:17Z

@goldshtn are you planning one more update before it can be merged?

goldshtn · 2016-11-29T05:14:17Z

Yes - will fix the immediate issues and rebase, and leave non-pressing stuff for a separate PR. Hope to wrap this up this week.

`ucalls` is a new tool that uses the USDT probes in high- level languages (Java, Python, Ruby) to summarize the number of method calls and optionally their latency. This is similar to `funccount` and `funclatency` but for methods in Java, Python, Ruby, and other languages that might have USDT probes for method entry and exit. `ucalls` uses the `method__entry` and `method__return` probes in Java and Ruby, and the `function__entry` and `function__return` probes in Python. The Python probes are not as accurate because they do not report the class name, but are still useful. The optional `-L` switch indicates whether latency numbers are required. By default, only the number of calls to each method is printed. The `-T` switch filters out the top methods (by number of calls or total latency).

`ugc` monitors GC start and end events in Java, Python, and Ruby, printing out a message when a GC cycle completes along with the duration of the GC and an optional additional description that depends on the target language. NOTE: This WIP does not have any interesting additional descriptions yet. Usage example: ``` ugc java 1448 # monitor Java GCs in process 1448 ```

For runtimes that make it available, we now extract GC event information such as heap sizes and print this information in addition to the GC duration and timestamp.

Skeletons for method flow, object allocations, general event statistics, and thread statistics.

Add Node support for gc__start and gc__done probes, which includes GC type flag parsing.

Add `-S` switch for tracing syscalls with or without high- level language method calls. When `-S` is provided, attach to all `sys_` and `SyS_` entry points in the kernel and trace the number of calls and execution latency (if `-L` was also specified). This is now also a useful general-purpose tool for syscall count/latency summarizing when a language is not specified.

`uthreads` attaches to thread start and stop USDT probes and traces out thread creation and destruction events. Currently, only Java is supported.

Trace `pthread_start` calls and get the symbol for the newly-started thread for printing. Also allow no-Java runs with no language specified for pthread-only tracing.

Use various USDT probes in Java and Ruby, and `malloc()` uprobe in C, to summarize object allocations by type or by size. Print statistics at the end or at intervals, with optional trimming of top types by number of instances allocated or number of bytes.

Allow callers to obtain the list of active (enabled) USDT probes from the USDT object, if they need to enable the uprobes by hand without using the bcc attach support. This is required when the same uprobe needs to be enabled in more than one process.

This tool gathers statistics (by pid) on method calls, class loads, garbage collections, thread creations, object allocations, and exceptions raised and prints them in a top-like style. Supported languages currently include Java, Python, Ruby, and Node. Not all languages provide the necessary probes for all event types. Usage examples: ``` ustat -C ustat -l java 1 ustat -l ruby -S objnew ```

This tool traces method executions and prints out messages with a nesting depth indicator that allows relatively easy visualization of program flow. To reduce overhead, `-C` and `-M` switches take class and method prefixes to filter the output by. The filtering takes place in the BPF program, so the unnecessary data never ends up in user space. Usage examples: ``` uflow java 185 uflow -M index python 180 uflow -C 'IRB::Irb' ruby 10041 ```

The /proc/PID/cmdline file has \0 for spaces in the command line, and there may be trailing \0 characters as well. Replace them all with spaces.

If the process in question doesn't have the USDT probes we expect, handle the failure gracefully. After all, it could be a binary compiled without USDT support, the pid could be recycled to some other application, or -- the process name could match, but by accident, and really belong to some other runtime.

When C allocations are traced, attach the `malloc` uprobe only to the requested process -- otherwise, we get allocation information from the entire system.

goldshtn · 2016-12-19T10:01:29Z

@4ast @brendangregg

Rebased and fixed the issues found above -- I think this is ready for merging. I will open a separate issue for automatically detecting the language, I don't know if we always want it. In any case I think it belongs in a separate PR.

Re the -l inconsistency in some tools: the reason the -l is required in ustat, ucalls, and uthreads is that it is an optional argument. If it had been positional and optional, it couldn't have been first in the list, and I think that would be kind of ugly, e.g.: ustat 4482 1 java reads a bit weird.

Let's get this merged (I also have a blog post I need to write about these :)) and then iterate?

ceeaspb · 2016-12-19T14:15:57Z

great stuff
@goldshtn likely a separate ticket . I tested with a jvm where usdts are not built in (eg. Oracle). An error message is provided about not specifying a pid. I did specify a pid so this could be confusing. In this set of scripts it would be useful to indicate to the user that the usdts (uprobes) were not found in the process likely because the binary was not built with it enabled (as it's mentioned in the man pages etc)

goldshtn force-pushed the usdt-tools branch 2 times, most recently from c681cdc to 1207f28 Compare October 27, 2016 14:21

goldshtn force-pushed the usdt-tools branch from b5fc267 to 382ec80 Compare October 29, 2016 17:11

goldshtn force-pushed the usdt-tools branch from 382ec80 to 542888c Compare November 7, 2016 22:42

goldshtn changed the title ~~WIP [do not merge]: USDT tools for higher level languages~~ USDT tools for higher level languages Nov 7, 2016

dark mentioned this pull request Nov 9, 2016

cc, python: Clean up BPF module and tables #808

Merged

goldshtn force-pushed the usdt-tools branch from 542888c to 397ea4c Compare November 9, 2016 20:09

goldshtn added 8 commits December 19, 2016 09:46

ugc: Add description messages for GC events

ddef09a

For runtimes that make it available, we now extract GC event information such as heap sizes and print this information in addition to the GC duration and timestamp.

tools: Script skeletons

c0217a9

Skeletons for method flow, object allocations, general event statistics, and thread statistics.

ugc: Node support

8853347

Add Node support for gc__start and gc__done probes, which includes GC type flag parsing.

uthreads: Trace thread creation/destruction events

989057d

`uthreads` attaches to thread start and stop USDT probes and traces out thread creation and destruction events. Currently, only Java is supported.

uthreads: Add pthread support

bf0cf12

Trace `pthread_start` calls and get the symbol for the newly-started thread for printing. Also allow no-Java runs with no language specified for pthread-only tracing.

goldshtn added 21 commits December 19, 2016 09:46

ugc: Fix order of arguments in usage comment

4cebb42

ucalls: TODO reminder to probe C methods

6e5c621

python: Do not call USDT.get_text() twice

dfab8f5

tools: Add block comments to u* scripts

bee71b2

ustat: Sort by pid if no other sort column provided

9f6d03b

ustat: Change 'count' argument type to int

087dd73

ustat: Replace \0 with spaces in cmdline

d8c7f47

The /proc/PID/cmdline file has \0 for spaces in the command line, and there may be trailing \0 characters as well. Replace them all with spaces.

ustat: Add man page and companion example file

5a9b154

ucalls: Add man page and companion examples file

070c1df

ugc: Add man page and companion examples file

ea77193

uobjnew: Add man page and companion examples file

7160f8a

uthreads: Add man page and companion examples file

c0a6352

uflow: Add man page and companion examples file

af05e5a

README: Add u* tools based on USDT probes

0a5ea58

uobjnew: Attach uprobe only to the requested process

39ace6f

When C allocations are traced, attach the `malloc` uprobe only to the requested process -- otherwise, we get allocation information from the entire system.

uobjnew: Remove -l switch from examples

d4a4248

goldshtn force-pushed the usdt-tools branch from 397ea4c to d4a4248 Compare December 19, 2016 09:58

4ast merged commit b2fe72f into iovisor:master Dec 19, 2016

vchuravy mentioned this pull request Apr 4, 2019

WIP: instrument julia to get per-method-instance llvm optimization timings JuliaLang/julia#31616

Closed

Conversation

goldshtn commented Oct 20, 2016

Uh oh!

brendangregg commented Oct 20, 2016

Uh oh!

goldshtn commented Oct 21, 2016

Uh oh!

brendangregg commented Oct 21, 2016

Uh oh!

goldshtn commented Oct 26, 2016

Uh oh!

brendangregg commented Oct 26, 2016

Uh oh!

goldshtn commented Oct 26, 2016

Uh oh!

goldshtn commented Oct 27, 2016

Uh oh!

goldshtn commented Nov 7, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dark commented Nov 8, 2016

Uh oh!

goldshtn commented Nov 9, 2016

Uh oh!

goldshtn commented Nov 9, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

goldshtn commented Nov 9, 2016

Uh oh!

brendangregg commented Nov 18, 2016

Uh oh!

4ast commented Nov 29, 2016

Uh oh!

goldshtn commented Nov 29, 2016

Uh oh!

goldshtn commented Dec 19, 2016

Uh oh!

ceeaspb commented Dec 19, 2016

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

goldshtn commented Nov 7, 2016 •

edited

Loading

goldshtn commented Nov 9, 2016 •

edited

Loading