
Introduce {attach|detach}_kfunc API#2726

Merged
yonghong-song merged 8 commits into iovisor:master from olsajiri:kfunc
Feb 27, 2020

Conversation

@olsajiri
Contributor

The kernel added a new probe type called trampoline, allowing probing of
almost any kernel function when BTF info is available in the system.

Adding the interface under the 'kfunc' name, which probes function entry,
and the 'kretfunc' name, which traces function return.

The main benefit is the very low overhead of trampolines (please see the
commit for klockstat.py with a perf comparison).

Another benefit is the ability of the kretfunc probe to access function
arguments, so some tools may need only one program instead of separate
entry/exit ones (please see the commit for the opensnoop.py changes).
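As a hedged sketch (not code from this PR), a tool using the new probes might look like the following. It assumes a BTF-enabled kernel, root privileges, and the KRETFUNC_PROBE macro and kretfunc__ autoload prefix that this PR introduces; the BPF text and the run() helper are illustrative only:

```python
# Sketch of using the new kretfunc probe from a bcc tool. Only the
# KRETFUNC_PROBE macro and the kretfunc__ autoload prefix come from this PR;
# everything else here is an illustrative stand-in.
bpf_text = r"""
KRETFUNC_PROBE(do_sys_open, int dfd, const char *filename, int flags,
               int mode, int ret)
{
    /* unlike a kretprobe, the entry arguments are visible here too */
    bpf_trace_printk("open ret=%d\n", ret);
    return 0;
}
"""

def run():
    # deferred import: loading needs bcc installed, root, and a BTF kernel
    from bcc import BPF
    b = BPF(text=bpf_text)  # kretfunc__do_sys_open is attached automatically
    b.trace_print()
```

Call run() as root on a BTF-enabled kernel; on older kernels the load fails, which is what the buildbot discussion below is about.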

@yonghong-song
Collaborator

[buildbot, ok to test]

@yonghong-song
Collaborator

[buildbot, test this please]

@yonghong-song
Collaborator

All tests failed. The buildbot machines are pretty old; the latest one in the buildbot fleet is fc28, which I do not think supports BTF. For example, the following is the error message:

29: ....../bin/sh: line 1: 11657 Killed                  timeout -s KILL -k 5s 5s ../../tools/killsnoop.py > /dev/null
29: .s........WARNING: 15 stack traces lost and could not be displayed.
29: ..libbpf: failed to find valid kernel BTF
29: libbpf: vmlinux BTF is not found
29: /virtual/main.c:27:1: error: redefinition of 'infotmp_table_t'
29: BPF_HASH(infotmp, u64, struct val_t);
29: ^
29: /virtual/include/bcc/helpers.h:176:37: note: expanded from macro 'BPF_HASH'
29:   BPF_HASHX(__VA_ARGS__, BPF_HASH4, BPF_HASH3, BPF_HASH2, BPF_HASH1)(__VA_ARGS__)
29:                                     ^
29: /virtual/main.c:24:1: note: previous definition is here
29: BPF_HASH(infotmp, u64, struct val_t);
29: ^
29: /virtual/include/bcc/helpers.h:176:37: note: expanded from macro 'BPF_HASH'
29:   BPF_HASHX(__VA_ARGS__, BPF_HASH4, BPF_HASH3, BPF_HASH2, BPF_HASH1)(__VA_ARGS__)
29:                                     ^
29: /virtual/main.c:27:10: error: redefinition of 'infotmp'
29: BPF_HASH(infotmp, u64, struct val_t);
29:          ^
29: /virtual/main.c:24:10: note: previous definition is here
29: BPF_HASH(infotmp, u64, struct val_t);
29:          ^
29: /virtual/main.c:27:1: error: redefinition of '____btf_map_infotmp'
29: BPF_HASH(infotmp, u64, struct val_t);
29: ^
29: /virtual/include/bcc/helpers.h:176:37: note: expanded from macro 'BPF_HASH'
29:   BPF_HASHX(__VA_ARGS__, BPF_HASH4, BPF_HASH3, BPF_HASH2, BPF_HASH1)(__VA_ARGS__)
29:                                     ^
29: /virtual/main.c:24:1: note: previous definition is here
29: BPF_HASH(infotmp, u64, struct val_t);
29: ^
29: /virtual/include/bcc/helpers.h:176:37: note: expanded from macro 'BPF_HASH'
29:   BPF_HASHX(__VA_ARGS__, BPF_HASH4, BPF_HASH3, BPF_HASH2, BPF_HASH1)(__VA_ARGS__)
29:                                     ^
29: /virtual/main.c:27:1: error: redefinition of '____btf_map_infotmp'
29: BPF_HASH(infotmp, u64, struct val_t);
29: ^
29: /virtual/include/bcc/helpers.h:176:37: note: expanded from macro 'BPF_HASH'
29:   BPF_HASHX(__VA_ARGS__, BPF_HASH4, BPF_HASH3, BPF_HASH2, BPF_HASH1)(__VA_ARGS__)
29:                                     ^
29: /virtual/main.c:24:1: note: previous definition is here
29: BPF_HASH(infotmp, u64, struct val_t);
29: ^
29: /virtual/include/bcc/helpers.h:176:37: note: expanded from macro 'BPF_HASH'
29:   BPF_HASHX(__VA_ARGS__, BPF_HASH4, BPF_HASH3, BPF_HASH2, BPF_HASH1)(__VA_ARGS__)
29:                                     ^
29: 4 errors generated.

We will need to guard the new test with the kernel version and the availability of kernel BTF. Not all distros have kernel BTF, I guess.
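One possible guard (a sketch, not necessarily the check the tests ended up with): kernels built with CONFIG_DEBUG_INFO_BTF=y expose their BTF at /sys/kernel/btf/vmlinux, so the kfunc tests could be skipped when that file is absent:

```python
import os

def kernel_btf_available():
    """Return True if the running kernel exposes vmlinux BTF.

    Kernels built with CONFIG_DEBUG_INFO_BTF=y export their type info
    at /sys/kernel/btf/vmlinux; without it, kfunc probes cannot work.
    """
    return os.path.exists("/sys/kernel/btf/vmlinux")
```

A test harness could then do `if not kernel_btf_available(): skip(...)` before running the kfunc tests.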

@olsajiri
Contributor Author

ok, I expected some failures, but not that bad.. will check on some BTF detection for tests

"""

bpf_text_kprobe = """
BPF_HASH(infotmp, u64, struct val_t);
Collaborator


This BPF_HASH is redundant and will cause a compilation error in the kprobe case.

Contributor Author


ugh, I must have confused my versions and pushed the wrong one;
I recall fixing this, sorry.. will fix in the next version

@yonghong-song
Collaborator

@olsajiri any update on this one?

@brendangregg
Member

Having the kretfunc retval in args[N] might get pretty confusing. Could it have a separate variable called retval for this?

(I was going to suggest putting retval in args[0] and bumping the others along by one, but that would make it inconsistent with kfunc.)

@yonghong-song
Collaborator

Having the kretfunc retval in args[N] might get pretty confusing. Could it have a separate variable called retval for this?
I think we could design an interface with the return value as a separate variable; the rewriter can kick in to generate the proper code. We also need the function signature, and we already have BTF for that, so we could parse it and get the number of arguments for the particular function.

@olsajiri
Contributor Author

olsajiri commented Feb 6, 2020

Having the kretfunc retval in args[N] might get pretty confusing. Could it have a separate variable called retval for this?
I think we could design an interface with the return value as a separate variable; the rewriter can kick in to generate the proper code. We also need the function signature, and we already have BTF for that, so we could parse it and get the number of arguments for the particular function.

sorry, I'm out this week, hence my lag.. I think you're right, we can add a better interface
for the arguments.. perhaps even with the argument names, because we have them in the BTF.
I'll check on this

jirka

@yonghong-song
Collaborator

Oh we could adopt libbpf convention as well.

__u64 test4_result = 0;
SEC("fexit/bpf_fentry_test4")
int BPF_PROG(test4, void *a, char b, int c, __u64 d, int ret)

Some macros would do, so we do not need to add BTF analysis code, although that is nice to have for double checking.

In the long term, I intend to change bcc to be more libbpf friendly, so we can get a lot of libbpf functionality for free.

@olsajiri
Contributor Author

Oh we could adopt libbpf convention as well.

__u64 test4_result = 0;
SEC("fexit/bpf_fentry_test4")
int BPF_PROG(test4, void *a, char b, int c, __u64 d, int ret)

Some macros would do, so we do not need to add BTF analysis code, although that is nice to have for double checking.

In the long term, I intend to change bcc to be more libbpf friendly, so we can get a lot of libbpf functionality for free.

ok, that might be actually easier.. will check

thanks,
jirka

@olsajiri
Contributor Author

@yonghong-song actually, while working on bpftrace support,
I realized we don't need to pass expected_attach_type through
the bcc_func_load chain; we can get it from the function name
prefix (kfunc__, kretfunc__).

If this kind of interface is ok with you (getting the attr info from
the function name), I'll make the changes. It will ease bpftrace
integration, because we could continue to use bcc_prog_load.
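The prefix dispatch described above can be sketched as follows. The helper name is hypothetical, and the attach-type strings are illustrative stand-ins for the kernel's BPF_TRACE_FENTRY/BPF_TRACE_FEXIT expected_attach_type values:

```python
# Sketch of deriving the attach type from the program name prefix, as
# described above. "fentry"/"fexit" stand in for the kernel's
# BPF_TRACE_FENTRY/BPF_TRACE_FEXIT values; the helper is hypothetical.
PREFIXES = {
    "kfunc__": "fentry",    # probe on function entry
    "kretfunc__": "fexit",  # probe on function return
}

def attach_info_from_name(fn_name):
    """Map e.g. 'kretfunc__do_sys_open' to ('fexit', 'do_sys_open')."""
    for prefix, attach_type in PREFIXES.items():
        if fn_name.startswith(prefix):
            return attach_type, fn_name[len(prefix):]
    return None, fn_name  # not a kfunc program; load as usual
```

This is why bcc_prog_load can stay unchanged: the caller only needs the program name to recover the attach information.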

We need to use the LIBBPF_LIBRARIES variable instead of just 'bpf'
for libbpf linking, so that a user-specified library gets linked in.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Currently we include headers from the local libbpf subpackage,
which does not work if the user specifies LIBBPF_INCLUDE_DIR.

Adding a HAVE_EXTERNAL_LIBBPF macro that gets defined when the
user specifies a libbpf header path. This macro is checked
in bcc_libbpf_inc.h and the proper files are included.

Using bcc_libbpf_inc.h in place of libbpf includes.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
The TRACING/EXT programs use the attach_btf_id and attach_prog_fd
fields from struct bpf_load_program_attr.

The attach_prog_fd field shares space with kern_version,
so by setting kern_version unconditionally we also set
attach_prog_fd to a bogus value, and the kernel fails the load
because it tries to look that fd up.

Setting kern_version only for programs other than the TRACING/EXT
types.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
The kernel added a new probe type called trampoline, allowing probing
of almost any kernel function when BTF info is available in
the system.

Adding an interface to define a trampoline program for a given
kernel function via the KFUNC_PROBE and KRETFUNC_PROBE macros, like:

  KFUNC_PROBE(do_sys_open, int dfd, const char *filename, int flags, int mode)
  {
    ...
  }

which defines a trampoline program named 'kfunc__do_sys_open'
that instruments the do_sys_open kernel function before it
executes.

or:

  KRETFUNC_PROBE(do_sys_open, int dfd, const char *filename, int flags, int mode, int ret)
  {
    ...
  }

which defines a trampoline program named 'kretfunc__do_sys_open'
that instruments the do_sys_open kernel function after it
executes.

The main benefit is the much lower overhead of trampolines (please
see the following commit for klockstat.py with a perf comparison).

Another benefit is the ability of the kretfunc probe to access
function arguments, so some tools may need only one program
instead of separate entry/exit ones (please see the following
commit for the opensnoop.py changes).

Currently the interface only allows defining functions named:
  kfunc__<function_name> or kretfunc__<function_name>

which is sufficient for now and can easily be changed in the
future if needed.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Adding a support_kfunc function to the BPF object to report
whether kfunc trampolines are supported.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Adding kfunc/kretfunc trampoline probes, if available, instead of
kprobe/kretprobe probes.

The return trampoline has access to the function entry arguments,
so a single eBPF program suffices.

The kfunc trampolines are also faster and less intrusive.

Below are stats for compiling the Linux kernel while running
opensnoop.py in the background, for kprobes and kfuncs.

Without opensnoop.py:

 Performance counter stats for 'system wide':

   849,741,782,765      cycles:k

     162.235646336 seconds time elapsed

With opensnoop.py using kprobes:

 Performance counter stats for 'system wide':

   941,615,199,769      cycles:k

     164.355032879 seconds time elapsed

With opensnoop.py using trampolines:

 Performance counter stats for 'system wide':

   913,437,005,488      cycles:k

     163.742914795 seconds time elapsed

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Adding kfunc/kretfunc trampoline probes, if available, instead of
kprobe/kretprobe probes.

The kfunc trampolines are faster and less intrusive. The benchmark
without the klockstat.py script running in the background:

	$ perf bench sched messaging -l 50000
	# Running 'sched/messaging' benchmark:
	# 20 sender and receiver processes per group
	# 10 groups == 400 processes run

	     Total time: 18.571 [sec]

With kprobe tracing:
	$ perf bench sched messaging -l 50000
	# Running 'sched/messaging' benchmark:
	# 20 sender and receiver processes per group
	# 10 groups == 400 processes run

	     Total time: 183.395 [sec]

With kfunc tracing:
	$ perf bench sched messaging -l 50000
	# Running 'sched/messaging' benchmark:
	# 20 sender and receiver processes per group
	# 10 groups == 400 processes run

	     Total time: 39.773 [sec]

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Adding kfunc/kretfunc trampoline probes, if available, instead of
kprobe/kretprobe probes.

The kfunc trampolines are faster and less intrusive.

Below are stats for running the perf bench sched pipe benchmark
while running vfsstat.py in the background, for kprobes and kfuncs.

With kprobes:
         Performance counter stats for './perf bench sched pipe -l 5000000' (3 runs):

           112,520,853,574      cycles:k

                    48.674 +- 0.672 seconds time elapsed  ( +-  1.38% )

With kfuncs:
         Performance counter stats for './perf bench sched pipe -l 5000000' (3 runs):

           106,304,165,424      cycles:k

                    46.820 +- 0.197 seconds time elapsed  ( +-  0.42% )

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
@yonghong-song
Collaborator

@yonghong-song actually, while working on bpftrace support,
I realized we don't need to pass expected_attach_type through
the bcc_func_load chain; we can get it from the function name
prefix (kfunc__, kretfunc__).

If this kind of interface is ok with you (getting the attr info from
the function name), I'll make the changes. It will ease bpftrace
integration, because we could continue to use bcc_prog_load.

This should work. In bcc, we already have some other usage with special prefixes.

@olsajiri
Contributor Author

@yonghong-song actually, while working on bpftrace support,
I realized we don't need to pass expected_attach_type through
the bcc_func_load chain; we can get it from the function name
prefix (kfunc__, kretfunc__).
If this kind of interface is ok with you (getting the attr info from
the function name), I'll make the changes. It will ease bpftrace
integration, because we could continue to use bcc_prog_load.

This should work. In bcc, we already have some other usage with special prefixes.

cool, changes are already in the branch

@yonghong-song
Collaborator

LGTM. Thanks!

@yonghong-song yonghong-song merged commit 2fa54c0 into iovisor:master Feb 27, 2020
@alban
Contributor

alban commented Mar 18, 2020

I get the following error in opensnoop with Linux 5.5.6. Is it a problem in bcc or in the kernel?

bpf: Failed to load program: Permission denied
0: (bf) r6 = r1
1: (79) r1 = *(u64 *)(r6 +32)
2: (7b) *(u64 *)(r10 -304) = r1
3: (79) r7 = *(u64 *)(r6 +8)
func 'do_sys_open' arg1 type INT is not a struct
invalid bpf_context access off=8 size=8
processed 4 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
Traceback (most recent call last):
  File "/usr/share/bcc/tools/opensnoop", line 235, in <module>
    b = BPF(text=bpf_text)
  File "/usr/lib/python2.7/dist-packages/bcc/__init__.py", line 364, in __init__
    self._trace_autoload()
  File "/usr/lib/python2.7/dist-packages/bcc/__init__.py", line 1198, in _trace_autoload
    self.attach_kretfunc(fn_name=func_name)
  File "/usr/lib/python2.7/dist-packages/bcc/__init__.py", line 933, in attach_kretfunc
    fn = self.load_func(fn_name, BPF.TRACING)
  File "/usr/lib/python2.7/dist-packages/bcc/__init__.py", line 404, in load_func
    (func_name, errstr))
Exception: Failed to load BPF program kretfunc__do_sys_open: Permission denied

@yonghong-song
Collaborator

Maybe this is a kernel issue? I mean, the kernel may not be recent enough to have this patch:
https://www.spinics.net/lists/netdev/msg627180.html

@alban
Contributor

alban commented Mar 19, 2020

Thanks for the pointer. Indeed my kernel does not have this patch. I wonder if BPF.support_kfunc() should return false on kernels without that patch, so that opensnoop can fall back to the previous kprobe code, or if BCC is fine asking users to update their kernels to the latest version.

I just noticed that the new kfunc code does not have the --cgroupmap filter.
Edit: I filed #2826 for that.
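The fallback suggested above could be sketched like this; the two program texts are placeholders, not opensnoop's real BPF code, and the helper name is hypothetical:

```python
# Sketch of falling back from trampolines to kprobes: use the
# trampoline-based programs when BPF.support_kfunc() reports support,
# otherwise the kprobe/kretprobe versions. Placeholder program texts.
KFUNC_TEXT = "KRETFUNC_PROBE(do_sys_open, ...) { /* ... */ }"
KPROBE_TEXT = "int kprobe__do_sys_open(...) { /* ... */ }"

def choose_probe_text(support_kfunc):
    """Pick the BPF program text based on trampoline support."""
    return KFUNC_TEXT if support_kfunc else KPROBE_TEXT
```

A tool would call this with the result of BPF.support_kfunc() before constructing its BPF object, so old kernels keep working with kprobes.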
