Skip to content

Conversation

@jameshilliard
Copy link
Contributor

This should hopefully fix cross compilation for the bpf programs.

@jameshilliard jameshilliard force-pushed the bpf branch 2 times, most recently from f72f400 to 4613dc0 Compare August 12, 2021 07:38
@bluca bluca added bpf meson reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks labels Aug 12, 2021
@jameshilliard jameshilliard force-pushed the bpf branch 2 times, most recently from 791c04f to d05a190 Compare August 12, 2021 09:22
/* libbpf is used via dlopen(), so rename symbols */
#define bpf_object__open_skeleton sym_bpf_object__open_skeleton
#define bpf_object__load_skeleton sym_bpf_object__load_skeleton
#define bpf_object__destroy_skeleton sym_bpf_object__destroy_skeleton
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think this approach should work?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure I follow, this was not necessary with the previous hook, why is it necessary now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously this was being done manually with a regex in the python script, however I didn't see a clean way to do that with meson directly. This seems to work though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these fixed? or can they change dynamically?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed I think, regex would likely break as well if they get changed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant, the regex catched any of these if new ones were added, while this is hard-coded - will there be new skeleton functions added? Is that possible?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think manual changes would typically also end up being required in a few places either way.

@bluca bluca removed the reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks label Aug 12, 2021
@poettering
Copy link
Member

/cc @keszybz @wat-ze-hex

@bluca bluca added the ci-fails/needs-rework 🔥 Please rework this, the CI noticed an issue with the PR label Aug 12, 2021
@jameshilliard jameshilliard force-pushed the bpf branch 4 times, most recently from e7ebde0 to c050465 Compare August 13, 2021 01:47
@jkartseva
Copy link

jkartseva commented Aug 13, 2021

@jameshilliard

IMO having bloated bpf/socket_bind/meson.build with three rules instead of one (clang compilation, llvm strip and skeleton generation via bpftool) is exposing too much implementation detail which makes adding new features to systemd inconvenient and scales poorly. As a feature developer, I would rather prefer not to know which options should be passed to clang, why llvm strip is needed, etc.

E.g. there are two other features pending #18145 and #18385. If this PR gets merged, this would mean dull nearly-copy-pasting of bpf/socket_bind/meson.build and all its unpleasantries. cc @iaguis @mauriciovasquezbernal

The goal of tools/build-bpf-skel.py you are removing is the opposite: to encapsulate building details and to make the overall process more friendly to feature developers.

FWIW, @poettering brought up integration with meson in the original PR introduced both libbpf and build script: #17655 (comment)
I.e. it could be bpf_skeleton rule natively supported by meson. Still it needs to be implemented and merged. I can find cycles to work on that if I understand the problem better.

Hence could you please provide a better description of the problem in this PR, similar to a standard template used for filing an issue (target platform, distribution, error message, etc).
It's not clear what exactly is broken from one-line description. Thanks!

@jameshilliard
Copy link
Contributor Author

IMO having bloated bpf/socket_bind/meson.build with three rules instead of one (clang compilation, llvm strip and bpftool) is exposing too much implementation detail which makes adding new features to systemd inconvenient and scales poorly.

This looks less bloated IMO since it eliminated the python script that is attempting to do the same thing.

As a feature developer, I would rather prefer not to know which options should be passed to clang, why llvm strip is needed, etc.

Yeah, that would be ideal, but at least this should be cleaner and more reliable.

The goal of bpf/socket_bind/meson.build you are removing is the opposite: to encapsulate building details and to make the overall process more friendly to feature developers.

I assume you are referring to tools/build-bpf-skel.py, the main problem is that it doesn't work right since it doesn't set up the compiler include flags correctly, it just hard codes some of the more common ones.

FWIW, @poettering brought up integration with meson in the original PR introduced both libbpf and build script: #17655 (comment)
I.e. it could be bpf_skeleton rule natively supported by meson. Still it needs to be implemented and merged. I can find cycles to work on that if I understand the problem better.

I did file an issue with meson. If we need something reuseable we can probably use a generator. I try to avoid external scripts as those tend to be difficult when it comes to making cross compilation work correctly.

Hence could you please provide a better description of the problem in this PR, similar to a standard template used for filing an issue (target platform, distribution, error message, etc).

I mean, it's super broken in any cross compilation scenario, for example:

>>> systemd custom Building
PATH="/home/buildroot/buildroot/output/per-package/systemd/host/bin:/home/buildroot/buildroot/output/per-package/systemd/host/sbin:/home/buildroot/bin:/home/buildroot/.local/bin:/home/buildroot/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin" LC_ALL=en_US.UTF-8 PYTHONNOUSERSITE=y /home/buildroot/buildroot/output/per-package/systemd/host/bin/ninja  -j33  -C /home/buildroot/buildroot/output/build/systemd-custom//build
ninja: Entering directory `/home/buildroot/buildroot/output/build/systemd-custom//build'
[14/953] Generating socket-bind.skel.h with a custom command
FAILED: src/core/bpf/socket_bind/socket-bind.skel.h 
/home/buildroot/buildroot/output/build/systemd-custom/tools/build-bpf-skel.py --clang_exec /usr/bin/clang --llvm_strip_exec /usr/bin/llvm-strip --bpftool_exec /home/buildroot/buildroot/output/per-package/systemd/host/sbin/bpftool --arch aarch64 ../src/core/bpf/socket_bind/socket-bind.bpf.c src/core/bpf/socket_bind/socket-bind.skel.h
../src/core/bpf/socket_bind/socket-bind.bpf.c:14:10: fatal error: 'bpf/bpf_endian.h' file not found
#include <bpf/bpf_endian.h>
         ^~~~~~~~~~~~~~~~~~
1 error generated.
Traceback (most recent call last):
  File "/home/buildroot/buildroot/output/build/systemd-custom/tools/build-bpf-skel.py", line 128, in <module>
    bpf_build(args)
  File "/home/buildroot/buildroot/output/build/systemd-custom/tools/build-bpf-skel.py", line 83, in bpf_build
    clang_compile(clang_exec=args.clang_exec,
  File "/home/buildroot/buildroot/output/build/systemd-custom/tools/build-bpf-skel.py", line 46, in clang_compile
    subprocess.check_call(clang_args)
  File "/home/buildroot/buildroot/output/per-package/systemd/host/lib/python3.9/subprocess.py", line 373, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/usr/bin/clang', '-Wno-compare-distinct-pointer-types', '-O2', '-target', 'bpf', '-g', '-c', '-D__aarch64__', '-I.', '-isystem', '/usr/include/x86_64-linux-gnu', '-idirafter', '/usr/local/include', '-idirafter', '/usr/include', '../src/core/bpf/socket_bind/socket-bind.bpf.c', '-o', '../src/core/bpf/socket_bind/socket-bind.bpf.o']' returned non-zero exit status 1.
``

@jkartseva
Copy link

jkartseva commented Aug 13, 2021

@jameshilliard

This looks less bloated IMO since it eliminated the python script that is attempting to do the same thing.

The script is reusable for other contributors, that was the point. When #18145 and #18385 are merged further modifying (e.g. changing compilation flags) would mean modifying in three places rather than in one, and this will only get worse when more features are added.
Please, let's keep the script until meson rule is available, or maintaining replicated meson.build files will be a nightmare.

I assume you are referring to tools/build-bpf-skel.py, the main problem is that it doesn't work right since it doesn't set up the compiler include flags correctly, it just hard codes some of the more common ones.

I agree, cross-compilation must be addressed, thanks for raising an issue. I didn't pay a proper attention to it in #17655 since it was huge enough.

I did file an issue with meson. If we need something reuseable we can probably use a generator. I try to avoid external scripts as those tend to be difficult when it comes to making cross compilation work correctly.

I mean, it's super broken in any cross compilation scenario, for example:

>>> systemd custom Building
PATH="/home/buildroot/buildroot/output/per-package/systemd/host/bin:/home/buildroot/buildroot/output/per-package/systemd/host/sbin:/home/buildroot/bin:/home/buildroot/.local/bin:/home/buildroot/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin" LC_ALL=en_US.UTF-8 PYTHONNOUSERSITE=y /home/buildroot/buildroot/output/per-package/systemd/host/bin/ninja  -j33  -C /home/buildroot/buildroot/output/build/systemd-custom//build
ninja: Entering directory `/home/buildroot/buildroot/output/build/systemd-custom//build'
[14/953] Generating socket-bind.skel.h with a custom command
FAILED: src/core/bpf/socket_bind/socket-bind.skel.h 
/home/buildroot/buildroot/output/build/systemd-custom/tools/build-bpf-skel.py --clang_exec /usr/bin/clang --llvm_strip_exec /usr/bin/llvm-strip --bpftool_exec /home/buildroot/buildroot/output/per-package/systemd/host/sbin/bpftool --arch aarch64 ../src/core/bpf/socket_bind/socket-bind.bpf.c src/core/bpf/socket_bind/socket-bind.skel.h
../src/core/bpf/socket_bind/socket-bind.bpf.c:14:10: fatal error: 'bpf/bpf_endian.h' file not found
#include <bpf/bpf_endian.h>
         ^~~~~~~~~~~~~~~~~~
1 error generated.
Traceback (most recent call last):
  File "/home/buildroot/buildroot/output/build/systemd-custom/tools/build-bpf-skel.py", line 128, in <module>
    bpf_build(args)
  File "/home/buildroot/buildroot/output/build/systemd-custom/tools/build-bpf-skel.py", line 83, in bpf_build
    clang_compile(clang_exec=args.clang_exec,
  File "/home/buildroot/buildroot/output/build/systemd-custom/tools/build-bpf-skel.py", line 46, in clang_compile
    subprocess.check_call(clang_args)
  File "/home/buildroot/buildroot/output/per-package/systemd/host/lib/python3.9/subprocess.py", line 373, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/usr/bin/clang', '-Wno-compare-distinct-pointer-types', '-O2', '-target', 'bpf', '-g', '-c', '-D__aarch64__', '-I.', '-isystem', '/usr/include/x86_64-linux-gnu', '-idirafter', '/usr/local/include', '-idirafter', '/usr/include', '../src/core/bpf/socket_bind/socket-bind.bpf.c', '-o', '../src/core/bpf/socket_bind/socket-bind.bpf.o']' returned non-zero exit status 1.
``

SG, will take a close look tomorrow (It's past 11PM atm).

@jameshilliard
Copy link
Contributor Author

Please, let's keep the script until meson rule is available, or maintaining replicated meson.build files will be a nightmare.

The script is kinda a maintainability/reliability nightmare IMO, I'll see if I can make this work as a reusable generator.

@jameshilliard
Copy link
Contributor Author

I'll see if I can make this work as a reusable generator.

I experimented there and it seems the generator feature is too broken to be usable for this. We can at least centralize the clang command params however to reduce code duplication a bit, should at least be more reliable than using external scripts.

@bluca bluca removed the ci-fails/needs-rework 🔥 Please rework this, the CI noticed an issue with the PR label Aug 13, 2021
#include "bpf-dlopen.h"
#include "bpf-link.h"
#include "bpf/socket_bind/socket-bind.skel.h"
#include "bpf/socket_bind/socket-bind-skel.h"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please stick to .bpf.c and .skel.h conventions, that are used pretty universally across many BPF-based projects

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bpftool generated socket-bind.skel.h file exists still with the same name as before, it just gets included by socket-bind-skel.h(which redefines the libbpf symbols to avoid linker issues). I can't rename it to socket-bind.skel.h as the file with that name already exists. Should I rename it to something different?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to socket-bind-skel-wrapper.h.

@anakryiko
Copy link

@wat-ze-hex note that with recent enough bpftool (we can lookup exact minimal version) BPF static linking is supported, which is a good feature in itself (allows to break up BPF files into smaller, potentially re-usable .bpf.c files), but it also make llvm-strip unnecessary (bpftool gen object command strips away all DWARF info anyways). So if systemd is fine with bumping whatever minimal bpftool version dependency they have, you might simplify build a tiny bit, but also will get static linking for BPF sources.

@jameshilliard
Copy link
Contributor Author

note that with recent enough bpftool (we can lookup exact minimal version) BPF static linking is supported, which is a good feature in itself (allows to break up BPF files into smaller, potentially re-usable .bpf.c files), but it also make llvm-strip unnecessary (bpftool gen object command strips away all DWARF info anyways).

Does this work when cross compiling?

@jameshilliard
Copy link
Contributor Author

Re-based with #18385 build handling as well.

@poettering
Copy link
Member

I can't really comment on this. Our resident Python/Meson guru is @keszybz, so I'll defer to him.

@jameshilliard jameshilliard force-pushed the bpf branch 5 times, most recently from e1c6854 to e1a4d3d Compare December 4, 2021 07:04
This should hopefully fix cross compilation for the bpf programs.
@jameshilliard
Copy link
Contributor Author

Rebased again, @keszybz did you get a chance to look at this?

Copy link
Member

@bluca bluca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks superficially fine to me, but needs another pair of eyes

@bluca bluca requested a review from keszybz December 4, 2021 11:07
@keszybz
Copy link
Member

keszybz commented Dec 7, 2021

IMO having bloated bpf/socket_bind/meson.build with three rules instead of one (clang compilation, llvm strip and skeleton generation via bpftool) is exposing too much implementation detail which makes adding new features to systemd inconvenient and scales poorly. As a feature developer, I would rather prefer not to know which options should be passed to clang, why llvm strip is needed, etc.

To some extent this is true. But it is better for us to have more stuff implemented directly in meson. Then we don't need to care about dependency handling, or debugging (-v is implemented for us), etc. And for developers it doesn't really matter if this complexity is in a .py helper or directly in meson, since this is all the same project. And I would say that the complexity added in this PR is not too big. There is some repetitions, but that's how meson is.

As I was looking at this PR, I actually noticed a minor issue with the previous implementation. The tool wrote the output files in $SOURCEDIR instead of $BUILDDIR. We never noticed because we still had *.o in .gitignore. By letting meson handle this for us, we avoid that without any effort.

OTOH, I'm not sure if the patch handles all dependencies correctly. When I do touch src/core/bpf/restrict_fs/restrict-fs.bpf.c && ninja -C build, it doesn't seem to rebuild enough stuff.

@keszybz
Copy link
Member

keszybz commented Dec 7, 2021

There seems to be a bug in meson. I'll report it separately. I think it doesn't matter too much, most people don't modify the .bpf.c files, and with a build from scratch this will not matter.

@keszybz keszybz merged commit d40ce01 into systemd:main Dec 7, 2021
@keszybz
Copy link
Member

keszybz commented Dec 7, 2021

It seems that there is no bug. Seems like I messed something up during testing.

$ touch src/core/bpf/restrict_fs/restrict-fs.bpf.c
$ ninja -C build
this rebuilds 4 targets:
src/core/bpf/restrict_fs/restrict-fs.bpf.unstripped.o
src/core/bpf/restrict_fs/restrict-fs.bpf.o
src/core/bpf/restrict_fs/restrict-fs.skel.h
version.h

but this rebuilds 114 targets:

$ sed -r -i 's/cgroup_id =/\0 1 +/' src/core/bpf/restrict_fs/restrict-fs.bpf.c
$ ninja -C build

So it seems meson/ninja are able to correctly shortcircuit the build when there are no changes.

Before this PR:

$ touch src/core/bpf/restrict_fs/restrict-fs.bpf.c
$ ninja -C build -v
34 targets
$ sed -r -i 's/cgroup_id =/\0 1 +/' src/core/bpf/restrict_fs/restrict-fs.bpf.c
$ ninja -C build
122 targets

The new scheme seems to handle dependencies better.

keszybz added a commit to keszybz/systemd that referenced this pull request Dec 8, 2021
Those made sense when autotoolz were used. With meson, everything
should land in the build dir, and this is only likely to obscure issues
with custom build rules.
C.f. systemd#20429 (comment)
@jameshilliard jameshilliard deleted the bpf branch December 16, 2021 21:55
@jameshilliard
Copy link
Contributor Author

@wat-ze-hex note that with recent enough bpftool (we can lookup exact minimal version) BPF static linking is supported, which is a good feature in itself (allows to break up BPF files into smaller, potentially re-usable .bpf.c files), but it also make llvm-strip unnecessary (bpftool gen object command strips away all DWARF info anyways). So if systemd is fine with bumping whatever minimal bpftool version dependency they have, you might simplify build a tiny bit, but also will get static linking for BPF sources.

@anakryiko FYI I've implemented this in #22317 with a fallback so that we don't have to bump the bpftool minimum version.

Yamakuzure pushed a commit to elogind/elogind that referenced this pull request Dec 7, 2022
Those made sense when autotoolz were used. With meson, everything
should land in the build dir, and this is only likely to obscure issues
with custom build rules.
C.f. systemd/systemd#20429 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

7 participants