@jkartseva

The present support for custom bpf cgroup programs was introduced by #12419 and is limited to one specific use case: IP filtering, with filters passed as cgroup_skb bpf programs loaded to bpffs. The programs are attached with a hardcoded BPF_F_ALLOW_MULTI flag.
Extend the bpf infra by allowing other bpf cgroup prog types along with configurable attachment flags.
Introduce the BpfCgroupProgram unit file option, which accepts a bpf program specifier consisting of attach type, attach flags and bpffs path. The program type can be retrieved from the fd of a loaded program, hence it's not part of the interface.
The supported prog types are consistent with the 5.2 kernel [1].
This PR modifies the handling of IP(Ingress|Egress)FilterPath so that it is built into the new pipeline.
The absence of libbpf is a limitation, hence the need to copy-paste the code of libbpf helpers and the bpf uapi. The good news is that the Fedora packaging process has been kicked off [2].
Aside from libbpf, a follow-up to this PR is to completely separate the logic of custom bpf cgroup progs from the bpf_firewall subsystem. It's not done in this PR, to reduce the number of non-functional changes and simplify the reviewers' work. This PR may be split to separate functional and non-functional changes.

431c206 adds libbpf helpers;
713fa80 adds a specifier struct for bpf program;
03fa374 and 31a6956 modify the handling of IP(Ingress|Egress)FilterPath options so they are processed in a generic manner;
604e7f4 and cba1fbc pass attach_type and attach_flags from the specifier into the depths of bpf_firewall;
cbfa52a syncs linux/bpf.h with 5.2 kernel;
9c881ae adds string utils for converting attach_(type|flags) to string and vice versa;
3410c68 adds BpfCgroupProgram option to unit file and its handling and parser ut;
3410c68 extends bpf-test with several BpfCgroupProgram instances;
52f1660 adds documentation.
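
The string utils mentioned in the commit list above convert attach types to strings and back. A minimal sketch of such a lookup table follows. The enum, the function names and the spelling of the strings are illustrative assumptions, not the code from this PR; the real attach types are defined by enum bpf_attach_type in linux/bpf.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical subset of attach types, for illustration only. */
typedef enum ExampleAttachType {
        EXAMPLE_ATTACH_INGRESS,
        EXAMPLE_ATTACH_EGRESS,
        EXAMPLE_ATTACH_SOCK_CREATE,
        EXAMPLE_ATTACH_DEVICE,
        EXAMPLE_ATTACH_TYPE_MAX,
        EXAMPLE_ATTACH_TYPE_INVALID = -1,
} ExampleAttachType;

/* One table serves both directions, so the names cannot drift apart. */
static const char *const example_attach_type_table[EXAMPLE_ATTACH_TYPE_MAX] = {
        [EXAMPLE_ATTACH_INGRESS]     = "ingress",
        [EXAMPLE_ATTACH_EGRESS]      = "egress",
        [EXAMPLE_ATTACH_SOCK_CREATE] = "sock_create",
        [EXAMPLE_ATTACH_DEVICE]      = "device",
};

/* Returns the name of the attach type, or NULL if it is out of range. */
const char *example_attach_type_to_string(ExampleAttachType t) {
        if (t < 0 || t >= EXAMPLE_ATTACH_TYPE_MAX)
                return NULL;
        return example_attach_type_table[t];
}

/* Returns the attach type for a name, or EXAMPLE_ATTACH_TYPE_INVALID. */
ExampleAttachType example_attach_type_from_string(const char *s) {
        assert(s);

        for (int i = 0; i < EXAMPLE_ATTACH_TYPE_MAX; i++)
                if (strcmp(example_attach_type_table[i], s) == 0)
                        return (ExampleAttachType) i;

        return EXAMPLE_ATTACH_TYPE_INVALID;
}
```

Deriving both conversions from a single table keeps the string-to-int and int-to-string directions consistent.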

[1] https://elixir.bootlin.com/linux/v5.2.13/source/include/uapi/linux/bpf.h
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1745478

@jkartseva jkartseva changed the title bpf: expand bpf cgroup program support bpf: extend bpf cgroup program support Sep 7, 2019
@yuwata yuwata added cgroups ci-fails/needs-rework 🔥 Please rework this, the CI noticed an issue with the PR pid1 labels Sep 8, 2019
@jkartseva jkartseva force-pushed the custom-bpf-progs-parameterized-3 branch from 52f1660 to ee9073b Compare September 16, 2019 09:58
@mrc0mmand
Member

Tests under nspawn seem to fail with:

systemd-update-done.service: ConditionNeedsUpdate=|/var succeeded.
systemd-update-done.service: ConditionNeedsUpdate=|/etc succeeded.
systemd-update-done.service: BPF_F_ALLOW_MULTI not supported on this manager, cannot attach custom BPF programs.
systemd-update-done.service: Failed to run 'start' task: Operation not supported
systemd-update-done.service: Failed with result 'resources'.
systemd-update-done.service: Changed dead -> failed
systemd-update-done.service: Job 46 systemd-update-done.service/start finished, result=failed
[FAILED] Failed to start Update is Completed.
...
[  OK  ] Reached target Basic System.
systemd-logind.service: BPF_F_ALLOW_MULTI not supported on this manager, cannot attach custom BPF programs.
systemd-logind.service: Failed to run 'start-pre' task: Operation not supported
systemd-logind.service: Failed with result 'resources'.
systemd-logind.service: Changed dead -> failed
systemd-logind.service: Job 63 systemd-logind.service/start finished, result=failed
[FAILED] Failed to start Login Service.
...

@jkartseva jkartseva force-pushed the custom-bpf-progs-parameterized-3 branch from ee9073b to cae14c9 Compare September 16, 2019 12:21
@jkartseva
Author

@mrc0mmand

Tests under nspawn seem to fail with:

Thanks, I root caused that failure. Now I'm running into

Traceback (most recent call last):
  File "./agent-control.py", line 388, in <module>
    node, ssid = ac.allocate_node(args.version, args.arch)
  File "./agent-control.py", line 104, in allocate_node
    host = jroot["hosts"][0]
UnboundLocalError: local variable 'jroot' referenced before assignment
Traceback (most recent call last):
  File "./agent-control.py", line 466, in <module>
    ac.free_session(ssid)
NameError: name 'ssid' is not defined

which seems to be a problem with the test itself.

@mrc0mmand
Member

which seems to be a problem with the test itself.

That's right, there was a brief outage in the CentOS CI infra, which should be fixed by now. I re-triggered all affected jobs, so the errors should go away once there's a free spot in the queue (about an hour or so right now).

@mrc0mmand
Member

Now there's a few memory leaks:

==40499==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 64 byte(s) in 1 object(s) allocated from:
    #0 0x7f55365605a8 in calloc (/usr/lib/clang/8.0.1/lib/linux/libclang_rt.asan-x86_64.so+0x1305a8)
    #1 0x7f5535b15a68 in bpf_program_new /build/build/../src/shared/bpf-program.c:19:13
    #2 0x55d15075e9aa in test_bpf_cgroup_program /build/build/../src/test/test-bpf.c:113:21
    #3 0x55d15075db1d in test_bpf_cgroup_programs /build/build/../src/test/test-bpf.c:145:9
    #4 0x55d15075bcb6 in main /build/build/../src/test/test-bpf.c:369:9
    #5 0x7f553524eee2 in __libc_start_main (/lib64/libc.so.6+0x26ee2)

Direct leak of 64 byte(s) in 1 object(s) allocated from:
    #0 0x7f5536560820 in realloc (/usr/lib/clang/8.0.1/lib/linux/libclang_rt.asan-x86_64.so+0x130820)
    #1 0x7f5535c7d2bf in greedy_realloc /build/build/../src/basic/alloc-util.c:63:13
    #2 0x7f5535b1629d in bpf_program_add_instructions /build/build/../src/shared/bpf-program.c:61:14
    #3 0x55d15075e9db in test_bpf_cgroup_program /build/build/../src/test/test-bpf.c:117:21
    #4 0x55d15075db1d in test_bpf_cgroup_programs /build/build/../src/test/test-bpf.c:145:9
    #5 0x55d15075bceb in main /build/build/../src/test/test-bpf.c:370:9
    #6 0x7f553524eee2 in __libc_start_main (/lib64/libc.so.6+0x26ee2)

Direct leak of 64 byte(s) in 1 object(s) allocated from:
    #0 0x7f5536560820 in realloc (/usr/lib/clang/8.0.1/lib/linux/libclang_rt.asan-x86_64.so+0x130820)
    #1 0x7f5535c7d2bf in greedy_realloc /build/build/../src/basic/alloc-util.c:63:13
    #2 0x7f5535b1629d in bpf_program_add_instructions /build/build/../src/shared/bpf-program.c:61:14
    #3 0x55d15075e9db in test_bpf_cgroup_program /build/build/../src/test/test-bpf.c:117:21
    #4 0x55d15075db1d in test_bpf_cgroup_programs /build/build/../src/test/test-bpf.c:145:9
    #5 0x55d15075bcb6 in main /build/build/../src/test/test-bpf.c:369:9
    #6 0x7f553524eee2 in __libc_start_main (/lib64/libc.so.6+0x26ee2)

Direct leak of 64 byte(s) in 1 object(s) allocated from:
    #0 0x7f55365605a8 in calloc (/usr/lib/clang/8.0.1/lib/linux/libclang_rt.asan-x86_64.so+0x1305a8)
    #1 0x7f5535b15a68 in bpf_program_new /build/build/../src/shared/bpf-program.c:19:13
    #2 0x55d15075e9aa in test_bpf_cgroup_program /build/build/../src/test/test-bpf.c:113:21
    #3 0x55d15075db1d in test_bpf_cgroup_programs /build/build/../src/test/test-bpf.c:145:9
    #4 0x55d15075bceb in main /build/build/../src/test/test-bpf.c:370:9
    #5 0x7f553524eee2 in __libc_start_main (/lib64/libc.so.6+0x26ee2)

Direct leak of 64 byte(s) in 1 object(s) allocated from:
    #0 0x7f5536560820 in realloc (/usr/lib/clang/8.0.1/lib/linux/libclang_rt.asan-x86_64.so+0x130820)
    #1 0x7f5535c7d2bf in greedy_realloc /build/build/../src/basic/alloc-util.c:63:13
    #2 0x7f5535b1629d in bpf_program_add_instructions /build/build/../src/shared/bpf-program.c:61:14
    #3 0x55d15075e9db in test_bpf_cgroup_program /build/build/../src/test/test-bpf.c:117:21
    #4 0x55d15075db1d in test_bpf_cgroup_programs /build/build/../src/test/test-bpf.c:145:9
    #5 0x55d15075bc81 in main /build/build/../src/test/test-bpf.c:368:9
    #6 0x7f553524eee2 in __libc_start_main (/lib64/libc.so.6+0x26ee2)

Indirect leak of 64 byte(s) in 1 object(s) allocated from:
    #0 0x7f5536560820 in realloc (/usr/lib/clang/8.0.1/lib/linux/libclang_rt.asan-x86_64.so+0x130820)
    #1 0x7f5535c7d2bf in greedy_realloc /build/build/../src/basic/alloc-util.c:63:13
    #2 0x7f5535b1629d in bpf_program_add_instructions /build/build/../src/shared/bpf-program.c:61:14
    #3 0x55d15075e9db in test_bpf_cgroup_program /build/build/../src/test/test-bpf.c:117:21
    #4 0x55d15075db1d in test_bpf_cgroup_programs /build/build/../src/test/test-bpf.c:145:9
    #5 0x55d15075bceb in main /build/build/../src/test/test-bpf.c:370:9
    #6 0x7f553524eee2 in __libc_start_main (/lib64/libc.so.6+0x26ee2)

Indirect leak of 64 byte(s) in 1 object(s) allocated from:
    #0 0x7f5536560820 in realloc (/usr/lib/clang/8.0.1/lib/linux/libclang_rt.asan-x86_64.so+0x130820)
    #1 0x7f5535c7d2bf in greedy_realloc /build/build/../src/basic/alloc-util.c:63:13
    #2 0x7f5535b1629d in bpf_program_add_instructions /build/build/../src/shared/bpf-program.c:61:14
    #3 0x55d15075e9db in test_bpf_cgroup_program /build/build/../src/test/test-bpf.c:117:21
    #4 0x55d15075db1d in test_bpf_cgroup_programs /build/build/../src/test/test-bpf.c:145:9
    #5 0x55d15075bcb6 in main /build/build/../src/test/test-bpf.c:369:9
    #6 0x7f553524eee2 in __libc_start_main (/lib64/libc.so.6+0x26ee2)

SUMMARY: AddressSanitizer: 448 byte(s) leaked in 7 allocation(s).

Full log: https://ci.centos.org/job/systemd-pr-build-vagrant-sanitizers/1889/artifact//systemd-centos-ci/artifacts_all/artifacts_h9QjQ3/vagrant-logs.Hqq/vagrant-arch-sanitizers-clang-testsuite.kiV/ninja-test_sanitizers_FAIL.log

@jkartseva jkartseva force-pushed the custom-bpf-progs-parameterized-3 branch from cae14c9 to 1f601e3 Compare September 17, 2019 04:23
@jkartseva
Author

@mrc0mmand
Thanks, fixed the leak in test-bpf.c.

@mrc0mmand mrc0mmand removed the ci-fails/needs-rework 🔥 Please rework this, the CI noticed an issue with the PR label Sep 17, 2019
@cdown
Member

cdown commented Sep 24, 2019

@rgushchin, maybe you have comments? This is tagged with cgroups but I don't really feel qualified to express an opinion on something of this size in BPF land.

@rgushchin
Contributor

I like the idea, and it's definitely useful!
I've a question about the proposed interface: do we use json strings anywhere else?
Also, it's not very clear to me if it's ok to pass multiple BpfCgroupProgram options (I guess yes), and what will happen in some edge cases (e.g. multiple options without the "multi" flag).
Also I don't really like yet another attach type string <-> int conversion table, we have them everywhere, but I've no recipe how to make it better. Maybe it's something that libbpf can do?

@cdown cdown added the bpf label Sep 24, 2019
@jkartseva
Author

jkartseva commented Sep 26, 2019

I like the idea, and it's definitely useful!

Thanks :)

I've a question about the proposed interface: do we use json strings anywhere else?

I came up with JSON because it's easy to extend and parse given that corresponding parsing tools are already present in systemd, though agree that curly braces may be confusing to a user.
I'm fine to change the input string to smth like
attach_type:cgroup_egress bpffs_path:/path/in/bpffs

Also, it's not very clear to me if it's ok to pass multiple BpfCgroupProgram options (I guess yes),

Yes, passing multiple BpfCgroupProgram options will work. Program specifiers append to a list.

and what will happen in some edge cases (e.g. multiple options without the "multi" flag).

Good point, will write a UT covering forgotten "multi" flag case.

Also I don't really like yet another attach type string <-> int conversion table, we have them everywhere, but I've no recipe how to make it better. Maybe it's something that libbpf can do?

  1. libbpf RPM is not yet available
  2. libbpf can't do that yet but I have a patch set for that. TL;DR: needs v2.

@rgushchin
Contributor

I like the idea, and it's definitely useful!

Thanks :)

I've a question about the proposed interface: do we use json strings anywhere else?

I came up with JSON because it's easy to extend and parse given that corresponding parsing tools are already present in systemd, though agree that curly braces may be confusing to a user.
I'm fine to change the input string to smth like
attach_type:cgroup_egress bpffs_path:/path/in/bpffs

Maybe @poettering can advise on how the interface should look here.
There is no reason to make it bpf-specific.

Also, it's not very clear to me if it's ok to pass multiple BpfCgroupProgram options (I guess yes),

Yes, passing multiple BpfCgroupProgram options will work. Program specifiers append to a list.

and what will happen in some edge cases (e.g. multiple options without the "multi" flag).

Good point, will write a UT covering forgotten "multi" flag case.

Also I don't really like yet another attach type string <-> int conversion table, we have them everywhere, but I've no recipe how to make it better. Maybe it's something that libbpf can do?

  1. libbpf RPM is not yet available
  2. libbpf can't do that yet but I have a patch set for that. TL;DR: needs v2.

Cool, it's really great!

@poettering
Member

Heya, sorry for not reviewing this earlier, this somehow managed to escape me...

Hmm, so I guess the general concept is acceptable. I think we should put more emphasis on making the BPF stuff more of an implementation detail than a primary feature, i.e. I'd love to see the bind() stuff being exposed as an AllowPorts= option that is usable for regular admins and does not require a PhD in advanced bpfology. That said, there is probably also value in allowing generic programs to be attached, but this should be something that we also do, and not primarily... hence, I am not dismissing this PR altogether, it just appears to me it would be better to expose this in a more high-level API first...

Using JSON for this sounds unnecessary, it's just a triplet of strings and we need no full blown object embedded here... Moreover we never used JSON embedded in unit files so far, and this doesn't look like the right place to use it... In other simple cases like this we just used a colon separator for this, i.e.

BPFProgram=egress:/some/path

I am not convinced we should make override/multi configurable at this point, we don't make it configurable for the other cases either... Let's for now stick to automatic determination of the flags like we do for the other bpf progs. I mean ideally we'd never allow subcgroups to unmask what their parent slices installed, cgroups are nested for a reason after all...

anyway, closer review follows.
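
The colon-separated form suggested above (BPFProgram=egress:/some/path) can be split with very little code. The sketch below only illustrates the idea: the function name, the buffer-based interface and the rule that the path must be absolute are assumptions, not the parser that was eventually merged.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Splits "type:/path" at the first colon and copies both halves into
 * caller-provided buffers. Returns 0 on success or a negative errno. */
int example_parse_bpf_program_spec(
                const char *s,
                char *type, size_t type_size,
                char *path, size_t path_size) {

        const char *colon;
        size_t type_len, path_len;

        assert(s);
        assert(type);
        assert(path);

        colon = strchr(s, ':');
        if (!colon || colon == s)
                return -EINVAL;         /* no separator, or empty type */

        /* Assumed rule: the program path must be absolute. */
        if (colon[1] != '/')
                return -EINVAL;

        type_len = (size_t) (colon - s);
        path_len = strlen(colon + 1);
        if (type_len >= type_size || path_len >= path_size)
                return -ENAMETOOLONG;

        memcpy(type, s, type_len);
        type[type_len] = '\0';
        memcpy(path, colon + 1, path_len + 1); /* includes the trailing NUL */

        return 0;
}
```

Splitting at the first colon means the type can never contain a colon, while the path may.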

Member

@poettering poettering left a comment

Sorry again for the late review.

@poettering
Member

anyway, just to clarify my comments above: this can go in (of course with the code review points above fixed), I just hope that we get the higher-level features based on BPF too, i.e. AllowPorts= and such

@poettering poettering added the reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks label Jan 3, 2020
@jkartseva jkartseva force-pushed the custom-bpf-progs-parameterized-3 branch from 1f601e3 to 79ef15f Compare September 18, 2020 19:49
@jkartseva
Author

jkartseva commented Sep 18, 2020

@poettering

UPDATE: don't mind this comment, it is outdated.

Hello, thanks for reviewing, and sorry that v2 took so long. Hope the pandemic goes easy on you.
This work is not abandoned; on the contrary, here is v2.

v2 is a major rework. It refactors the existing BPF infra, specifically moving to abstractions such as a
cgroup-bpf resource to hold fds of attached BPF objects and a cgroup bpf context to store inputs from the fragment parsers of BPF-related options.
The goal is to accumulate the knowledge about supported BPF features in a single place and to provide a convenient interface for feature code.
Take attach flags as an example, e.g. in bpf_firewall.
Currently, to determine the proper attach flags, each feature has to know about other features of the same attach type. This is bad because:

  • When writing a new feature, a programmer has to modify the code of existing features of the same type.
  • A programmer has to have end-to-end knowledge of the BPF infra and all the supported features.

To bring it to a common denominator, the cgroup_bpf_get_suggested_attach_flags method is introduced. It checks the cgroup target mask to get info about the programs required by the unit and its children. It also counts how many BPF programs are requested, e.g. with the IP(Ingress|Egress)FilterPath option.
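
As a rough illustration of the idea, the decision could be reduced to a function of a few inputs. This is a sketch under assumptions and not the code in this PR: the function name, its parameters and the exact policy are hypothetical. Only the flag value mirrors BPF_F_ALLOW_MULTI from linux/bpf.h.

```c
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the value of BPF_F_ALLOW_MULTI in linux/bpf.h; redefined here so
 * the sketch builds without kernel headers. */
#define EXAMPLE_BPF_F_ALLOW_MULTI (1U << 1)

/* Hypothetical policy: request "multi" whenever more than one program has to
 * coexist on the cgroup, either because the unit itself asks for several
 * programs or because its children need their own. Otherwise attach
 * exclusively (no flags). */
unsigned example_suggested_attach_flags(
                bool multi_supported,
                bool children_need_programs,
                size_t n_programs) {

        if (!multi_supported)
                return 0;

        if (children_need_programs || n_programs > 1)
                return EXAMPLE_BPF_F_ALLOW_MULTI;

        return 0;
}
```

With the policy in one place, a new feature of the same attach type only has to report how many programs it wants to install, instead of inspecting the other features.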


Going further, my proposal regarding the libbpf dependency: let's not introduce it for the legacy code, but make it a requirement for new features that depend on syscalls which are not present in the legacy syscall wrappers lib.
Wrap the LIBBPF_API function calls needed for the new features in a conditional macro:

#if HAVE_LIBBPF
        /* call LIBBPF_API function */
#else
        return -ENOTSUP;
#endif

For now, leave the existing syscalls, e.g. BPF_PROG_(LOAD|ATTACH), as is. This guarantees that legacy features won't break if the dependency is not present.
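
A fuller sketch of that guard might look as follows. The wrapper name is hypothetical, and HAVE_LIBBPF is assumed to come from the build system (it defaults to 0 here so the snippet builds on its own). bpf_prog_attach() is the low-level libbpf call declared in bpf/bpf.h.

```c
#include <errno.h>

#ifndef HAVE_LIBBPF
#define HAVE_LIBBPF 0 /* normally set by the build system; assumed off here */
#endif

#if HAVE_LIBBPF
#include <bpf/bpf.h>
#endif

/* Hypothetical wrapper for a new feature: attaches a loaded program to a
 * cgroup through libbpf when it is available, and reports "not supported"
 * otherwise. Returns 0 on success or a negative errno. */
int example_new_feature_attach(int prog_fd, int cgroup_fd, int attach_type, unsigned flags) {
#if HAVE_LIBBPF
        if (bpf_prog_attach(prog_fd, cgroup_fd, (enum bpf_attach_type) attach_type, flags) < 0)
                return -errno;

        return 0;
#else
        (void) prog_fd;
        (void) cgroup_fd;
        (void) attach_type;
        (void) flags;

        return -ENOTSUP;
#endif
}
```

Callers can then treat -ENOTSUP like any other unsupported-feature condition, and legacy code paths never reference libbpf symbols.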

BTW, libbpf gained a good representation in distributions, including Debian and Arch.

The next question to address is whether to make the libbpf dependency static. Currently it's dynamic for consistency with other deps, but to eliminate the runtime requirement it would have to be statically linked.

Lastly, this PR introduces the BPFProgram= option and its harness: fragment parser, cgroup context, group support.

This PR is rather large and combines refactoring and new feature code; I'm fine with focusing on a part of it, followed by another PR.

Hmm, so I guess the general concept is acceptable I think we should put more emphasis on making the BPF stuff more implementation detail than primary feature. i.e. I'd love to see the bind() stuff being exposed as AllowPorts=

Those are my plans for future work: AllowPorts= backed by cgroup/bind{4|6} hooks, and a connection-based firewall for cgroup/connect{4|6} hooks. This will be the first instance of a BPF program in the form of source code compiled along with the rest of systemd.

Sorry again for late v2 and thanks.

@jkartseva jkartseva force-pushed the custom-bpf-progs-parameterized-3 branch 4 times, most recently from 3d8df8b to 5c9a9a5 Compare September 19, 2020 08:44
@keszybz
Member

keszybz commented Apr 12, 2021

Thanks.

@keszybz keszybz merged commit 839eb4a into systemd:main Apr 12, 2021
@jkartseva jkartseva deleted the custom-bpf-progs-parameterized-3 branch April 12, 2021 17:41
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 17, 2021
Introduce BPF program compiled from BPF source code in
restricted C - socket_bind.
It addresses feature request [0].

The goal is to allow systemd services to bind(2) only to a predefined set
of ports. This prevents assigning socket address with unallowed port
to a socket and creating servers listening on that port.

This complements the firewalling feature present in systemd:
whereas cgroup/{egress|ingress} hooks act on packets, they do not
protect against an untrusted service or payload hijacking an important
port.

While ports in the 0-1023 range are restricted to root only, the
1024-65535 range is not protected by any means.

Performance is another aspect of socket_bind feature since per-packet
cost can be eliminated for some port-based filtering policies.

The feature is implemented with cgroup/bind{4|6} hooks [1].
In contrast to the present systemd approach using raw bpf instructions,
this program is compiled from sources. A stretch goal is to
make the bpf ecosystem in systemd friendlier for developers and to
clear the path for more BPF programs.

[0] systemd#13496 (comment)
[1] https://www.spinics.net/lists/netdev/msg489054.html
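
The decision such a bind hook has to make can be modelled in plain userspace C. The sketch below is not the socket_bind BPF program itself (which would be built for the BPF target and attached to cgroup/bind{4|6}); the type and function names are hypothetical, and it only shows the allow-list check on the requested port.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive port range, in host byte order. */
typedef struct ExamplePortRange {
        uint16_t first;
        uint16_t last;
} ExamplePortRange;

/* Returns true if the port falls into one of the allowed ranges.
 * An empty allow list denies every bind in this model. */
bool example_bind_allowed(uint16_t port, const ExamplePortRange *allowed, size_t n_allowed) {
        assert(allowed || n_allowed == 0);

        for (size_t i = 0; i < n_allowed; i++)
                if (port >= allowed[i].first && port <= allowed[i].last)
                        return true;

        return false;
}
```

Because the check runs once per bind(2) call instead of once per packet, it illustrates why the commit message expects a lower cost for port-based policies.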
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 19, 2021
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 19, 2021
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 19, 2021
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 20, 2021
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 20, 2021
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 20, 2021
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 20, 2021
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 20, 2021
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 20, 2021
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 20, 2021
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 21, 2021
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 21, 2021
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 21, 2021
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 21, 2021
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 24, 2021
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 24, 2021
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 24, 2021
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 25, 2021
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 26, 2021
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 26, 2021
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 26, 2021
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 26, 2021
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 26, 2021
jkartseva pushed a commit to jkartseva/systemd that referenced this pull request Apr 26, 2021
mrc0mmand pushed a commit to mrc0mmand/rhel-9 that referenced this pull request Jul 2, 2021
mrc0mmand pushed a commit to mrc0mmand/rhel-9 that referenced this pull request Jul 2, 2021