Skip to content

Conversation

@pothos
Copy link
Contributor

@pothos pothos commented Apr 26, 2019

Takes a single /sys/fs/bpf/pinned_prog string as argument but may be
specified multiple times. An empty assignment resets all previous filters.

Closes #10227

I propose to use filter paths and not a single boolean with a predefined path to lookup the pinned BPF program. With the chosen approach of this PR multiple filters can be combined, e.g., to attach a filter for HTTP contents and a separate filter for certain IP addresses. It also allows to reuse filters between units where, e.g., multiple units could depend on a oneshot unit that loads the filter at a certain place.

I decided for absolute paths since it makes reading and using these values elsewhere easy. It is more clear for users to see what should be pinned where in order to work (and other directives do also not strip away /dev/ in /dev/sda1).

Does this make sense? I can change it to use relative paths etc. Did I forget something such as cleanup macros?
Currently, as with IPAddressDeny it is not a failure if the BPF program cannot be attached or bpf_program_new fails. The error is logged but the unit will still start. Should this be more strict since the IPAddressDeny options are defaults somewhere but this one here not?

What about weight 500 in the security analysis, is it too high? It's not recommended to have a custom filter but having it present may indicate a stronger firewalling. If egress, ingress or both are firewalled depends on the situation, so there is varying score, just on or off.

@pothos
Copy link
Contributor Author

pothos commented Apr 26, 2019

One more note about boolean arguments: How would they work with systemd-run?

Copy link
Member

@poettering poettering left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks pretty good. a number of comments though

@poettering poettering added pid1 reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks labels Apr 29, 2019
@poettering
Copy link
Member

Should this be more strict since the IPAddressDeny options are defaults somewhere but this one here not?

I think so, yes. I mean, the reason the IPAddressAllow= and IPAddressDeny= lines are gracefully skipped is that we want people to use them by default in upstream packaging (for example to limit traffic to localhost), and things shall still work when these unit files are used in environments where bpf is not available (because the kernel doesn't do it, or running in a container or such). But since these kind of manual bpf programs don't make much sense in similar upstream scenarios, since they always need manual setup, i would say these ones should be fatal by default, if they can't be installed.

@pothos pothos force-pushed the ip_filter_bpf_progs branch from db90ec5 to 1f0f3e7 Compare May 2, 2019 08:23
@pothos
Copy link
Contributor Author

pothos commented May 2, 2019

Thank you for the review and good hints! I have addressed the issues and think it is in a better state now.

Actually I removed the entry in systemd-analyze security because the new logic would mean that the badness is always off. But I took the existance of custom filters into account in the IPAddressDeny test.

I wanted to introduce error handling for cgroup_context_apply (with a definition for fatal errors to not change behavior) and propagate it to unit_realize_cgroup. What do you think? Since it looks like a bigger change, I decided to just work around it with a small test to load the BPF programs in unit_start to fail the start if the paths are not valid.

@pothos pothos changed the title bpf-firewall: support custom BPF programs through IPFilter(Ingress|Egress)= bpf-firewall: custom BPF programs through IP(Ingress|Egress)FilterPath= May 2, 2019
evverx added a commit to evverx/systemd that referenced this pull request May 3, 2019
The idea is to get the tests that are skipped in docker containers
to run in VMs to catch more issues. For example, there is a memory
leak in systemd#12419 that can't be
caugth because test-bpf isn't run there under ASan/UBsan:
```
==11144== HEAP SUMMARY:
==11144==     in use at exit: 16,425 bytes in 5 blocks
==11144==   total heap usage: 7,265 allocs, 7,260 frees, 2,659,529 bytes allocated
==11144==
==11144== 41 (16 direct, 25 indirect) bytes in 1 blocks are definitely lost in loss record 2 of 5
==11144==    at 0x4838748: malloc (vg_replace_malloc.c:308)
==11144==    by 0x483AD63: realloc (vg_replace_malloc.c:836)
==11144==    by 0x49897B4: strv_push (strv.c:398)
==11144==    by 0x4989ADD: strv_consume (strv.c:473)
==11144==    by 0x4989BD4: strv_extend (strv.c:512)
==11144==    by 0x45A71A: config_parse_ip_filter_bpf_progs (load-fragment.c:4501)
==11144==    by 0x41A565: main (test-bpf.c:206)
==11144==
==11144== LEAK SUMMARY:
==11144==    definitely lost: 16 bytes in 1 blocks
==11144==    indirectly lost: 25 bytes in 1 blocks
==11144==      possibly lost: 0 bytes in 0 blocks
==11144==    still reachable: 16,384 bytes in 3 blocks
==11144==         suppressed: 0 bytes in 0 blocks
==11144== Reachable blocks (those to which a pointer was found) are not shown.
```
@pothos pothos force-pushed the ip_filter_bpf_progs branch 4 times, most recently from 9883630 to a302ddb Compare May 3, 2019 14:23
@pothos
Copy link
Contributor Author

pothos commented May 3, 2019

Have addressed the issues (also ASAN), except moving the check from unit_start to unit_prepare_exec as mentioned in the review comment.

@evverx
Copy link
Contributor

evverx commented May 3, 2019

@pothos as it turns out, I was too optimistic about how easy it is to run test-bpf on Travis CI under ASan/UBsan :-) I hope @mrc0mmand will help to bring them to Arch (run on CentOS CI with vagrant) so that it could be covered. In the meantime, I ran the test under Valgrind manually and the memory leak seems to be gone.

By the way, there was an attempt to start using libbpf in #12151. I'm wondering if it would be possible to somehow sync these two PRs. cc @wat-ze-hex @rdna.

@rdna
Copy link

rdna commented May 4, 2019

@evverx this PR could save a few lines of code if libbpf was in systemd, specifically it has both bpf_obj_get and bpf_obj_pin (see https://github.com/libbpf/libbpf/blob/master/src/bpf.c).

Admittedly not too many though so IMO it should not be blocked by #12151 and it'll be easier to follow up later to switch to those libbpf functions if libbpf support is merged.

Copy link
Member

@poettering poettering left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks pretty good, we are coming close!

@pothos pothos force-pushed the ip_filter_bpf_progs branch from a302ddb to 00745c3 Compare May 31, 2019 16:33
@pothos
Copy link
Contributor Author

pothos commented May 31, 2019

Hi,
thanks for the review and sorry for the big delay. I've moved the call and confirmed that systemd-run behaves as expected if given a wrong path, both with and without --scope.

@iaguis iaguis requested a review from poettering June 18, 2019 18:37
Copy link
Member

@poettering poettering left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks excellent! just trivialities. one more round and we should be done!

@pothos pothos force-pushed the ip_filter_bpf_progs branch from 00745c3 to 7078fe2 Compare June 21, 2019 16:53
@pothos
Copy link
Contributor Author

pothos commented Jun 21, 2019

@poettering Have addressed it, thanks for bundling the review with actual code suggestions 👍

Copy link
Member

@poettering poettering left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great, just one buglet

@pothos pothos force-pushed the ip_filter_bpf_progs branch from 7078fe2 to 735220d Compare June 24, 2019 22:09
@pothos
Copy link
Contributor Author

pothos commented Jun 24, 2019

Thanks, done (:

@pothos pothos force-pushed the ip_filter_bpf_progs branch from 735220d to 57d5e18 Compare June 24, 2019 22:15
Takes a single /sys/fs/bpf/pinned_prog string as argument, but may be
specified multiple times. An empty assignment resets all previous filters.

Closes systemd#10227
@pothos pothos force-pushed the ip_filter_bpf_progs branch from 57d5e18 to 8550c24 Compare June 24, 2019 22:18
@poettering poettering merged commit fab3474 into systemd:master Jun 25, 2019
@poettering
Copy link
Member

Excellent! Thanks!

@poettering poettering removed the reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks label Jun 25, 2019
@pothos
Copy link
Contributor Author

pothos commented Jun 26, 2019

Actually forgot to add the NEWS entry to the commit 🙈

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Development

Successfully merging this pull request may close these issues.

RFE: BPF programs on cgroups

4 participants