policy: return ICMP "Destination unreachable" on ipv4 ingress policy denials#42068
policy: return ICMP "Destination unreachable" on ipv4 ingress policy denials#42068antonipp wants to merge 1 commit intocilium:mainfrom
Conversation
128984c to
1ba7880
Compare
|
/test |
julianwiedmann
left a comment
There was a problem hiding this comment.
(Unfortunately I don't have a quick way to set up a real cluster with tunneling rn)
Have you tried using the kind-based development setup?
bpf/bpf_lxc.c
Outdated
| #if defined(TUNNEL_MODE) | ||
| if (from_tunnel) { | ||
| struct remote_endpoint_info *info; | ||
| struct trace_ctx trace = { | ||
| .reason = TRACE_REASON_POLICY, | ||
| .monitor = TRACE_PAYLOAD_LEN, | ||
| }; | ||
|
|
||
| info = lookup_ip4_remote_endpoint(ip4->daddr, 0); | ||
| if (info && info->flag_has_tunnel_ep) { | ||
| ret = encap_and_redirect_lxc(ctx, info, SECLABEL_IPV4, | ||
| info->sec_identity, &trace, | ||
| bpf_htons(ETH_P_IP)); | ||
| if (ret == CTX_ACT_REDIRECT) { | ||
| update_metrics(ctx_full_len(ctx), metric_dir, | ||
| __DROP_REASON(verdict)); | ||
| return ret; | ||
| } | ||
| if (IS_ERR(ret)) | ||
| goto drop_err; | ||
| if (!revalidate_data(ctx, &data, &data_end, &ip4)) { | ||
| ret = DROP_INVALID; | ||
| goto drop_err; | ||
| } | ||
| } | ||
| } | ||
| #endif /* TUNNEL_MODE */ | ||
|
|
||
| if (is_defined(ENABLE_HOST_ROUTING)) { | ||
| ret = fib_redirect_v4(ctx, ETH_HLEN, ip4, false, false, &ext_err, &oif); | ||
| if (fib_ok(ret)) { | ||
| update_metrics(ctx_full_len(ctx), metric_dir, | ||
| __DROP_REASON(verdict)); | ||
| return ret; | ||
| } | ||
| } else { | ||
| ret = redirect_self(ctx); | ||
| if (!IS_ERR(ret)) { | ||
| update_metrics(ctx_full_len(ctx), metric_dir, | ||
| __DROP_REASON(verdict)); | ||
| return ret; | ||
| } |
There was a problem hiding this comment.
For reference - I'd expect us to follow this path here 1-to-1, as if the ICMP was genuinely sent out by the pod and needs forwarding to its destination. To the point where this might have to be a shared helper.
Unless there's cases in there which don't apply for an ICMP reply?
There was a problem hiding this comment.
Yeah, you are right, it was a bit weird re-creating this logic. I refactored the code to extract it to a shared helper function ipv4_forward_to_destination which is now called in both places. I also set up kind and tested the tunneling scenario (updated the PR description accordingly)
1ccf465 to
84836e6
Compare
|
/test |
This expands the functionality enabled by --policy-deny-response=icmp from egress-only to egress and ingress. Now ICMP responses can be sent for ingress policy denials as well. Signed-off-by: Anton Ippolitov <anton.ippolitov@datadoghq.com>
84836e6 to
48d74f0
Compare
|
/test |
julianwiedmann
left a comment
There was a problem hiding this comment.
Thank you! Had a quick first look to find my bearings, see some comments below.
Noting down the full(?) matrix of possible packet paths:
- arrival via {to-container, tailcall program}
- origin is {self, local pod, local host, local proxy, remote pod/host, world}
For the local delivery paths, we'll need to have a good eye on what expectations redirect_ep() has, to determine whether it can use bpf_redirect_peer().
| ret = redirect_self(ctx); | ||
|
|
||
| if (!IS_ERR(ret)) { | ||
| update_metrics(ctx_full_len(ctx), METRIC_EGRESS, __DROP_REASON(verdict)); | ||
| info = lookup_ip4_remote_endpoint(ip4->daddr, 0); |
There was a problem hiding this comment.
Ah, interesting that we can also use ipv4_forward_to_destination() for the egress-deny path! I'd suggest to do that in a follow-up PR though, so we can evaluate it on its own.
| __u8 metric_dir = (__u8)ctx_load_meta(ctx, CB_POLICY_DENY_DIR); | ||
| __u32 src_label = metric_dir == METRIC_EGRESS ? | ||
| SECLABEL_IPV4 : ctx_load_meta(ctx, CB_POLICY_DENY_SRC_ID); | ||
| __u32 dst_sec_identity = WORLD_IPV4_ID; |
There was a problem hiding this comment.
Hm we previously applied network policy for the packet, so we must have already determined the peer's identity. Ideally we're re-using that identity here, and not try to resolve it again.
That allows us to be consistent, and handles cases where the actual identity is different from what's in the ipcache (relevant for eg. proxy connections).
| ret = ipv4_forward_to_destination(ctx, ip4, NULL, &dst_sec_identity, | ||
| &ext_err, &ct_state_local, CT_NEW, | ||
| info, &trace, false, false, false, 0, 0); |
There was a problem hiding this comment.
Oh it's going to be fun figuring out all the little corner-cases from passing "default" values here :)
| ipv4_forward_to_destination(struct __ctx_buff *ctx, | ||
| struct iphdr *ip4, | ||
| struct ipv4_ct_tuple *tuple, | ||
| const __u32 *dst_sec_identity, |
There was a problem hiding this comment.
The dst_sec_identity can be a regular __u32 now I think, we're not writing to it from this function.
| static __always_inline int handle_ipv4_from_lxc(struct __ctx_buff *ctx, __u32 *dst_sec_identity, | ||
| __s8 *ext_err) | ||
| static __always_inline int | ||
| ipv4_forward_to_destination(struct __ctx_buff *ctx, | ||
| struct iphdr *ip4, |
There was a problem hiding this comment.
Please split this big patch into (at least) two parts, so it's easier to review (and partially revert if needed):
- pure refactor, only splitting off the
ipv4_forward_to_destination()logic. - adding all the new pieces for the ingress-deny ICMP message.
|
This pull request has been automatically marked as stale because it |
|
This pull request has been automatically marked as stale because it |
|
Closing this one, since I handed over my work on this to Alex #44184 |
Description
Follow-up to #41406. See #41859 for the full list of TODOs.
This PR adds the ICMP response functionality to ingress policy denials.
The functionality works for local and remote endpoints with/without tunneling and with/without eBPF Host Routing.
Testing
pingwas returning "Packet filtered":kindand ran the connectivity test suite: