Add support for L3-dependent L4 policies.#1599
Conversation
There was a problem hiding this comment.
if block ends with a return statement, so drop this else and outdent its block
There was a problem hiding this comment.
comment on exported method L4Policy.IngressCoversContext should be of the form "IngressCoversContext ..."
There was a problem hiding this comment.
receiver name filter should be consistent with previous receiver name l4 for L4Filter
There was a problem hiding this comment.
exported method PolicyMap.DeleteL4 should have comment or be unexported
There was a problem hiding this comment.
exported method PolicyMap.L4Exists should have comment or be unexported
There was a problem hiding this comment.
exported method PolicyMap.AllowL4 should have comment or be unexported
There was a problem hiding this comment.
if block ends with a return statement, so drop this else and outdent its block
There was a problem hiding this comment.
if block ends with a return statement, so drop this else and outdent its block
There was a problem hiding this comment.
comment on exported method L4Policy.IngressCoversContext should be of the form "IngressCoversContext ..."
There was a problem hiding this comment.
receiver name filter should be consistent with previous receiver name l4 for L4Filter
There was a problem hiding this comment.
exported method PolicyMap.DeleteL4 should have comment or be unexported
There was a problem hiding this comment.
exported method PolicyMap.L4Exists should have comment or be unexported
There was a problem hiding this comment.
exported method PolicyMap.AllowL4 should have comment or be unexported
There was a problem hiding this comment.
if block ends with a return statement, so drop this else and outdent its block
There was a problem hiding this comment.
if block ends with a return statement, so drop this else and outdent its block
There was a problem hiding this comment.
comment on exported method L4Policy.IngressCoversContext should be of the form "IngressCoversContext ..."
There was a problem hiding this comment.
receiver name filter should be consistent with previous receiver name l4 for L4Filter
There was a problem hiding this comment.
exported method PolicyMap.DeleteL4 should have comment or be unexported
There was a problem hiding this comment.
exported method PolicyMap.L4Exists should have comment or be unexported
There was a problem hiding this comment.
exported method PolicyMap.AllowL4 should have comment or be unexported
There was a problem hiding this comment.
if block ends with a return statement, so drop this else and outdent its block
|
Looks like it's breaking the proxy runtime tests. Investigating. |
9fa5fa7 to
05b8836
Compare
216052c to
0a4177b
Compare
|
CI passed, but actually the kubernetes runtime tests are still failing with the current branch. PR #1624 is tracking the fix for that. Still looking for review, though. |
0a4177b to
32ae8a6
Compare
tgraf
left a comment
There was a problem hiding this comment.
This is amazing work. I spotted no major flaws. The main question is on the intermediate commits which change tests and docs.
There was a problem hiding this comment.
Can we just drop this commit? I think you will undo this anyway in a later commit.
There was a problem hiding this comment.
Are you undoing all of these later on again? I did this commit to stop showing L3/L4 combined policies in the guides when we don't support them. We never merged it though. Given that we now support it, this commit is no longer required. So we can either just skip it here and rebase or undo it at the end.
There was a problem hiding this comment.
Let me look at skipping it here and see whether anything falls out.
joestringer
left a comment
There was a problem hiding this comment.
I posted a few questions that I have for reviewers of this patchset.
There was a problem hiding this comment.
Should we consider tracing either the positive / negative match cases for L4 policy?
There was a problem hiding this comment.
I think what you have is fine to get started.
There was a problem hiding this comment.
This comment can be dropped.
There was a problem hiding this comment.
@tgraf I wanted to double-check, if we have a policy like:
[{
"endpointSelector": {"matchLabels":{"id.bar":""}},
"ingress": [{
"fromEndpoints": [
{"matchLabels":{"reserved:host":""}},
{"matchLabels":{"id.foo":""}
}]
}]
}]
Then this means that we should apply the policy at id.bar that the fromEndpoint must include either the label reserved:host or id.foo. It does not require both labels. Correct?
There was a problem hiding this comment.
Correct, fromEndpoints is a slice and each member is an allowed combination of labels. matchLabels can include multiple labels if more than label must match.
There was a problem hiding this comment.
Any preferences on how this trace log should indicate that the label stage cannot select this rule because it includes L4 requirements? In practice I think this will end up with the trace output stating later on that "Label verdict: undecided", but I don't know if the above line clearly describes why this decision can't be made yet.
Related: #1610
There was a problem hiding this comment.
maybe specify how you cannot make a purely L3 / label-specific decision whether policy allows or denies? E.g.:
L3 traffic restricted to specific L4 destinations; cannot determine whether policy allows traffic purely based off of L3
There was a problem hiding this comment.
How about this, a little more terse:
Rule restricts traffic to specific L4 destinations; deferring policy decision to L4 policy stage
32ae8a6 to
5644815
Compare
Dropped intermediate docs/tests change
There was a problem hiding this comment.
Why isn't it this logic: iRule.From != nil && len(iRule.From) > 0?
There was a problem hiding this comment.
Good catch, will fix.
There was a problem hiding this comment.
The key size is not the same, what are the consequences for running instances in case we upgrade?
There was a problem hiding this comment.
Looks not great right now. I believe that after the new Cilium instance starts, it will attempt to regenerate the BPF (including PolicyMap) for existing endpoints, which will call down to OpenOrCreateMap(). If that latter function finds a map in the correct path, it won't validate that the map parameters are the same, so the existing map with a smaller valueSize will remain. Subsequently, when attempting to load the BPF program into the kernel, verification will fail due to access beyond the valueSize of the map. That will leave the container in a bad state.
I think that OpenOrCreateMap() should validate that all of the passed parameters are the same for the existing map, or delete it and attempt to create a new map to mitigate the above issue.
There was a problem hiding this comment.
Looks like there's already some validation in an alternative, OpenOrCreate() but not all of the maps are using this. Should be a minor extra patch.
There was a problem hiding this comment.
Commit 7c8f502 attempts to address this, would be keen to get some feedback on that.
There was a problem hiding this comment.
use the GetExtendedKeyFrom: matchLabels[labels.GetExtendedKeyFrom(k)] = v.Value
There was a problem hiding this comment.
use the GetExtendedKeyFrom: matchLabels[labels.GetExtendedKeyFrom(k)] = v.Value
There was a problem hiding this comment.
What's the difference between this line and line 80?
There was a problem hiding this comment.
Line 80 is saying that if the policy as resolved so far has an empty "FromEndpoints" list then it already covers the endpoints specified by the L4 filter that is currently being processed. Nil FromEndpoints would mean any endpoint can be used.
This line 85 is stating that if there are currently constraints on which endpoints this rule must receive traffic from in the policy, but the L4 Filter being processed allows contact from any Endpoint, then we need to loosen the current policy's restrictions on endpoints.
There was a problem hiding this comment.
I believe so, here's the commit which updates vendor.conf and imports the code:
1ae6a2c
There was a problem hiding this comment.
Can you add a comment why you are doing this?
|
Some updates:
|
There was a problem hiding this comment.
At the bottom of the NEWS.rst file, add the links that these PR's refer to.
There was a problem hiding this comment.
Ah, I thought that was automagic. OK.
There was a problem hiding this comment.
Ah, I thought that was automagic. OK.
There was a problem hiding this comment.
I have a vim script to this automatically for links that are not complete yet.
There was a problem hiding this comment.
... FromRequires field; the effects ...
There was a problem hiding this comment.
"this" is vague for a log message; which L4 Policy are we skipping? The log message should reflect which policy is being skipped.
There was a problem hiding this comment.
I'll update it to print the policy and the fromEndpoints.
There was a problem hiding this comment.
Ah, I'll actually drop this change out from this series because it's unrelated, but I do plan on following up on this separately.
There was a problem hiding this comment.
Yeah, I'll drop the patch that deleted this when I push my latest changes.
There was a problem hiding this comment.
This was an extra testing patch; I'll drop it.
e17e132 to
7b289ad
Compare
00ea5fb to
62a1e18
Compare
There was a problem hiding this comment.
Along with this change to the tracing output, I think we should add documentation to this file explaining L3-dependent L4 policy as well. I don't mind this being done after this patch is merged in the interest of having you not have to rebase again :)
There was a problem hiding this comment.
I'll take note to follow up on this. I think that the way that the documentation is written right now, it already implies the behaviour specified with this l3-dependent l4 work. I just don't think that we have any examples using multiple ingress rules, so we could highlight the behaviour difference. This could help people who end up in a situation where they're allowing L3 and allowing L4, but actually intended to only specify l3-dependent l4 policy.
There was a problem hiding this comment.
Great, sounds good! Please create a doc ticket for that so we have it tracked.
There was a problem hiding this comment.
Is there a GitHub issue for this 'in the future' task? Please create one if it does not exist.
There was a problem hiding this comment.
This is just something to note - how will we handle these types of 'collisions' in the future as more and more protocols are supported?
There was a problem hiding this comment.
Definitely something we should look at addressing by the time we consider supporting a third protocol.
There was a problem hiding this comment.
maybe specify how you cannot make a purely L3 / label-specific decision whether policy allows or denies? E.g.:
L3 traffic restricted to specific L4 destinations; cannot determine whether policy allows traffic purely based off of L3
There was a problem hiding this comment.
+1 for factoring stuff out into constants.
There was a problem hiding this comment.
I don't think you need to make this a function of Daemon. Let's just keep it private. Longer term I want to get rid of the Daemon struct
Previously, to perform L4 policy tracing through the Cilium policy, the user had to specify the "-d" flag. However, if for instance someone just wants to perform L4 policy tracing purely using k8s tags, this doesn't make sense. Loosen this requirement. Signed-off-by: Joe Stringer <joe@covalent.io>
Put the timeouts in a convenient one-stop location to update to speed up local testing. This also makes it easier to consider reducing these counts in future to speed up CI runs. This commit shortens the "drop" timeouts to 10s, as well. This is still long compared to the CT tests' 2s DROP_TIMEOUT, but it's closer to the current 14s. Signed-off-by: Joe Stringer <joe@covalent.io>
The GSG test was running the L7 portion of the test with both the L3/L4 policy, as well as the L3/L4/L7 policy, which could lead to unexpected behaviour. Fix it. Signed-off-by: Joe Stringer <joe@covalent.io>
No functional change. Signed-off-by: Joe Stringer <joe@covalent.io>
Always set the protocol to lower case when interpreting it from a k8s policy. Signed-off-by: Joe Stringer <joe@covalent.io>
When the PolicyRepository's AllowsRLocked() function is called, based on the naming convention and documentation we'd expect to determine that, for the entire SearchContext provided, we would receive a complete Label/L3/L4 decision based on the full policy. However, so far it has only represented the Label portion of the policy. The EndPoint package uses this to populate the BPF maps for label access, so split the label portion into a separate function and reuse it there. Then, make AllowsRLocked() great again by expanding its scope to include L4 policy resolution. Signed-off-by: Joe Stringer <joe@covalent.io>
Signed-off-by: Amre Shakimov <amre@covalent.io>
This commit extends the L4Filter to include a list of EndpointSelectors, which enables a future commit to narrow down which endpoints the L4 policy will ultimately apply to. For now, the existing paths will follow the paths with the WildcardEndpointSelector, which allows an L4Filter to apply to all traffic (as the L4-only policies already do today). Signed-off-by: Amre Shakimov <amre@covalent.io> Co-authored-by: Joe Stringer <joe@covalent.io> Signed-off-by: Joe Stringer <joe@covalent.io>
This simple test just checks that when specifying L7 rules during L4Filter creation, regardless of whether it's ingress or egress, and whether there are endpoints specified or not, the L4Filter ends up with an L7 rule. Signed-off-by: Joe Stringer <joe@covalent.io>
- Reject k8s NetworkPolicy rules with combined L3/L4 policy rules. - Don't generate api.Rule with with combined L3/L4 policy rules. Signed-off-by: Thomas Graf <thomas@cilium.io>
Reserve the right to support L4/L7 policies which depend on L3 matching in the future when the datapath supports this. For now, reject any such rules but document that this will be supported in the future. Signed-off-by: Thomas Graf <thomas@cilium.io> Co-authored-by: Joe Stringer <joe@covalent.io> Signed-off-by: Joe Stringer <joe@covalent.io>
If HAVE_L4_POLICY is set, the policy map will be looked up with the destination port and protocol set. Allows to establish L4 policies which are tied to a particular source identity. Signed-off-by: Thomas Graf <thomas@cilium.io> Co-authored-by: Joe Stringer <joe@covalent.io> Signed-off-by: Joe Stringer <joe@covalent.io>
The existing L3 (label-based) policy resolution functions such as AllowsLabelAccess(ctx) and CanReachRLocked(ctx) will now check the target rules for whether the rule also applies L4 constraints, and in these cases return an api.Undecided value. The L4PolicyMap now includes both regular L4-only L4Filters, which do not apply any constraints on the source endpoints, as well as the new L3+L4 combined filters. The existing L4-only test functions will return failure if they find an L4Filter that constrains which endpoints must originate the traffic. Repository.AllowsRLocked() now performs first, a label-based lookup which may permit traffic regardless of L4 rules. Secondly, it runs the L4Egress and L4Ingress policy lookups, which for Egress perfoms an L4-only lookup, whereas for Ingress performs the L3+L4 combined lookup, which may also explicitly Allow or Deny traffic. If either of these L4 policies deny the traffic, the traffic is denied. Otherwise, if at least one of the policies explicitly allows the traffic, then the traffic will overall be allowed. Signed-off-by: Joe Stringer <joe@covalent.io>
Add an extra test to ensure that the example 4 (union of two rules) policy is being correctly applied as a union, rather than intersection. Signed-off-by: Joe Stringer <joe@covalent.io>
Currently, the main daemon will typically attempt to rebuild the base BPF programs before it performs any checks of the existing maps. Unfortunately, that means that if we ever change attributes of a BPF map, then we can easily break in-place upgrade: - Existing Cilium is running with an old map layout - Containers are currently attached with policy enforcement enabled - New version of Cilium with different map layout is installed - Cilium service is restarted - On startup, Cilium calls into bpf/init.sh to compile the base programs - init.sh calls tc lo load the tunnel BPF program with the new map layout - The tc commandline validates the map info and sees that the layouts are different from the existing pinned map, and returns an error - This error is propagated back up to the Cilium startup function - Cilium reports the error and exits. This patch attempts to address the above failure by performing a map validation step during daemon initialization, before the call to rebuild the base BPF programs. It iterates through the maps in the filesystem and if any Cilium PolicyMaps are pinned with attributes that differ from the current attributes, it deletes these maps. Signed-off-by: Joe Stringer <joe@covalent.io>
f29adae to
c32b1f3
Compare
Add some examples for L3-dependent L4 policy as merged in cilium#1599. Fixes: cilium#1685 Signed-off-by: Joe Stringer <joe@covalent.io>
When specifying policies for ingress, users may provide
FromEndpointsorToPorts. These apply constraints on the access between endpoints based upon source labels (at L3) or destination ports (at L4). Up until this point, if a single rule specified bothFromEndpointsandToPorts, the policy would allow the union of these constraints rather than the intersection. For example, if you specified a single rule that requiredFromEndpoints:k8s:id=frontendas well asToPorts:80, then any traffic fromk8s:id=frontendwould be allowed regardless of ports, and any traffic destined to port 80 would be allowed regardless of the source labels.This PR fixes the behaviour to now apply the intersection: traffic must match all of the FromEndpoints constraints AND the ToPorts constraints to be allowed. Within Cilium, this means updating the Label-based policy lookup to return 'api.Undecided' when it finds a rule that specifies L4 constraints; updating the L4 lookup to also be smart enough to apply Label constraints; and ensuring that these policies can be correctly reflected down to the BPF maps. This last part is performed by extending the BPF PolicyMap to also perform lookups using L4 information. Cilium will expand out L4 policies into {SecurityID,port,protocol} tuples into this map.
In BPF, the PolicyMap may now hold entries which specify Ports and Protocols in addition to the SourceIdentities that have been used so far. The datapath logic first uses the source identity plus the packet's L4 information (port,protocol) to perform a lookup. If this matches, because for instance there is an L4-only policy applied and the daemon has pushed the policy down, or if it matches an L4-dependent policy which was pushed down, then traffic will be allowed. If this initial lookup fails, then the datapath will perform a second lookup in the PolicyMap, this time without L4 information. This is the regular L3/Label-based lookup path that already existed. Note that the new L4 or L3+L4 combined lookup only occurs during connection tracking entry creation, so the extra lookup cost should be bound to the first packet of each connection.
Fixes: #789
Fixes: #1064
Fixes: #1217
Fixes: #1496
Future work: