Host Policies can break kube if applied incorrectly. #12289

aanm merged 1 commit into cilium:master from
Conversation
qmonnet left a comment
Looks like a decent option to me.
@jedsalazar By the way, could you remove my GitHub handle from the commit message if you get a chance? Not that I have anything against it, but I will then receive notifications from GitHub anytime this commit is pulled somewhere. Quentin did that once and I ended up having notifications for random Cilium forks, folks rebasing their pull requests, etc. 😄
Was this intended to replace port 22 rather than 8472?
Do we want to drop port 22? I thought that was one of the few pieces that didn't completely brick the nodes in the cases where an incorrect policy was applied; at least someone can still SSH into the node and unload the BPF filters if necessary.
Maybe it's a good idea not to remove any of the existing ports, but instead to add one extra port that can be removed? If we drop 8472, I'm not sure whether that makes it more likely that users will deploy policies that block tunnel traffic.
this might have been accidentally removed?
What if we replace this policy with the following?
```yaml
apiVersion: "cilium.io/v2"
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: "lock-down-ingress-worker-node"
spec:
  description: "Allow a minimum set of required ports on ingress of worker nodes"
  nodeSelector:
    matchLabels:
      type: worker
  ingress:
  - fromEntities:
    - world
  - toPorts:
    - ports:
      - port: "6443"
        protocol: TCP
      - port: "2379"
        protocol: TCP
      - port: "4240"
        protocol: TCP
      - port: "8472"
        protocol: UDP
      - port: "PLEASE_ADD_PORTS_RELEVANT_TO_YOUR_ENV"
        protocol: ANY
```

That would ensure external access is never broken (unless there's a bug) while forcing the user to take a look and add the relevant ports.
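The placeholder-port idea above could be enforced with a simple pre-apply guard that refuses any manifest still carrying the example value. The following is only an illustrative sketch of that idea, not Cilium's actual validation code; the helper name `check_policy_manifest` is hypothetical.

```python
# Sketch of the "refuse to load until edited" idea discussed above.
# Not Cilium's implementation; check_policy_manifest is a hypothetical helper.

PLACEHOLDER = "PLEASE_ADD_PORTS_RELEVANT_TO_YOUR_ENV"


def check_policy_manifest(manifest_text: str) -> None:
    """Raise ValueError if the manifest still contains the example placeholder."""
    if PLACEHOLDER in manifest_text:
        raise ValueError(
            "refusing to apply policy: replace the placeholder port "
            "with the ports relevant to your environment"
        )
```

An operator (or a CI step) would run a check like this against the manifest before `kubectl apply`, so a policy that still contains the example value fails loudly instead of locking down the node.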
that's what I've been trying to say since the beginning 😢. However, the fromEntities does not seem correct in this case: the nodes won't be able to connect to kube-apiserver, since the policy only allows world and nodes are not world.
sorry for my misunderstanding @aanm. I think that's a great approach. For the fromEntities, do you think we should replace that with remote-node or cluster?
> sorry for my misunderstanding @aanm. I think that's a great approach. For the fromEntities, do you think we should replace that with remote-node or cluster?
@jedsalazar I think it shouldn't be either, mostly because new nodes won't have an identity assigned yet, and therefore none of the entities will work (well, world will probably work, but I believe that's not our goal).
this might have been accidentally removed?
b7f1331 to cb2d41e (Compare)
cb2d41e to badfb26 (Compare)
Host Policies can break kube if applied incorrectly. If a user unintentionally applies an overly aggressive host-level network policy, it will break access to their nodes and break kube. Given the blast radius of a mistake and the undocumented fix, I discussed an initial solution in the docs with @pchaigno, and we agreed that a good start is preventing the example policy from loading until the user changes the example values. This has the benefit of explicitly requiring some thought from users before applying a policy that could lead to a k8s outage. Happy to discuss alternative ideas here, but the goal is to protect operators from breaking their environments while testing this feature. Signed-off-by: Jed Salazar <jed@isovalent.com>
fc31435 to a8a84b0 (Compare)