Prevent nodes from updating taints#63167
Conversation
|
Aren't existing conditions expected to be transitioned to taints in which case k'let will place and remove taints? |
They already are, but are translated by the node controller, not set directly by the kubelet (which works really well to allow a limited set of conditions to affect taints without giving the kubelet full control over them)
Because kubelets removing/modifying their own taints removes a primary tool an administrator has for preventing nodes from accepting certain workloads (imagine a "compromised" or "unverified" taint set on a node intended to keep the scheduler from sending workloads to it) See kubernetes/community#911 for a fuller discussion of the issues with letting kubelets steer workloads via self-labeling or self-untainting |
|
cc @kubernetes/sig-auth-pr-reviews @kubernetes/sig-node-pr-reviews @kubernetes/sig-scheduling-pr-reviews |
|
The diff is ok to me :) There's a discussion (#62311) on replacing node condition with taints, will we add white list for node conditions? |
|
Would it impact these scenario (cloud specific taint being applied on the fly/runtime): https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/aws/aws.go#L1777 ? |
No, that is only called from AttachDisk, which has been done by the attach/detach controller, not the kubelet, since 1.3 xref #55517 |
If we want the kubelet to start setting/updating/removing taints on itself directly, we can discuss the pros/cons of whitelisting specific taints at that point. I could see whitelisting the set that is automatically translated from node-reported conditions->taints, but would want that to be an intentional decision. |
That's ok to me, and PR is LGTM; waiting for mikedanese@'s comments if any :) |
|
I would like to restrict this since nothing uses it, and loosen the restriction as needed going forward. @vishh has a reasonable usecase. How should a node agent out of core request evacuation? Does the node agent need an accompanying controller that understands a new node condition? |
I agree.
I could see any of the following (roughly in order of preference):
|
|
cc @kubernetes/sig-cluster-lifecycle-pr-reviews |
|
/lgtm |
|
Let's get this in and relax constraints when we need to. |
486d088 to
15bcfd5
Compare
|
rebased |
|
Download error for bazel build
/retest |
|
/test pull-kubernetes-e2e-kops-aws |
Separate controller manager has race condition issue between condition <-> taints :( , xref #63897 |
|
/lgtm help to re-lgtm :) |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: k82cn, liggitt, mikedanese The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
/test all [submit-queue is verifying that this PR is safe to merge] |
|
Automatic merge from submit-queue (batch tested with PRs 63167, 63357). If you want to cherry-pick this change to another branch, please follow the instructions here. |
Prevents kubelets from modifying or removing taints on update.
Nodes can set taints when they register themselves, but do not update/remove those taints after creation (that is done by the node controller based on reported node conditions).
xref kubernetes/community#911 kubernetes/enhancements#279
/sig node
/sig auth
/sig scheduling
/assign @mikedanese @k82cn