Respect PodTopologySpread after rolling upgrades#111441
Respect PodTopologySpread after rolling upgrades#111441k8s-ci-robot merged 3 commits intokubernetes:masterfrom
Conversation
|
/sig scheduling |
6b1008e to
4283ef2
Compare
4283ef2 to
87a2954
Compare
87a2954 to
bdfffb7
Compare
|
/assign @ahg-g |
2584465 to
fdd8ad4
Compare
| MatchLabelKeys: []string{"/simple"}, | ||
| }, | ||
| }, | ||
| wantFieldErrors: []*field.Error{field.Invalid(fieldPathMatchLabelKeys, "/simple", "prefix part must be non-empty")}, |
There was a problem hiding this comment.
prefix part must be non-empty this error msg is hard to understand at the first glance until I checked the source which split the string by '/'.
you may want to add another test string, such as an empty string, I am fine with this as is though.
|
looks good to me @liggitt this is ready. |
| allErrs = append(allErrs, err) | ||
| } | ||
| allErrs = append(allErrs, validateMatchLabelKeys(subFldPath.Child("matchLabelKeys"), constraint.MatchLabelKeys, constraint.LabelSelector)...) | ||
| } |
There was a problem hiding this comment.
pre-existing, but I noticed validateTopologySpreadConstraints doesn't seem to validate constraint.LabelSelector at all... @ahg-g, can you open a separate issue to track that (need to make sure the scheduler is robust against completely invalid selectors, and consider how to fix this validation in the least disruptive way possible)
There was a problem hiding this comment.
Some update on this comments, the validation is actually done when the labels.Selector is built. e.g.
And it is validated here,
kubernetes/staging/src/k8s.io/apimachinery/pkg/labels/selector.go
Lines 191 to 194 in 891cbed
I think the original comment is still valid and the the validation mentioned above would mitigate the problem as the invalid label value won't be accepted at all.
There was a problem hiding this comment.
that happens at time of use, not time of API write, which is not ideal (since the user has no idea the object they created is not working like they want it to)
|
/milestone v1.25 |
|
/approve /hold looks like commits need squashing |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: denkensk, liggitt The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Signed-off-by: Alex Wang <wangqingcan1990@gmail.com>
Signed-off-by: Alex Wang <wangqingcan1990@gmail.com>
Signed-off-by: Alex Wang <wangqingcan1990@gmail.com>
6cdf9bd to
86a2a85
Compare
|
/lgtm |
|
/triage accepted |
What type of PR is this?
/kind feature
/kind api-change
What this PR does / why we need it:
PodTopologySpread is widely used in production environments, especially in service type workloads which employ Deployments. However, currently it has a limitation that manifests during rolling updates which causes the deployment to end up out of balance (98215, 105661,k8s-pod-topology spread is not respected after rollout).
The root cause is that PodTopologySpread constraints allow defining a key-value label selector, which applies to all pods in a Deployment irrespective of their owning ReplicaSet. As a result, when a new revision is rolled out, spreading will apply across pods from both the old and new ReplicaSets, and so by the time the new ReplicaSet is completely rolled out and the old one is rolled back, the actual spreading we are left with may not match expectations because the deleted pods from the older ReplicaSet will cause skewed distribution for the remaining pods.
Which issue(s) this PR fixes:
Fixes #98215
Fixes #105661
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: