bound UpdateProgress status payload #576
Merged
oleg-kushniriov merged 7 commits on May 11, 2026
danbar2
reviewed
May 5, 2026
shayasoolin
reviewed
May 5, 2026
gflarity
reviewed
May 5, 2026
gflarity
left a comment
Contributor
One nit, otherwise LGTM.
shayasoolin
previously approved these changes
May 7, 2026
gflarity
reviewed
May 9, 2026
gflarity
left a comment
Contributor
Missed the counts being added to deprecated status fields (via structs). Let's chat about this briefly during the standup or design sync. Maybe I'm missing something.
94c7cdc to 7b3d718
danbar2
previously approved these changes
May 10, 2026
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
7b3d718 to f3f36f6
shayasoolin
reviewed
May 10, 2026
shayasoolin
approved these changes
May 11, 2026
danbar2
approved these changes
May 11, 2026
What type of PR is this?
/kind bug
/kind api
What this PR does / why we need it:
The `UpdateProgress.UpdatedPodCliques` and `UpdateProgress.UpdatedPodCliqueScalingGroups` fields on `PodCliqueSet`/`PodCliqueScalingGroup` status were unbounded `[]string` slices that grew linearly with the total number of PodCliques/PCSGs in the cluster. At the scale Grove targets, this produced two failure modes well before the etcd 1.5 MiB per-object limit:

1. […] `kubectl watch`). At ~1,000 PodCliques the payload was ~120 KiB; at ~13,000 it would exceed etcd's per-object limit. The deprecated `RollingUpdateProgress` mirror doubled the cost.
2. Reconciler O(N²) hot path. The cleanup block in `rollingupdate.go` did `lo.Uniq(append(...))` followed by `slices.DeleteFunc` over the expected-FQN list every reconcile. A second O(M·E) pattern in `reconcileStatus` re-flattened `lo.Values(map)` per filtered element.

This PR replaces the unbounded slices with bounded `int32` count + total fields:

- `PodCliqueSetUpdateProgress`: `UpdatedPodCliquesCount`, `TotalPodCliquesCount`, `UpdatedPodCliqueScalingGroupsCount`, `TotalPodCliqueScalingGroupsCount`.
- `PodCliqueScalingGroupUpdateProgress`: `UpdatedPodCliquesCount`, `TotalPodCliquesCount`.

Counts are recomputed each reconcile from child `CurrentPodCliqueSetGenerationHash` labels already in the informer cache. This is idempotent by construction (no drift on scale-in, requeue, status-write retry, or controller restart) and O(N) instead of O(N²). The stray-resource filter is hoisted to a `map[string]struct{}` lookup with `slices.DeleteFunc`, making it O(M+E) instead of O(M·E).

The deprecated `RollingUpdateProgress` mirror is preserved with the new bounded shape so existing consumers keep working at the field-presence level. Slice sub-fields are gone from the mirror as well.

Adds four `kubectl get pcs` printer columns and two `kubectl get pcsg` printer columns surfacing the new counts.

Which issue(s) this PR fixes:
Fixes #567
Special notes for your reviewer:
- […] `pcs.Status.UpdateProgress.UpdatedPodCliques` (or the deprecated mirror's slice copy) will now find the field absent. The replacement is `UpdatedPodCliquesCount`/`TotalPodCliquesCount`. Justified at `v0.1.0-alpha.8`; called out in release notes.
- […] `pri.pclqs` was always loaded with `KindPodCliqueSet` owner). PCSG-owned PCLQs are tracked on their owning PCSG via `PodCliqueScalingGroupUpdateProgress.UpdatedPodCliquesCount`. We discussed during review whether to aggregate at the PCS level; current implementation keeps existing semantics.
- `RollingUpdateProgress` is not removed (already deprecated; deprecation contract honored). Its inner slice fields are gone but timestamps + `CurrentlyUpdating` + the new counts are mirrored. Consumers iterating the old slices will see `nil`/empty and should migrate.
- […] `computeAvailableAndUpdatedReplicas` count outputs, `mutateReplicas` write paths (PCS + PCSG), and the count helpers (`countUpdatedPCLQs`, `countUpdatedPCSGs`, `countPCSGReplicaUpdatedPCLQs`, `flattenNamesToSet`). All target the new canonical fields, not the deprecated mirror.
- `zz_generated.deepcopy.go`, both CRD YAMLs (`api/core/v1alpha1/crds/` + `charts/crds/`), and `docs/api-reference/operator-api.md` are regenerated via `make generate` and `hack/generate-api-docs.sh`. `make lint` and `go vet` (default + `-tags=e2e`) clean. Full unit-test suite green except for one pre-existing `TestDecodeOperatorConfig` failure in `api/config/v1alpha1` that is unrelated to this change and reproduces on `main`.

Does this PR introduce a API change?