What you would like to be added?
Convert nine known O(N×M) loops in the Grove operator's reconcile path to O(N+M) by replacing per-element slices.Contains / slices.IndexFunc / lo.Filter lookups with map[string]struct{} (or map[string]T) sets built once per sync context. The patterns are all the same shape: an inner linear scan over a slice that could be a single map lookup.
The nine call sites, ordered by likely impact (Pod-cardinality first, since Pods have the highest cardinality at scale):
| # |
File |
Function |
Current complexity |
Target complexity |
| 1 |
operator/internal/controller/podcliqueset/components/podgang/syncflow.go |
initializeAssignedAndUnassignedPodsForPCS |
O(totalPods × expectedPodGangs) |
O(totalPods + expectedPodGangs) |
| 2 |
operator/internal/controller/podclique/components/pod/syncflow.go |
checkAndRemovePodSchedulingGates |
O(podsInPCLQ × podRefsInPodGangForPCLQ) |
O(podsInPCLQ + podRefsInPodGangForPCLQ) |
| 3 |
operator/internal/controller/podclique/components/pod/rollingupdate.go |
getPodNamesPendingUpdate |
O(oldPods × deletionExpectedPodUIDs) |
O(oldPods + deletionExpectedPodUIDs) |
| 4 |
operator/internal/controller/podcliquescalinggroup/components/podclique/sync.go |
createExpectedPCLQs, createOrUpdatePCLQs |
O(expectedPCLQs × existingPCLQs) |
O(expectedPCLQs + existingPCLQs) |
| 5 |
operator/internal/controller/podcliqueset/components/podclique/podclique.go |
createOrUpdatePCLQs |
O(expectedStandalonePCLQs × existingStandalonePCLQs) |
O(expectedStandalonePCLQs + existingStandalonePCLQs) |
| 6 |
operator/internal/controller/podcliqueset/components/podcliquescalinggroup/podcliquescalinggroup.go |
syncPodCliqueScalingGroups |
O(expectedPCSGs × existingPCSGs) |
O(expectedPCSGs + existingPCSGs) |
| 7 |
operator/internal/controller/podcliqueset/components/podgang/syncflow.go |
getPodsForPodCliquesPendingCreation |
O(podGangPCLQs × existingPCLQs) |
O(podGangPCLQs + existingPCLQs) |
| 8 |
operator/internal/controller/podcliqueset/components/podgang/syncflow.go |
getPodGangNamesPendingCreation, getExcessPodGangNames |
O(expectedPodGangs × existingPodGangs) |
O(expectedPodGangs + existingPodGangs) |
| 9 |
operator/internal/controller/podcliqueset/components/podgang/syncflow.go |
getPodCliques |
O(podGangPCLQs × existingPCLQs) |
O(podGangPCLQs + existingPCLQs) |
The fix shape is identical for all of them: hoist the lookup target into a map[string]struct{} (or map[string]T where the value is needed) once at the start of the sync context, then replace each slices.Contains / slices.IndexFunc with an O(1) map lookup. Where the slice is mutable, prefer slices.DeleteFunc over lo.Filter — it is in-place and stdlib.
A representative example of the same pattern, fixed recently as part of #567, lives in operator/internal/controller/podcliqueset/reconcilestatus.go (computeAvailableAndUpdatedReplicas): see flattenNamesToSet + slices.DeleteFunc for a working template.
Why is this needed?
Grove aims to support almost any Pods amount per PodCliqueSet, distributable in any shape (one PodClique with 10 000 Pods, 5 000 PodCliques with 2 Pods, anything in between). The point of this issue is not that the operator falls over at 10 000 — it is that the quadratic shape of these loops makes the cost balloon long before the upper bound, and the wins from linearizing them are already meaningful at modest scale (~1 000 Pods).
Per-reconcile comparison counts
For two representative fleet shapes, here is the worst per-PCS-reconcile work for the affected hotspots:
| Shape |
N=1 000 Pods |
N=2 000 Pods |
N=10 000 Pods |
| Pod-heavy (1 PCLQ × N Pods) — hits hotspots 2, 3 |
~10⁶ per PCLQ sync |
~4×10⁶ |
~10⁸ |
| PCLQ-heavy (N/2 PCLQs × 2 Pods) — hits hotspots 1, 4, 5, 7, 8, 9 |
each ~2.5×10⁵; aggregate ~1.5×10⁶ |
each ~10⁶; aggregate ~6×10⁶ |
each ~2.5×10⁷; aggregate ~1.5×10⁸ |
A few observations:
- Even at N=1 000 — well below the upper target — the PCLQ-heavy shape produces low-millions of comparisons per reconcile, plus per-element slice allocations (
lo.Map, lo.FilterMap, lambda captures) on every hot path. With multi-second requeue cadences across multiple PCSes, this is already a measurable source of CPU and GC pressure; it just doesn't appear in CI/dev because dev clusters typically run at N≈10.
- N=10 000 is where it gets ugly. The same patterns produce 10⁸-class per-reconcile work — that is the regime where reconcile latencies climb into seconds and watch-event storms cascade across observers.
- The fix gets cheaper as scale grows. Each O(N²) loop becomes an O(N+M) map lookup, taking total work from
~10⁸ to ~10⁴ at N=10 000 — four orders of magnitude — for a one-shot map build per sync context.
What you would like to be added?
Convert nine known O(N×M) loops in the Grove operator's reconcile path to O(N+M) by replacing per-element
slices.Contains/slices.IndexFunc/lo.Filterlookups withmap[string]struct{}(ormap[string]T) sets built once per sync context. The patterns are all the same shape: an inner linear scan over a slice that could be a single map lookup.The nine call sites, ordered by likely impact (Pod-cardinality first, since Pods have the highest cardinality at scale):
operator/internal/controller/podcliqueset/components/podgang/syncflow.goinitializeAssignedAndUnassignedPodsForPCSO(totalPods × expectedPodGangs)O(totalPods + expectedPodGangs)operator/internal/controller/podclique/components/pod/syncflow.gocheckAndRemovePodSchedulingGatesO(podsInPCLQ × podRefsInPodGangForPCLQ)O(podsInPCLQ + podRefsInPodGangForPCLQ)operator/internal/controller/podclique/components/pod/rollingupdate.gogetPodNamesPendingUpdateO(oldPods × deletionExpectedPodUIDs)O(oldPods + deletionExpectedPodUIDs)operator/internal/controller/podcliquescalinggroup/components/podclique/sync.gocreateExpectedPCLQs,createOrUpdatePCLQsO(expectedPCLQs × existingPCLQs)O(expectedPCLQs + existingPCLQs)operator/internal/controller/podcliqueset/components/podclique/podclique.gocreateOrUpdatePCLQsO(expectedStandalonePCLQs × existingStandalonePCLQs)O(expectedStandalonePCLQs + existingStandalonePCLQs)operator/internal/controller/podcliqueset/components/podcliquescalinggroup/podcliquescalinggroup.gosyncPodCliqueScalingGroupsO(expectedPCSGs × existingPCSGs)O(expectedPCSGs + existingPCSGs)operator/internal/controller/podcliqueset/components/podgang/syncflow.gogetPodsForPodCliquesPendingCreationO(podGangPCLQs × existingPCLQs)O(podGangPCLQs + existingPCLQs)operator/internal/controller/podcliqueset/components/podgang/syncflow.gogetPodGangNamesPendingCreation,getExcessPodGangNamesO(expectedPodGangs × existingPodGangs)O(expectedPodGangs + existingPodGangs)operator/internal/controller/podcliqueset/components/podgang/syncflow.gogetPodCliquesO(podGangPCLQs × existingPCLQs)O(podGangPCLQs + existingPCLQs)The fix shape is identical for all of them: hoist the lookup target into a
map[string]struct{}(ormap[string]Twhere the value is needed) once at the start of the sync context, then replace eachslices.Contains/slices.IndexFuncwith an O(1) map lookup. Where the slice is mutable, preferslices.DeleteFuncoverlo.Filter— it is in-place and stdlib.A representative example of the same pattern, fixed recently as part of #567, lives in
operator/internal/controller/podcliqueset/reconcilestatus.go(computeAvailableAndUpdatedReplicas): seeflattenNamesToSet+slices.DeleteFuncfor a working template.Why is this needed?
Grove aims to support almost any Pods amount per PodCliqueSet, distributable in any shape (one PodClique with 10 000 Pods, 5 000 PodCliques with 2 Pods, anything in between). The point of this issue is not that the operator falls over at 10 000 — it is that the quadratic shape of these loops makes the cost balloon long before the upper bound, and the wins from linearizing them are already meaningful at modest scale (~1 000 Pods).
Per-reconcile comparison counts
For two representative fleet shapes, here is the worst per-PCS-reconcile work for the affected hotspots:
~10⁶per PCLQ sync~4×10⁶~10⁸2.5×10⁵; aggregate ~1.5×10⁶10⁶; aggregate ~6×10⁶2.5×10⁷; aggregate ~1.5×10⁸A few observations:
lo.Map,lo.FilterMap, lambda captures) on every hot path. With multi-second requeue cadences across multiple PCSes, this is already a measurable source of CPU and GC pressure; it just doesn't appear in CI/dev because dev clusters typically run at N≈10.~10⁸to~10⁴at N=10 000 — four orders of magnitude — for a one-shotmapbuild per sync context.