feat: Add replicas and minAvailable fields for PodCliquesScalingGroups #116
Ronkahn21
left a comment
I will complete the review later
Ronkahn21
left a comment
I have some questions regarding the base vs. individual pod gangs.
Do we have documentation explaining why we decided on this flow?
Thanks for the PR @julienmancuso!
1/n review, covers the API, and the webhooks primarily.
Signed-off-by: Julien Mancuso <jmancuso@nvidia.com>
renormalize
left a comment
3/n. This is about it from me. Sorry for the delayed reviews; the PR is very large after all! Thanks!
Replicas *int32 `json:"replicas,omitempty"`
// MinAvailable specifies the minimum number of ready replicas required for the group to be considered operational.
// A scaling group replica is considered "ready" when its associated PodClique has sufficient ready Pods
// (PodClique.Status.ReadyReplicas >= PodClique.Status.MinAvailable), where a Pod is ready when its PodReady condition is True.
Please remove this as this has changed.
// MinAvailable specifies the minimum number of ready replicas required for the group to be considered operational.
// A scaling group replica is considered "ready" when its associated PodClique has sufficient ready Pods
// (PodClique.Status.ReadyReplicas >= PodClique.Status.MinAvailable), where a Pod is ready when its PodReady condition is True.
// If MinAvailable is breached, it will trigger gang-termination of the podGangs.
Suggested change:
- // If MinAvailable is breached, it will trigger gang-termination of the podGangs.
+ // If MinAvailable is breached, it will be used to signal that the PodCliqueScalingGroup is no longer operating with the desired availability.
Will this not trigger gang-termination after the terminationDelay?
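For readers following this thread, here is a minimal Go sketch of the readiness rule as quoted in the field comment above (ReadyReplicas >= MinAvailable per replica, then a count against the group's MinAvailable). It is only an illustration under assumptions, not the PR's implementation: `cliqueStatus`, `replicaReady`, and `minAvailableBreached` are hypothetical names, each replica is simplified to a single PodClique, a nil MinAvailable is assumed to default to 1, and the review notes that the quoted comment may already be outdated. It says nothing about what happens after a breach (signalling vs. gang-termination after terminationDelay), which is the open question here.

```go
package main

// cliqueStatus is a hypothetical, trimmed-down stand-in for the two PodClique
// status fields named in the quoted comment. It is not the real API type.
type cliqueStatus struct {
	ReadyReplicas int32
	MinAvailable  int32
}

// replicaReady applies the rule from the quoted comment:
// a replica is ready when ReadyReplicas >= MinAvailable.
func replicaReady(s cliqueStatus) bool {
	return s.ReadyReplicas >= s.MinAvailable
}

// minAvailableBreached reports whether fewer than minAvailable replicas are ready.
// Assumption: a nil minAvailable (field omitted) is treated as 1.
func minAvailableBreached(replicas []cliqueStatus, minAvailable *int32) bool {
	required := int32(1)
	if minAvailable != nil {
		required = *minAvailable
	}
	var ready int32
	for _, r := range replicas {
		if replicaReady(r) {
			ready++
		}
	}
	return ready < required
}
```

With three replicas of which two satisfy the rule, `minAvailableBreached` returns false for a MinAvailable of 2 and true for a MinAvailable of 3.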
renormalize
left a comment
Thanks for addressing all the comments!
unmarshall
left a comment
Thanks for addressing the review comments
Here is an example for :
it creates these podgangs :

base podgang :
podgang for replica 3 :
podgang for replica 4 :
and the following pods (the replica pods stay in the Pending state until all the pods of the base podgang are running) :

and podcliques :
