Enhance PGS and PCLQ reconcilers to support PodGang lifecycle management #95
Merged
* Partially implements Pod component. With this commit pods without scheduling gates can be created.
* Enables Pod component in the PodClique reconciler.
* Fixed HPA component, which now correctly sets the target resource ref.
* Fixed service component, which now creates a headless service.
* Introduced some convenience functions.
* Changed the pgs-replica-index label key as it did not follow the allowed conventions.
Signed-off-by: madhav bhargava <madhav.bhargava@sap.com>
* Added missing license headers to new files.
* Fixed linting issues.
* Fixed formatting issues.
Signed-off-by: madhav bhargava <madhav.bhargava@sap.com>
* Fixed scaling in HPA component.
* Fixed Role and RoleBinding components, which now only create.
* Adapted the alias for grove core API in component files.
* Initial code for PodGang component.
* Added scheduler API as a dependency in go.mod.
Signed-off-by: madhav bhargava <madhav.bhargava@sap.com>
Signed-off-by: madhav bhargava <madhav.bhargava@sap.com>
…PodCliques`. Signed-off-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com>
Signed-off-by: madhav bhargava <madhav.bhargava@sap.com>
Signed-off-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com>
Signed-off-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com>
* Fixed internal types used by PodGang syncFlow.
* PGS reconciler now listens for PCLQ update events.
* PodGang CRDs are now copied when deploying grove operator.
* Removed syncer.go as this is now replaced with syncflow.go.
Signed-off-by: madhav bhargava <madhav.bhargava@sap.com>
* Added code in Pod component to add scheduling gate when creating pods.
* Fixed operator/hack/prepare-local-deploy.sh to reflect the changes in PodGang CRD.
Signed-off-by: madhav bhargava <madhav.bhargava@sap.com>
Signed-off-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com>
* Refactored PCLQ reconciler.
* PCLQ reconciler now watches PodGang create/delete events.
* Fixed pcsg.Status.Selector.
* Refactored PodGang syncflow.
Signed-off-by: madhav bhargava <madhav.bhargava@sap.com>
Signed-off-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com>
Signed-off-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com>
Signed-off-by: madhav bhargava <madhav.bhargava@sap.com>
* WIP commit for Pod component.
* Minor rearrangement of hpaInfo in HPA component.
Signed-off-by: madhav bhargava <madhav.bhargava@sap.com>
Signed-off-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com>
* Changed the order of components in PGS reconcile spec flow. Signed-off-by: madhav bhargava <madhav.bhargava@sap.com>
Signed-off-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com>
* Moving `syncExistingPodGangs` to run every reconciliation enables the `schedulingGate`s to be removed. Signed-off-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com>
Signed-off-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com>
…to them.
* Scaling in a `PodClique` caused unexpected behavior where more pods than expected were deleted by the `PodClique` controller. Multiple events are raised during the entire flow, which causes multiple requeues for the same `PodClique`. The `List` call made in the controller for `Pod`s returns them in a non-deterministic order, and for each requeue handled by a different worker, a different `Pod` was chosen for deletion. To avoid this, `Pod`s are currently deleted based on `creationTimestamp`.
Signed-off-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com>
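The `creationTimestamp`-based deletion described in this commit can be illustrated with a small, standard-library-only sketch. It is not the code from this PR: `Pod`, `newPod` and `selectPodsToDelete` are hypothetical names, and deleting the newest pods first with a name tie-break is an assumption, since the commit message does not say which end of the ordering is chosen.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Pod is a hypothetical, trimmed-down stand-in for corev1.Pod.
type Pod struct {
	Name              string
	CreationTimestamp time.Time
}

// newPod builds a Pod created at the given Unix second (illustration only).
func newPod(name string, createdUnix int64) Pod {
	return Pod{Name: name, CreationTimestamp: time.Unix(createdUnix, 0)}
}

// selectPodsToDelete returns the names of `excess` pods to delete.
// The choice depends only on creation time and name, never on the order
// in which the input slice happens to be listed.
func selectPodsToDelete(pods []Pod, excess int) []string {
	if excess <= 0 {
		return nil
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := make([]Pod, len(pods))
	copy(sorted, pods)
	sort.SliceStable(sorted, func(i, j int) bool {
		ti, tj := sorted[i].CreationTimestamp, sorted[j].CreationTimestamp
		if !ti.Equal(tj) {
			return ti.After(tj) // newest first (assumption, see lead-in)
		}
		return sorted[i].Name < sorted[j].Name // tie-break for determinism
	})
	if excess > len(sorted) {
		excess = len(sorted)
	}
	names := make([]string, 0, excess)
	for _, p := range sorted[:excess] {
		names = append(names, p.Name)
	}
	return names
}

func main() {
	a := []Pod{newPod("pclq-a", 100), newPod("pclq-b", 300), newPod("pclq-c", 200)}
	b := []Pod{a[2], a[0], a[1]} // same pods, different list order
	fmt.Println(selectPodsToDelete(a, 1), selectPodsToDelete(b, 1))
}
```

Because every worker derives the same ordering from the same set of pods, repeated requeues pick the same victims instead of a different `Pod` each time.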
… the `PodGang` name. Signed-off-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com>
Signed-off-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com>
* Now filtering terminating pods when fetching existing pods in PodGang component.
Signed-off-by: madhav bhargava <madhav.bhargava@sap.com>
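As a rough illustration of filtering terminating pods, the sketch below relies on the Kubernetes convention that an object being deleted has a non-nil `deletionTimestamp`. The `Pod` type, `terminatingPod` and `filterNonTerminating` are hypothetical stand-ins written for this example, not the helpers used in the PodGang component.

```go
package main

import (
	"fmt"
	"time"
)

// Pod is a hypothetical, trimmed-down stand-in for corev1.Pod.
type Pod struct {
	Name              string
	DeletionTimestamp *time.Time // non-nil once deletion has been requested
}

// terminatingPod builds a Pod that is already marked for deletion
// (illustration only).
func terminatingPod(name string) Pod {
	now := time.Now()
	return Pod{Name: name, DeletionTimestamp: &now}
}

// isTerminating reports whether the pod has been marked for deletion.
func isTerminating(p Pod) bool {
	return p.DeletionTimestamp != nil
}

// filterNonTerminating returns only the pods that are not being deleted.
func filterNonTerminating(pods []Pod) []Pod {
	kept := make([]Pod, 0, len(pods))
	for _, p := range pods {
		if !isTerminating(p) {
			kept = append(kept, p)
		}
	}
	return kept
}

func main() {
	pods := []Pod{{Name: "pclq-0"}, terminatingPod("pclq-1"), {Name: "pclq-2"}}
	for _, p := range filterNonTerminating(pods) {
		fmt.Println(p.Name)
	}
}
```

Excluding terminating pods keeps a pod that is on its way out from being counted as an existing member when deciding what still needs to be created.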
* Removed usage of Owns and now use Watches to watch for PodClique events.
* In PodClique register, now listening for PodGang Create/Update/Delete events.
Signed-off-by: madhav bhargava <madhav.bhargava@sap.com>
…` changes. Signed-off-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com>
Signed-off-by: madhav bhargava <madhav.bhargava@sap.com>
* Fixed formatting issues.
* Refactored the Pod component to fix the PCSG scaling issue. The current implementation had issues.
* In this commit, code to delete excess pods is introduced but commented out as it has not been tested yet.
Signed-off-by: madhav bhargava <madhav.bhargava@sap.com>
* Added PCLQ labels to Pods created for the PCLQ.
* Introduced pod deletion for excess pods.
* Requeue interval moved to a constant and its usage fixed across PCLQ and PGS reconcilers.
* Fixed the issue where too many pods were created when PCSG is scaled.
Signed-off-by: madhav bhargava <madhav.bhargava@sap.com>
* Fixed PodGang component where it was eagerly creating PodGangs. Signed-off-by: madhav bhargava <madhav.bhargava@sap.com>
Signed-off-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com>
Signed-off-by: Saketh Kalaga <51327242+renormalize@users.noreply.github.com>
This PR introduces the following changes:
* `PodClique.Status.ScheduleGatedReplicas` to capture the number of schedule gated replicas for a PCLQ.
* `PodGang` resources.
* `PodClique`, `PodCliqueScalingGroup`, `Pod` get additional labels
* `Pod.Spec.SchedulingGates`
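To make the `ScheduleGatedReplicas` status field concrete, here is a minimal sketch of how such a count could be derived from `Pod.Spec.SchedulingGates`. It assumes a replica counts as schedule gated when it still carries at least one gate. The types, the `countScheduleGated` helper and the gate name are all hypothetical, and the PR may compute the value differently.

```go
package main

import "fmt"

// SchedulingGate and Pod are hypothetical, trimmed-down stand-ins for
// corev1.PodSchedulingGate and corev1.Pod.
type SchedulingGate struct{ Name string }

type Pod struct {
	Name            string
	SchedulingGates []SchedulingGate
}

// countScheduleGated returns how many pods still carry at least one
// scheduling gate (assumed definition of a "schedule gated" replica).
func countScheduleGated(pods []Pod) int32 {
	var n int32
	for _, p := range pods {
		if len(p.SchedulingGates) > 0 {
			n++
		}
	}
	return n
}

func main() {
	gate := SchedulingGate{Name: "example.com/podgang-pending"} // hypothetical gate name
	pods := []Pod{
		{Name: "pclq-0", SchedulingGates: []SchedulingGate{gate}},
		{Name: "pclq-1"}, // gate already removed
		{Name: "pclq-2", SchedulingGates: []SchedulingGate{gate}},
	}
	fmt.Println(countScheduleGated(pods))
}
```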