Introduce validations for TopologyConstraints in PodCliqueSet#317
Conversation
* Introduced validations for TopologyConstraints defined at different levels in PodCliqueSet. Signed-off-by: Madhav Bhargava <madhav.bhargava@sap.com>
Signed-off-by: Madhav Bhargava <madhav.bhargava@sap.com>
Signed-off-by: Madhav Bhargava <madhav.bhargava@sap.com>
especially when you wish to recreate KAI topologies CR to reflect changes in ClusterTopology CR. KAI topologies are immutable and therefore needs to be deleted and then created afresh. Signed-off-by: Madhav Bhargava <madhav.bhargava@sap.com>
| ) | ||
|
|
||
| // IsTopologyDomainNarrower returns true if the current TopologyDomain is narrower in scope than the other TopologyDomain. | ||
| func (d TopologyDomain) IsTopologyDomainNarrower(other TopologyDomain) bool { |
There was a problem hiding this comment.
The method itself is defined on a topology domain. Including that again in the method name seems redundant and too verbose to me. The suggested name reads more like the stdlib functions.
| func (d TopologyDomain) IsTopologyDomainNarrower(other TopologyDomain) bool { | |
| func (d TopologyDomain) IsNarrower(other TopologyDomain) bool { |
| return fmt.Errorf("failed adding %s webhook handler: %v", defaulting.Name, err) | ||
| } | ||
| pcsValidatingWebhook := pcsvalidation.NewHandler(mgr) | ||
| pcsValidatingWebhook := pcsvalidation.NewHandler(mgr, tasConfig) |
There was a problem hiding this comment.
Seems like we're using slog for logging in this file. Let's remove this dependency and use the logger provided by the handler, like the rest of the codebase. We can also do this in a later clean-up if not now.
|
|
||
| // ---------------------------- Helper Functions ---------------------------- | ||
|
|
||
| // getDefaultTASConfig returns a default TAS configuration with TAS disabled. | ||
| func getDefaultTASConfig() groveconfigv1alpha1.TopologyAwareSchedulingConfiguration { | ||
| return groveconfigv1alpha1.TopologyAwareSchedulingConfiguration{ | ||
| Enabled: false, | ||
| } | ||
| } |
There was a problem hiding this comment.
Do we really need this function if we're only setting Enabled: false which is going to be the case anyway for TopologyAwareSchedulingConfiugration? The default is just a zero value struct instance. This was done in register_test.go.
| // ---------------------------- Helper Functions ---------------------------- | |
| // getDefaultTASConfig returns a default TAS configuration with TAS disabled. | |
| func getDefaultTASConfig() groveconfigv1alpha1.TopologyAwareSchedulingConfiguration { | |
| return groveconfigv1alpha1.TopologyAwareSchedulingConfiguration{ | |
| Enabled: false, | |
| } | |
| } |
|
|
||
| // validatePodGangTemplateSpecUpdate validates updates to the template specification ensuring immutability of critical fields. | ||
| func validatePodGangTemplateSpecUpdate(newSpec, oldSpec *grovecorev1alpha1.PodCliqueSetTemplateSpec, fldPath *field.Path) field.ErrorList { | ||
| func (v *pcsValidator) validatePodGangTemplateSpecUpdate(oldPCS *grovecorev1alpha1.PodCliqueSet, fldPath *field.Path) field.ErrorList { |
There was a problem hiding this comment.
We don't really have a PodGangTemplate anymore after #186. Shouldn't this be:
| func (v *pcsValidator) validatePodGangTemplateSpecUpdate(oldPCS *grovecorev1alpha1.PodCliqueSet, fldPath *field.Path) field.ErrorList { | |
| func (v *pcsValidator) validatePodCliqueSetTemplateSpecUpdate(oldPCS *grovecorev1alpha1.PodCliqueSet, fldPath *field.Path) field.ErrorList { |
| // defaultTASConfig returns a default TAS configuration with TAS disabled. | ||
| // This is used for all podcliqueset validation tests since topology constraint | ||
| // validation is tested separately in topologyconstraints_v1_test.go. | ||
| func defaultTASConfig() groveconfigv1alpha1.TopologyAwareSchedulingConfiguration { | ||
| return groveconfigv1alpha1.TopologyAwareSchedulingConfiguration{ | ||
| Enabled: false, | ||
| } | ||
| } |
| topologyDomains := lo.Map(tasConfig.Levels, func(level grovecorev1alpha1.TopologyLevel, _ int) string { | ||
| return string(level.Domain) | ||
| }) |
There was a problem hiding this comment.
This can be done within newTopologyConstraintsValidator, as tasConfig is not used in any other validator.
newTopologyConstraintsValidator can accept a groveconfigv1alpha1.TopologyAwareSchedulingConfiguration instead of []string, and the constructor can map to []string.
pcsValidator would contain tasConfig as is.
| } | ||
| } | ||
|
|
||
| // validate performs validation specific to PodCliqueSet creates. |
There was a problem hiding this comment.
This function is being called for both PCS create and updates. pcsValidator.validate() is being invoked in both handler.ValidateCreate() and handler.ValidateUpdate()
Can the comment be corrected?
Also, according to the doc string for disallowConstraintsForCreateWhenTASIsDisabled(), it only has to be invoked during creates, but at this moment it is being invoked for both create and update; this has to be corrected.
| func buildClusterTopology(name string, topologyLevels []grovecorev1alpha1.TopologyLevel) *grovecorev1alpha1.ClusterTopology { | ||
| sortedTopologyLevels := make([]grovecorev1alpha1.TopologyLevel, len(topologyLevels)) | ||
| copy(sortedTopologyLevels, topologyLevels) | ||
|
|
There was a problem hiding this comment.
Just for my own education, why did you end up removing this?
There was a problem hiding this comment.
I'd like to know as well.
What type of PR is this?
/kind feature
What this PR does / why we need it:
Adds validations for
TopologyConstraintwhen creating/updatingPodCliqueSetresources. Existing validating webhook has been enhanced to additionally validate PCS resources for any violations as defined in #304.Which issue(s) this PR fixes:
Fixes #304
Special notes for your reviewer:
Does this PR introduce a API change?
Additional documentation e.g., enhancement proposals, usage docs, etc.: