What happened?
Describe the bug
A PodCliqueSet (PCS) created before the auto-mnnvl feature was introduced cannot be deleted. The resource remains stuck due to a finalizer. Attempts to manually remove the finalizer are blocked by the validation webhook because it incorrectly identifies the missing auto-mnnvl annotation as an "addition" of an immutable field during the patch request.
To Reproduce
- Have a PodCliqueSet created on an older version of Grove (pre-auto-mnnvl).
- Delete all child resources (Pods, Services, etc.) manually.
- Attempt to delete the PCS: kubectl delete pcs <name>. The command hangs because of the grove.io/podcliqueset.grove.io finalizer.
- Attempt to manually remove the finalizer via kubectl patch or kubectl edit.
- Error: The validation webhook denies the request:
admission webhook "pcs.validating.webhooks.grove.io" denied the request: metadata.annotations.grove.io/auto-mnnvl: Forbidden: annotation grove.io/auto-mnnvl cannot be added after PodCliqueSet creation
- Attempt to scale the operator to 0 to bypass the logic.
- Error: The defaulting webhook now blocks the patch because the service endpoint is down:
Internal error occurred: failed calling webhook "pcs.defaulting.webhooks.grove.io": ... no endpoints available for service "grove-operator"
Expected behavior
The validation webhook should allow metadata/finalizer updates for existing resources, especially during deletion, even if the auto-mnnvl annotation is missing or being defaulted. It should not block the removal of finalizers on legacy resources.
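To make the expected behavior concrete, here is a minimal Go sketch of the kind of check I have in mind. It is not Grove's actual webhook code: the objectMeta type, the validateAutoMNNVLUpdate function, and the "placeholder" annotation value are all hypothetical, and only the annotation key and the error text are taken from the webhook message above. The idea is simply to skip the immutability check once the object is being deleted, so that a finalizer can still be removed from a legacy resource.

```go
package main

import "fmt"

// Annotation key as it appears in the webhook error message in this report.
const autoMNNVLAnnotation = "grove.io/auto-mnnvl"

// objectMeta is a hypothetical, trimmed-down stand-in for the metadata the
// webhook compares. It is not a Grove or Kubernetes type.
type objectMeta struct {
	Annotations map[string]string
	Deleting    bool // true when metadata.deletionTimestamp is set
}

// validateAutoMNNVLUpdate is a hypothetical sketch of the update check.
// It returns nil when the update should be admitted.
func validateAutoMNNVLUpdate(oldMeta, newMeta objectMeta) error {
	// An object that is already terminating only needs its finalizers
	// removed, so the immutability rule is not enforced for it.
	if newMeta.Deleting {
		return nil
	}

	// Reading from a nil map is safe in Go and reports "not present".
	oldVal, hadOld := oldMeta.Annotations[autoMNNVLAnnotation]
	newVal, hasNew := newMeta.Annotations[autoMNNVLAnnotation]

	if !hadOld && hasNew {
		return fmt.Errorf("annotation %s cannot be added after PodCliqueSet creation", autoMNNVLAnnotation)
	}
	// Assumption: changing an existing value is also rejected, since the
	// annotation is described as immutable.
	if hadOld && hasNew && oldVal != newVal {
		return fmt.Errorf("annotation %s cannot be changed after PodCliqueSet creation", autoMNNVLAnnotation)
	}
	return nil
}

func main() {
	// Legacy PCS: created before auto-mnnvl, so it has no annotation.
	legacy := objectMeta{}

	// The same object after something added the annotation on update.
	// "placeholder" is an illustrative value only.
	annotated := objectMeta{
		Annotations: map[string]string{autoMNNVLAnnotation: "placeholder"},
	}

	// The same update, but for an object that is being deleted.
	terminating := annotated
	terminating.Deleting = true

	fmt.Println("live object:       ", validateAutoMNNVLUpdate(legacy, annotated))
	fmt.Println("terminating object:", validateAutoMNNVLUpdate(legacy, terminating))
}
```

With a check shaped like this, the finalizer patch on a terminating legacy PCS would be admitted, while the annotation would stay immutable for live objects. Whether legacy objects that are not being deleted should also be exempt is a separate design question.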
Actual behavior
The resource is "deadlocked." The validation webhook blocks the manual fix while the operator is running, and the defaulting webhook blocks the manual fix when the operator is stopped.
Workaround
The only way to recover was to manually delete the ValidatingWebhookConfiguration for Grove, remove the finalizer from the PCS, and then restore/reinstall the webhook.
Environment:
- Grove Version: v0.1.0-alpha.6
- Installation Method: Helm