
Introduce the Authorizer Webhook. #214

Merged
renormalize merged 16 commits into ai-dynamo:main from unmarshall:authorizer
Oct 9, 2025


@renormalize renormalize commented Oct 7, 2025

Contributor

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR introduces the authorizer webhook, which protects resources created/managed by the grove-operator.

This webhook ensures that users modify only the API that Grove exposes to its consumers, which is the PodCliqueSet. PodClique and PodCliqueScalingGroup resources are not intended for direct modification by users, so the authorizer webhook now protects them.

  • It segregates users into 3 categories.
    • The reconciler serviceaccount user.
      • This user is the reconciler of the grove-operator and has full access: it can perform all operations on resources managed by grove-operator.
    • Exempt serviceaccount users.
      • Certain serviceaccounts need to be exempt from the protection that the authorizer webhook provides.
      • The generic-garbage-collector serviceaccount is exempted because it is used for garbage collection of orphaned resources.
    • All other users.
      • All other users cannot perform any actions on resources created/managed by grove-operator. They can only modify the PodCliqueSet resource, subject to the RBAC they have.
  • The webhook also introduces the grove.io/disable-managed-resource-protection annotation. When added to a PodCliqueSet resource, it causes the authorizer webhook to stop taking action on that particular PodCliqueSet. This is useful when a PodCliqueSet ends up in an undesired state and cluster administrators want to take explicit action to fix the PodCliqueSet and its child resources.
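
The decision described in the list above can be sketched roughly as follows. This is an illustration under assumptions, not the handler code from this PR: the `authorizer` and `request` types, the `allow` method, and the order of the checks are all hypothetical, and the sketch assumes that the mere presence of the annotation (regardless of its value) disables protection.

```go
package main

// All identifiers below are hypothetical and chosen for illustration only.

// disableProtectionAnnotation is the annotation named in this PR description.
const disableProtectionAnnotation = "grove.io/disable-managed-resource-protection"

// request carries the two inputs the decision needs: who is asking, and the
// annotations on the PodCliqueSet that owns the targeted resource.
type request struct {
	Username       string
	PCSAnnotations map[string]string
}

// authorizer holds the reconciler username and the set of exempt usernames.
type authorizer struct {
	reconcilerUser string
	exemptUsers    map[string]struct{}
}

// allow reports whether the request is admitted, with a short reason.
func (a authorizer) allow(r request) (bool, string) {
	// Assumption: presence of the annotation alone switches protection off.
	if _, ok := r.PCSAnnotations[disableProtectionAnnotation]; ok {
		return true, "protection disabled on the PodCliqueSet"
	}
	// Category 1: the reconciler serviceaccount may do everything.
	if r.Username == a.reconcilerUser {
		return true, "reconciler serviceaccount"
	}
	// Category 2: explicitly exempt serviceaccounts.
	if _, ok := a.exemptUsers[r.Username]; ok {
		return true, "exempt serviceaccount"
	}
	// Category 3: everyone else is denied on managed resources.
	return false, "managed resources can only be changed through the PodCliqueSet"
}
```

In this sketch the exempt set would contain the garbage collector's username, which is assumed here to be `system:serviceaccount:kube-system:generic-garbage-collector`.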

Which issue(s) this PR fixes:

Fixes #202

Special notes for your reviewer:

Does this PR introduce an API change?

This PR introduces the authorizer webhook, which protects resources created/managed by grove-operator and is enabled by default.

Additional documentation e.g., enhancement proposals, usage docs, etc.:


unmarshall and others added 8 commits October 7, 2025 13:10
* WIP changes for authorizer webhook.

Signed-off-by: madhav bhargava <madhav.bhargava@sap.com>
Signed-off-by: Saketh Kalaga <saketh.kalaga@sap.com>
* Minor refactoring of method signatures in the handler.
* Added labels for authorizer webhook in values.yaml and also in
  helpers.tpl

Signed-off-by: Madhav Bhargava <madhav.bhargava@sap.com>
Signed-off-by: Saketh Kalaga <saketh.kalaga@sap.com>
* `UPDATE` operations were using `Object` instead of `OldObject`.
  Since the labels on the resources are used to establish that a resource
  is managed by Grove, if an update operation modifies these labels, then
  Grove would be unable to ascertain these objects are managed by Grove,
  and unwanted operations might be admitted. To prevent this, `OldObject`
  is used instead.

* `ValidatingWebhookConfiguration` for the authorizer webhook now has a
  separate rule for pods, which registers only `UPDATE` operations.
  `DELETE` operations must always be admitted, as pods might need to be
  deleted at some point in their lifecycle for various reasons.

Signed-off-by: Saketh Kalaga <saketh.kalaga@sap.com>
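
Why `OldObject` matters for `UPDATE` can be shown with a small sketch. The types below are simplified stand-ins for the Kubernetes admission request, and the label key `example.io/managed-by` with value `grove-operator` is a hypothetical placeholder, not the label Grove actually uses. In Kubernetes admission requests, `OldObject` is empty for `CREATE` and `Object` is empty for `DELETE`, which is why the sketch also reads `OldObject` on delete.

```go
package main

// Simplified, hypothetical stand-ins for the admission request types.
type operation string

const (
	opCreate operation = "CREATE"
	opUpdate operation = "UPDATE"
	opDelete operation = "DELETE"
)

type object struct {
	Labels map[string]string
}

type admissionRequest struct {
	Operation operation
	Object    *object // incoming state (nil on DELETE)
	OldObject *object // stored state (nil on CREATE)
}

// Hypothetical label used to mark a resource as managed.
const managedByLabel = "example.io/managed-by"

// objectToInspect picks the state whose labels can be trusted. On UPDATE the
// incoming object may have had its labels stripped by the caller, so the
// stored object is used instead.
func objectToInspect(req admissionRequest) *object {
	switch req.Operation {
	case opUpdate, opDelete:
		return req.OldObject
	default:
		return req.Object
	}
}

// isManaged reports whether the trusted state carries the managed-by label.
func isManaged(req admissionRequest) bool {
	obj := objectToInspect(req)
	return obj != nil && obj.Labels[managedByLabel] == "grove-operator"
}
```

With this choice, an update that removes the label is still recognised as targeting a managed resource, because the decision is based on the stored labels rather than the submitted ones.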
* There is no reason to decode `Scale` kinds, since these are only
  created for the `podcliques`, and `podcliquescalinggroups` resources.
  Finding the parent resource does not add any information in deciding
  whether the request is to be admitted or rejected.
  Therefore, if the Kind is `Scale`, the `User` is checked, and the
  request is admitted or denied.

* The explicit check of whether a resource is managed by Grove is removed.
  This is unnecessary since the requests that the webhook receives are
  already filtered by the `objectSelector` in `ValidatingWebhookConfiguration`.

Signed-off-by: Saketh Kalaga <saketh.kalaga@sap.com>
Signed-off-by: Madhav Bhargava <madhav.bhargava@sap.com>
* `/scale` events are no longer handled by the webhook. The actor
  needs to have RBAC to scale the `podcliques` and `podcliquescalinggroups`
  subresources.

* Introduce a `handleCreate()` in the authorizer webhook that handles
  create operations of resources managed by the Grove operator. Only the
  operator service account has permission to create these resources.

Signed-off-by: Saketh Kalaga <saketh.kalaga@sap.com>
* Authorizer webhook is enabled by default in `values.yaml`.

* Remove `isEnabled` for validation and defaulting webhooks from `values.yaml`

Signed-off-by: Saketh Kalaga <saketh.kalaga@sap.com>
@renormalize renormalize marked this pull request as ready for review October 8, 2025 15:47
@renormalize renormalize self-assigned this Oct 8, 2025
Comment thread operator/api/config/validation/validation.go Outdated
Comment thread operator/api/config/validation/validation.go Outdated
Comment thread operator/api/config/validation/validation_test.go Outdated
Comment thread operator/charts/templates/_helpers.tpl
Comment thread operator/internal/webhook/admission/pcs/authorization/decoder.go
Comment thread operator/internal/webhook/admission/pcs/authorization/handler.go Outdated
Comment thread operator/internal/webhook/admission/pcs/authorization/handler.go Outdated
Comment thread operator/internal/webhook/admission/pcs/authorization/handler.go Outdated
Comment thread operator/internal/webhook/admission/pcs/authorization/handler_test.go Outdated
@unmarshall unmarshall added the enhancement New feature or request label Oct 8, 2025
…nology, remove an unnecessary file.

Signed-off-by: Saketh Kalaga <saketh.kalaga@sap.com>
* `AuthorizerConfig.ReconcilerServiceAccountUserName` is removed from the
  `OperatorConfiguration` since it is redundant. This can be inferred
  through `DownwardAPI`, and reduces configuration required from the user.

* Remove validations for `AuthorizerConfig.ReconcilerServiceAccountUserName`.

* Include the `GROVE_OPERATOR_SERVICE_ACCOUNT_NAME` environment variable in
  the grove operator deployment.

* Construct the serviceaccount username from the
  `GROVE_OPERATOR_SERVICE_ACCOUNT_NAME` environment variable, and the
  namespace file with filepath defined in `constants.OperatorNamespaceFile`.

Signed-off-by: Saketh Kalaga <saketh.kalaga@sap.com>
@renormalize renormalize merged commit 520cc00 into ai-dynamo:main Oct 9, 2025
4 checks passed

Development

Successfully merging this pull request may close these issues.

Authorizer Webhook

2 participants