Skip to content

Support external certificate management for webhooks#344

Merged
gflarity merged 36 commits into
ai-dynamo:mainfrom
gflarity:cert_management_cleanup
Feb 3, 2026
Merged

Support external certificate management for webhooks#344
gflarity merged 36 commits into
ai-dynamo:mainfrom
gflarity:cert_management_cleanup

Conversation

@gflarity

@gflarity gflarity commented Jan 17, 2026

Copy link
Copy Markdown
Contributor

What type of PR is this?

/kind feature

What this PR does / why we need it:

Note this PR was originally #233. I have taken it over it order to implement PR feedback and get things moving faster.

This PR adds support for external certificate management for Grove's webhook server. Users can now choose between:

  1. Auto-provisioned certificates (default): Grove's built-in cert-controller automatically generates and manages self-signed certificates - best for development/testing
  2. External certificates: Users can provide their own certificates (e.g., via cert-manager or manually) - best for production environments

Key changes:

  • Added autoProvision configuration option (config.server.webhooks.autoProvision) to toggle between auto-provisioned and external certificates
  • Added secretName configuration to specify the Kubernetes Secret containing webhook certificates
  • Added webhooks.caBundle Helm value to inject a custom CA bundle into webhook configurations
  • Added comprehensive e2e tests covering both certificate management modes and transitions between them

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

  • When autoProvision: true (default), behavior is unchanged - Grove generates and manages certificates automatically
  • When autoProvision: false, the operator expects certificates to exist in the specified Secret before starting
  • The e2e tests verify transitioning from cert-manager managed certificates to autoprovision mode and back
  • Webhook configurations now support an optional caBundle value for external CA certificates

Does this PR introduce a API change?

Added support for external certificate management. Users can now disable auto-provisioned certificates by setting `config.server.webhooks.autoProvision: false` and provide their own certificates (e.g., via cert-manager). A new `webhooks.caBundle` Helm value allows injecting custom CA bundles into webhook configurations.

Additional documentation e.g., enhancement proposals, usage docs, etc.:

Added docs/user-guide/certificate-management.md

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for external certificate management for Grove's webhook server, allowing users to choose between automatic certificate provisioning (default) and external certificate management (e.g., via cert-manager). This provides flexibility for both development/testing environments and production deployments with enterprise PKI requirements.

Changes:

  • Added autoProvision and secretName configuration options to toggle between auto-provisioned and external certificates
  • Added caBundle Helm value to support custom CA certificates in webhook configurations
  • Implemented comprehensive e2e tests for certificate management mode transitions

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
operator/internal/controller/cert/cert.go Modified ManageWebhookCerts to support both auto-provisioned and external certificate modes
operator/e2e/utils/k8s_client.go Exported applyYAMLData function as ApplyYAMLData for use in new tests
operator/e2e/tests/cert_management_test.go Added comprehensive e2e test suite for certificate management round-trip scenarios
operator/e2e/setup/helm.go Added UpgradeHelmChart and UninstallHelmChart functions with timeout support
operator/e2e/setup/grove.go Added UpgradeGrove function for upgrading Grove configuration
operator/e2e/dependencies.yaml Added cert-manager as a test dependency
operator/e2e/dependencies.go Added cert-manager to HelmCharts struct
operator/cmd/main.go Updated main to pass new certificate configuration parameters
operator/charts/values.yaml Added certificate management configuration options and documentation
operator/charts/templates/webhook-server-cert-secret.yaml Made secret creation conditional based on autoProvision setting
operator/charts/templates/pcs-validating-webhook-config.yaml Added support for custom annotations and caBundle
operator/charts/templates/pcs-defaulting-webhook-config.yaml Added support for custom annotations and caBundle
operator/charts/templates/deployment.yaml Made certificate directory and secret name configurable
operator/charts/templates/authorizer-webhook-config.yaml Added support for custom annotations and caBundle
operator/charts/templates/_helpers.tpl Added certificate configuration to config template
operator/api/config/v1alpha1/zz_generated.deepcopy.go Added deep copy support for new AutoProvision field
operator/api/config/v1alpha1/types.go Added SecretName and AutoProvision fields to WebhookServer struct
operator/api/config/v1alpha1/defaults.go Added defaults for new certificate configuration fields
docs/user-guide/certificate-management.md Added comprehensive certificate management documentation
docs/installation.md Added reference to certificate management guide
docs/api-reference/operator-api.md Updated API documentation with new certificate fields
.github/workflows/e2e-test.yaml Added cert_management test to CI workflow

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread operator/e2e/tests/cert_management_test.go Outdated
Comment thread operator/api/config/v1alpha1/types.go Outdated

@Ronkahn21 Ronkahn21 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will complete the review later today

Comment thread operator/charts/values.yaml
Comment thread operator/e2e/setup/grove.go Outdated
Comment thread operator/e2e/setup/grove.go Outdated
Comment thread operator/e2e/setup/helm.go
@gflarity gflarity force-pushed the cert_management_cleanup branch from f5681f1 to df66b3b Compare January 22, 2026 20:36
@gflarity gflarity requested a review from Ronkahn21 January 22, 2026 20:37
Comment thread operator/e2e/setup/grove.go Outdated
Comment thread operator/e2e/setup/grove.go Outdated
Comment thread operator/e2e/utils/k8s_client.go Outdated
Comment thread operator/internal/controller/cert/cert.go
renormalize
renormalize previously approved these changes Jan 28, 2026

@renormalize renormalize left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! cert-manager and manual cert generation docs are both perfect.

Comment thread docs/user-guide/certificate-management.md Outdated
Comment thread operator/api/config/v1alpha1/defaults.go Outdated
Comment thread operator/e2e/utils/k8s_client.go Outdated
@gflarity

gflarity commented Jan 28, 2026

Copy link
Copy Markdown
Contributor Author

I've implemented the feedback on reusing clients rather than constructing time for this specific test suite. I don't want to expand scope to refactor this every where right now. But I created an issue to track doing this eventually
#378

@gflarity gflarity force-pushed the cert_management_cleanup branch 2 times, most recently from 2cb43cf to 0de5671 Compare January 28, 2026 19:04
renormalize
renormalize previously approved these changes Jan 29, 2026

@Ronkahn21 Ronkahn21 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last comment, but LGTM

Comment thread operator/e2e/utils/k8s_client.go Outdated
Ronkahn21
Ronkahn21 previously approved these changes Jan 29, 2026
Signed-off-by: kangclzjc <kangz@nvidia.com>
Signed-off-by: kangclzjc <kangz@nvidia.com>
@copy-pr-bot

copy-pr-bot Bot commented Feb 3, 2026

Copy link
Copy Markdown

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@gflarity gflarity merged commit 185ab67 into ai-dynamo:main Feb 3, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants