Skip to content

WIP: Graduate ServerSideApply feature gates to Beta#7908

Open
erikgb wants to merge 4 commits intocert-manager:masterfrom
erikgb:graduate-ssa
Open

WIP: Graduate ServerSideApply feature gates to Beta#7908
erikgb wants to merge 4 commits intocert-manager:masterfrom
erikgb:graduate-ssa

Conversation

@erikgb
Copy link
Copy Markdown
Member

@erikgb erikgb commented Aug 10, 2025

Pull Request Motivation

Kind

/kind feature

Release Note

Enable the use of ServerSideApply in all API calls by default.

CyberArk tracker: VC-45578

@cert-manager-prow cert-manager-prow bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. area/acme Indicates a PR directly modifies the ACME Issuer code dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. area/testing Issues relating to testing size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Aug 10, 2025
@erikgb erikgb force-pushed the graduate-ssa branch 2 times, most recently from 004b40a to acd1d33 Compare August 14, 2025 21:42
@cert-manager-prow cert-manager-prow bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Aug 14, 2025
@cert-manager-prow cert-manager-prow bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 29, 2025
@cert-manager-prow cert-manager-prow bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 1, 2025
@cert-manager-prow cert-manager-prow bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 15, 2025
@cert-manager-prow cert-manager-prow bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 15, 2025
@cert-manager-prow cert-manager-prow bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Sep 25, 2025
@maelvls maelvls added the cybr Used by CyberArk-employed maintainers to report to line management what's being worked on. label Sep 26, 2025
@maelvls
Copy link
Copy Markdown
Member

maelvls commented Sep 26, 2025

Priority assessment:

Enabling would be the starting point. If we don't do this, we can't work on apply configurations, and can't start using the modern way of doing it, the way controller-runtime does.

@maelvls
Copy link
Copy Markdown
Member

maelvls commented Oct 2, 2025

Let's enable SSA! And disable SSA for our tests and figure it out later.

@cert-manager-prow cert-manager-prow bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 6, 2025
@cert-manager-prow cert-manager-prow bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 8, 2025
@erikgb
Copy link
Copy Markdown
Member Author

erikgb commented Nov 8, 2025

Enabling would be the starting point. If we don't do this, we can't work on apply configurations, and can't start using the modern way of doing it, the way controller-runtime does.

IMO we cannot enable SSA with broken tests, nor run tests with the feature gate disabled. We need to focus on fixing the broken test setup first.

@SgtCoDFish
Copy link
Copy Markdown
Member

SgtCoDFish commented Nov 14, 2025

I've been doing some investigating on how the tests break, and I'm going to put what I've learned here so it's accessible to others. You all might know this already 😁 I've definitely not gone through every failure...

Unit tests

I've been testing a local checkout of this PR. I checked some unit tests with make WHAT=./pkg/controller/certificaterequests/ca/... test I see the expected kinds of errors[1]. This is because we're comparing an initialised struct with the SSA patch, and because of differences in marshalling the time values.

With a patch[2] leveraging cmp I can get rid of the time marshalling differences for these tests, which makes sense to me.

For the other differences, there's no easy way to fix it; the patch is fundamentally a different thing to the object we're comparing it against. In [1] the object has a Request, but the patch (correctly) doesn't. There's no saving the fact that a lot of busywork needs to be done there, if we want the tests to look anything like how they look now.

Actual Behaviour Differences

I can see a fundamentally different behaviour in cert-manager thanks to the integration tests when run against this branch.

test/integration/certificates/issuing_controller_test.go hangs because it tests for the certificate's Issuing condition being removed. I set up a chain on cert-manager without the SSA feature gate and the Issuing condition was removed from the certificate. I set up a chain on cert-manager built from this branch (with SSA enabled by default) and the Issuing condition was not removed.

This doesn't seem tremendously important - it's probably fine for it not to be removed with SSA, I think? - but it's a fundamental difference and it's good the tests have caught it!

EDIT: seems this has been a known thing for a while:

// Remove Issuing status condition
// TODO @joshvanl: Once we move to only server-side apply API calls, this
// should be changed to setting the Issuing condition to False.
apiutil.RemoveCertificateCondition(crt, cmapi.CertificateConditionIssuing)

Extras

[1]:

=== FAIL: pkg/controller/certificaterequests/ca TestSign/a_CertificateRequest_that_transiently_fails_a_secret_lookup_should_backoff_error_to_retry (2.11s)
E1114 16:13:46.721585   85334 ca.go:104] "Failed to get certificate key pair from secret default-unit-test-ns/root-ca-secret" err="this is a network error" logger="cert-manager.sign"
E1114 16:13:46.721810   85334 sync.go:143] "error issuing certificate request" err="this is a network error" logger="cert-manager" related_resource_name="test-issuer" related_resource_namespace="default-unit-test-ns" related_resource_kind="Issuer" related_resource_version="v1"
    context_builder.go:222: [unexpected difference between SSA patch and expected object (-want +got):
          &v1.CertificateRequest{
                ... // 1 ignored field
                ObjectMeta: {Name: "test-cr", Namespace: "default-unit-test-ns"},
                Spec: v1.CertificateRequestSpec{
                        Duration: &{Duration: s"1440h0m0s"},
                        IssuerRef: v1.IssuerReference{
        -                       Name:  "test-issuer",
        +                       Name:  "",
                                Kind:  "Issuer",
                                Group: "cert-manager.io",
                        },
        -               Request: []uint8(
        -                       """
        -                       -----BEGIN CERTIFICATE REQUEST-----
        -                       MIHIMHECAQAwDzENMAsGA1UEAxMEdGVzdDBZMBMGByqGSM49AgEGCCqGSM49AwEH
        -                       A0IABONN/7gMs8e4DrnHr/zJlMuEHu8ivmCdawm9IUL0ujh+K39Zn/fEIJGjNYWi
        -                       bP27s25j5vVPs38vsbOnmUa0Y6KgADAKBggqhkjOPQQDAgNHADBEAiBLA6m1X8BB
        -                       w0X/cwcsX3M2CnlkBIRSB4lH5VAREiaafwIgHW0Xh5cIt9lo8zB1Uua1n/WNfv9A
        -                       emcpHWkRW7E3M1g=
        -                       -----END CERTIFICATE REQUEST-----
        -                       """
        -               ),
        +               Request: nil,
                        IsCA:    true,
                        Usages:  nil,
                        ... // 4 identical fields
                },
                Status: v1.CertificateRequestStatus{
                        Conditions: []v1.CertificateRequestCondition{
                                {
                                        Type:               "Approved",
                                        Status:             "True",
        -                               LastTransitionTime: s"2025-11-14 16:13:32.92129 +0000 GMT m=+0.016580460",
        +                               LastTransitionTime: s"2025-11-14 16:13:32 +0000 GMT",
                                        Reason:             "cert-manager.io",
                                        Message:            "Certificate request has been approved by cert-manager.io",
                                },
                                {
                                        Type:               "Ready",
                                        Status:             "False",
        -                               LastTransitionTime: s"2025-11-14 16:13:32.92129 +0000 GMT m=+0.016580460",
        +                               LastTransitionTime: s"2025-11-14 16:13:32 +0000 GMT",
                                        Reason:             "Pending",
                                        Message:            "Failed to get certificate key pair from secret default-unit-test"...,
                                },
                        },
                        Certificate: nil,
                        CA:          nil,
                        FailureTime: nil,
                },
          }
        , missing action: update status "cert-manager.io/v1, Resource=certificaterequests" in namespace default-unit-test-ns, unexpected action: patch status "cert-manager.io/v1, Resource=certificaterequests" in namespace default-unit-test-ns]

[2]:

diff --git i/pkg/controller/test/actions.go w/pkg/controller/test/actions.go
index 6f1925b9e..d12f1c828 100644
--- i/pkg/controller/test/actions.go
+++ w/pkg/controller/test/actions.go
@@ -19,8 +19,10 @@ package test
 import (
 	"fmt"
 	"reflect"
+	"time"
 
 	"github.com/google/go-cmp/cmp"
+	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
 	"k8s.io/apimachinery/pkg/types"
 	"k8s.io/apimachinery/pkg/util/json"
 	coretesting "k8s.io/client-go/testing"
@@ -117,6 +119,13 @@ func (a *action) Matches(act coretesting.Action) error {
 				}
 				return p.String() == "ObjectMeta.ManagedFields"
 			}, cmp.Ignore()),
+			cmp.Transformer("metav1.Time", func(in *metav1.Time) string {
+				if in == nil {
+					return "<nil>"
+				}
+
+				return in.Time.Format(time.RFC3339)
+			}),
 		)
 		if diff != "" {
 			return fmt.Errorf("unexpected difference between SSA patch and expected object (-want +got):\n%s", diff)

@SgtCoDFish
Copy link
Copy Markdown
Member

I've been digging into this a bit more, and it seems part of the change for the certificate request unit tests I've been digging into is here, called eventually via Ready.

I tried with this patch against master:

diff --git i/pkg/api/util/conditions.go w/pkg/api/util/conditions.go
index df247f29c..97dd16688 100644
--- i/pkg/api/util/conditions.go
+++ w/pkg/api/util/conditions.go
@@ -17,6 +17,8 @@ limitations under the License.
 package util
 
 import (
+	"time"
+
 	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
 	"k8s.io/klog/v2"
 	"k8s.io/utils/clock"
@@ -251,7 +253,7 @@ func SetCertificateRequestCondition(cr *cmapi.CertificateRequest, conditionType
 		Message: message,
 	}
 
-	nowTime := metav1.NewTime(Clock.Now())
+	nowTime := metav1.NewTime(Clock.Now().Truncate(time.Second))
 	newCondition.LastTransitionTime = &nowTime
 
 	// Search through existing conditions

And unit tests fail as expected, with the output much more closely matching what we'd observe in an actual cluster. This process of updating

I think the time comparisons we're making in our unit tests aren't important, since we're using fake clocks anyway. When SSA is on by default, the if statements here are going down different branches, which causes the unit test failures.

@cert-manager-prow cert-manager-prow bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 25, 2025
@cert-manager-prow
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from sgtcodfish. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cert-manager-prow cert-manager-prow bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 25, 2025
@cert-manager-prow cert-manager-prow bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 22, 2026
@cert-manager-prow cert-manager-prow bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 22, 2026
@cert-manager-prow cert-manager-prow bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 27, 2026
@cert-manager-prow cert-manager-prow bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 28, 2026
@cert-manager-prow cert-manager-prow bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 7, 2026
erikgb added 4 commits March 18, 2026 10:32
Signed-off-by: Erik Godding Boye <egboye@gmail.com>
Signed-off-by: Erik Godding Boye <egboye@gmail.com>
Signed-off-by: Erik Godding Boye <egboye@gmail.com>
Signed-off-by: Erik Godding Boye <egboye@gmail.com>
@cert-manager-prow cert-manager-prow bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 18, 2026
@cert-manager-prow
Copy link
Copy Markdown
Contributor

@erikgb: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-cert-manager-master-make-test e1edfdc link true /test pull-cert-manager-master-make-test

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/acme Indicates a PR directly modifies the ACME Issuer code area/testing Issues relating to testing cybr Used by CyberArk-employed maintainers to report to line management what's being worked on. dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants