Skip to content

Use Apply instead of Update to modify resources in tests#5077

Merged
jetstack-bot merged 5 commits intocert-manager:masterfrom
irbekrm:tests_apply
Apr 29, 2022
Merged

Use Apply instead of Update to modify resources in tests#5077
jetstack-bot merged 5 commits intocert-manager:masterfrom
irbekrm:tests_apply

Conversation

@irbekrm
Copy link
Copy Markdown
Contributor

@irbekrm irbekrm commented Apr 28, 2022

This PR attempts to fix resource update conflicts in e2e and integration tests (such as this) by making updates to test resources via server-side apply instead of Update/UpdateStatus.

NONE

/kind cleanup

@jetstack-bot jetstack-bot added release-note-none Denotes a PR that doesn't merit a release note. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. area/testing Issues relating to testing approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 28, 2022
irbekrm added 4 commits April 28, 2022 14:41
Signed-off-by: irbekrm <irbekrm@gmail.com>
Signed-off-by: irbekrm <irbekrm@gmail.com>
Signed-off-by: irbekrm <irbekrm@gmail.com>
Copy link
Copy Markdown
Member

@wallrj wallrj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good @irbekrm

I didn't understand the need to set ManagedFields = nil and there are a couple of other things which I think should be explained in comments, but happy to lgtm when you've answered or addressed those points.

This should fix many of the E2E test flakes, right?

cr.Spec.Username = "abc"
cr.Spec.UID = "123"
cr, err = f.CertManagerClientSet.CertmanagerV1().CertificateRequests(f.Namespace.Name).Update(context.TODO(), cr, metav1.UpdateOptions{})
_, err = internalcertificaterequests.Apply(context.Background(), f.CertManagerClientSet, fieldManager, cr)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference between context.TODO and context.Background ? Isn't there a cancellable context we should use in these tests? If you agree, then maybe add a TODO item to fix that.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you think of some cases where we would need a cancellable context in tests? I've been thinking about that too, but then it seems like in all cases where there would be a need to time out the connection, the test would time out and clean up resources anyway?
As per my understanding there is no difference between context.TODO and context.Background functionally, but the first signals that it is not yet decided what context should be used https://pkg.go.dev/context#TODO

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. It's more complicated than I thought. I had a notion that Gingko provided a cancellable context for each test which it cancelled if e.g. ctrl-c is pressed or if a test time limit was reached. But I think I'm wrong about that.

There's an interesting ongoing discussion in onsi/ginkgo#969 about how Kubernetes E2E test suites can provide per-test timeouts, etc

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for linking that, I've been reading that and the linked discussions.
At the moment I still don't have a clear picture of what resources (kube resources, goroutines, open connections to apiserver etc) we could potentially have still hanging around in what cases (i.e a spec failure) that we could clean up sooner if there was a cancellable context. Will keep looking at this.
Specifically in case of those apply calls, there will be the default 34s timeout I guess.

I also think we should now upgrade to Ginkgo to v2 so that we can benefit from any improvements in that version #4940

Comment on lines +57 to +63
// This should be a sufficiently unique and descriptive name
certName := fmt.Sprintf("test-additional-output-formats-%d", time.Now().Unix())

crt := &cmapi.Certificate{
ObjectMeta: metav1.ObjectMeta{
GenerateName: "test-additional-output-formats-",
Namespace: f.Namespace.Name,
Name: certName,
Namespace: f.Namespace.Name,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why have you changed this? I thought that the Apply function below would return the crt with the Name field generated by the API server.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand, GenerateName cannot be used with server side apply. I actually don't find any documentation around it, I assume it's to do with where the name gets generated and that Patch really specifies just a patch to be applied to either an existing object or one-to-be created and that GenerateName can only be used with newly created resources, but I am not sure. Might look into it and ask a question upstream.

You can see the error that kubectl would throw if metadata.GenerateName is used with server-side apply here and can observe it if you kubectl apply --server-side=true -f <some-resource-config-with-generate-name-directive>. I observed a 'name cannot be empty' error when trying to use GenerateName when applying this resource with the versioned client.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I did not realise that! I am a "Doubting Thomas" so had to go and check for myself.

$ kubectl apply -f internal/cmd/manifests/cm1.yaml 
error: from foo-bar-: cannot use generate name with apply

$ go run ./
foo
2022/04/29 12:13:45 resource name may not be empty
exit status 1

``

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I need to make a mental note to stop using GenerateName in future.

Comment on lines -203 to +207
crt, err = cmCl.CertmanagerV1().Certificates(namespace).UpdateStatus(ctx, crt, metav1.UpdateOptions{})
err = internalcertificates.ApplyStatus(ctx, cmCl, fieldManager, crt)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it doesn't matter, but the original code used to update crt with the latest content from the API server.
I see that below here we poll the API server for the latest content anyway.

But my question is....why does Apply return the API object while ApplyStatus does not? That seems inconsistent. And patching the status will update the Metadata.ResourceVersion right? Or is that totally irrelevant when using server-side-apply?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And patching the status will update the Metadata.ResourceVersion right? Or is that totally irrelevant when using server-side-apply?

Yes as I understand the resource version is irrelevant. I think that the resource version being different is what makes our Update operations fail on conflicts (as the version is different in etcd because it has been bumped due to other controller's changes etc). With SSA resource version can be omitted and I think that happens when we deepcopy object meta as deepcopy function omits the resource version field.

I think that 'apply status' functions don't return a resource simply because it's never needed after a status update..
In tests where I was making changes I mostly needed the returned object only to have the correct metadata.generation field which is used by WaitForCertificateReadyAndDoneIssuing function.

Copy link
Copy Markdown
Contributor Author

@irbekrm irbekrm Apr 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With SSA resource version can be omitted and I think that happens when we deepcopy object meta as deepcopy function omits the resource version field.

(In general I think it would be good if we had functionality that can extract all fields owned by a field manager and only include those in a spec update like it's done in client-go applyconfigurations so that we don't accidentally co-own other manager's field when applying a spec update. Perhaps the code generators that we use should be able to generate such functionality (as well as Apply methods) for custom resources.
But I think this is not an issue for now as we don't really have different managers managing object specs.)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for those links, I wasn't aware of that work!
So, we need to keep an eye on this

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, I see that there already is https://github.com/kubernetes/code-generator/tree/master/cmd/applyconfiguration-gen to generate applyconfigurations. I will see if this can be used to generate applyconfigurations for our custom types.
I guess then it would be just the ability to generate Apply method for our typed clients that would be missing.

@irbekrm
Copy link
Copy Markdown
Contributor Author

irbekrm commented Apr 28, 2022

/test pull-cert-manager-make-e2e-v1-23
/test pull-cert-manager-e2e-v1-19

Signed-off-by: irbekrm <irbekrm@gmail.com>
Copy link
Copy Markdown
Member

@wallrj wallrj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all those explanations and links @irbekrm

This looks good to me.

/lgtm

t.Helper()
testCond := cmapi.CertificateCondition{
Type: cmapi.CertificateConditionType("Test"),
Reason: "TestTwo",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you changed the Reason "TestTwo" > "Test". Does that matter?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I think I called it TestTwo when I originally wrote that test and had some issues whilst running it and for some reason already had a cert named Test in cluster that I didn't want to delete :)

@jetstack-bot jetstack-bot added the lgtm Indicates that a PR is ready to be merged. label Apr 29, 2022
@jetstack-bot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: irbekrm, wallrj

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/testing Issues relating to testing dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm Indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants