
🐛 Fix fake client's SSA status patch resource version check #3443

Merged
k8s-ci-robot merged 7 commits into kubernetes-sigs:main from josvazg:fix/ssa-status-patch-resourceversion
Feb 2, 2026

Conversation

@josvazg
Contributor

@josvazg josvazg commented Jan 28, 2026

This change includes a test showing a sample SSA on a Pod status, which failed before the fix with the "object was modified" error.

It also includes the fix in the fake client, which copies the resource version from the old accessor, as was already done for non-status patch and apply operations.

Advice on simplifying the reproducer is welcome.

Fixes #3442

@k8s-ci-robot
Contributor

Welcome @josvazg!

It looks like this is your first PR to kubernetes-sigs/controller-runtime 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/controller-runtime has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jan 28, 2026
@k8s-ci-robot
Contributor

Hi @josvazg. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jan 28, 2026
Expect(updatedPod.ResourceVersion).NotTo(Equal("1"))
originalRV := updatedPod.ResourceVersion

// Now try to do an SSA apply on the status with an explicitly mismatched resourceVersion.
Member

@alvaroaleman alvaroaleman Jan 28, 2026

I cannot confirm the behavior described here against a real kube-apiserver: if you pass an incorrect RV, the request fails (you can test this with kubectl apply --server-side). Only if you pass no RV (which I think is the common case for SSA) does it succeed regardless of the server RV.

Contributor Author

I see, makes sense. Still, the code where we hit this kept failing even after we wiped the resource version clean right before the SSA, whereas in version 0.22 it worked fine without changes.

Let me double check.

Contributor Author

@josvazg josvazg Jan 28, 2026

I need to dig deeper.

When I run this test again with both the fix and the rv value removed, it passes. But our original test still fails.

Comparing if accessor.GetResourceVersion() != oldAccessor.GetResourceVersion() from my controller-runtime checkout (using a modified version of this test), the comparison is rv = "1" against old rv = "1", so it passes. But from our code I saw rv = "" against old rv = "1000".

I am not sure what the difference between the two tests is. I am also not sure how the passing test ends up comparing "1" against "1" even though we wiped the rv before calling SSA.

Contributor Author

We think we found it:

	if accessor.GetResourceVersion() == "" {
		switch {
		case allowsUnconditionalUpdate(gvk):
			accessor.SetResourceVersion(oldAccessor.GetResourceVersion())
			// This is needed because if the patch explicitly sets the RV to null, the client-go reaction we use
			// to apply it and whose output we process here will have it unset. It is not clear why the Kubernetes
			// apiserver accepts such a patch, but it does so we just copy that behavior.
			// Kubernetes apiserver behavior can be checked like this:
			// `kubectl patch configmap foo --patch '{"metadata":{"annotations":{"foo":"bar"},"resourceVersion":null}}' -v=9`
		case bytes.Contains(debug.Stack(), []byte("sigs.k8s.io/controller-runtime/pkg/client/fake.(*fakeClient).Patch")):
			// We apply patches using a client-go reaction that ends up calling the trackers Update. As we can't change
			// that reaction, we use the callstack to figure out if this originated from the "fakeClient.Patch" func.
			accessor.SetResourceVersion(oldAccessor.GetResourceVersion())
		case bytes.Contains(debug.Stack(), []byte("sigs.k8s.io/controller-runtime/pkg/client/fake.(*fakeClient).Apply")):
			// We apply patches using a client-go reaction that ends up calling the trackers Update. As we can't change
			// that reaction, we use the callstack to figure out if this originated from the "fakeClient.Apply" func.
			accessor.SetResourceVersion(oldAccessor.GetResourceVersion())
		}
	}

This Pod test cannot reproduce the issue we hit: Pod is a core resource, so it takes the first case, allowsUnconditionalUpdate(gvk), which sets the RV to match.

Our resource is a CRD, and we are setting a subresource on it. So we are missing this case:

case bytes.Contains(debug.Stack(), []byte("sigs.k8s.io/controller-runtime/pkg/client/fake.(*fakeSubResourceClient).Apply")):
			// We apply patches using a client-go reaction that ends up calling the trackers Update. As we can't change
			// that reaction, we use the callstack to figure out if this originated from the "fakeSubResourceClient.Apply" func.
			accessor.SetResourceVersion(oldAccessor.GetResourceVersion())

I shall revamp this PR:

  1. Test must reproduce the issue using a CRD, not a core resource.
  2. The old code fix goes away.
  3. The new fix is just the missing case entry.
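The call-stack check used in the snippet above can be tried in isolation. The following is a minimal sketch, assuming a hypothetical `demoClient` type and `calledFromPatch` helper that are not part of controller-runtime; it only illustrates how `debug.Stack()` combined with `bytes.Contains` tells two entry points apart when both reach the same shared function.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/debug"
)

// demoClient is a hypothetical stand-in for the fake client; it is not the
// controller-runtime type.
type demoClient struct{}

// Patch reaches the shared helper; its frame should be found on the stack.
//
//go:noinline
func (c *demoClient) Patch() bool { return calledFromPatch() }

// Get reaches the same shared helper through a different entry point.
//
//go:noinline
func (c *demoClient) Get() bool { return calledFromPatch() }

// calledFromPatch inspects the current goroutine's stack trace and reports
// whether a (*demoClient).Patch frame appears in it.
func calledFromPatch() bool {
	return bytes.Contains(debug.Stack(), []byte(".(*demoClient).Patch("))
}

func main() {
	c := &demoClient{}
	fmt.Println("via Patch:", c.Patch())
	fmt.Println("via Get:", c.Get())
}
```

The match is purely textual, which is why the snippet in the comment above needs one entry per method name and why a missing entry goes unnoticed until a caller hits it.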

Member

Not all in-tree resources allow unconditional update, so you might find one there you can use for testing

Contributor Author

@josvazg josvazg Jan 29, 2026

In the end I used an ad hoc custom object. It took longer to verify because the issue is not reproduced if the test registers an unstructured custom object. In that case, the oldAccessor is set with an empty resourceVersion, which matches the one in the accessor.

Only when the custom object is registered with a proper type, as in our case, do you hit the issue. In that case, the oldAccessor is set with a non-empty resourceVersion, which does not match the one in the accessor.

This was tricky to figure out: I had to debug our failing test and the reproducer until I spotted the difference between them.

This PR is split into 2 commits, one with the test and one with the fix. If you run without the fix commit you can see the error we hit in our case.

@fedepaol

@josvazg @alvaroaleman just got to the same conclusion 😅 with #3444

Closing it now as this was here before, let me know otherwise.

Signed-off-by: jose.vazquez <jose.vazquez@mongodb.com>
Signed-off-by: jose.vazquez <jose.vazquez@mongodb.com>
@josvazg josvazg force-pushed the fix/ssa-status-patch-resourceversion branch from bd7b274 to 1fa48c2 Compare January 29, 2026 14:54
@josvazg josvazg changed the title from "🐛 Fix SSA status patch resource version check" to "🐛 Fix fake client's SSA status patch resource version check" Jan 29, 2026
Expect(cl.Status().Apply(ctx, node, client.FieldOwner("test-owner"))).To(Succeed())
})

It("should allow SSA apply on status without object has changed issues", func(ctx SpecContext) {
Member

Can we also add a test that validates that status apply requests with an invalid RV fail?

Contributor Author

@alvaroaleman I am confused. Maybe I misunderstood some details here.
I would assume that the RV value sent on an SSA is ignored. In fact, wouldn't the applied config that we pass to the SSA operation already have it removed?

Member

No, if you send an SSA request with an RV in it and it doesn't match, the apiserver fails the request: #3443 (comment)

Contributor Author

ah! OK makes sense.

Test added now

@alvaroaleman
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 29, 2026
@josvazg josvazg force-pushed the fix/ssa-status-patch-resourceversion branch from 025d487 to 31564ac Compare January 29, 2026 16:03
@sbueringer
Member

/assign

(I'll review once Alvaro is fine with it)

@josvazg josvazg force-pushed the fix/ssa-status-patch-resourceversion branch from 31564ac to d80043e Compare January 29, 2026 16:34
Member

@alvaroaleman alvaroaleman left a comment

Thanks for debugging this! Two small comments

// We apply patches using a client-go reaction that ends up calling the trackers Update. As we can't change
// that reaction, we use the callstack to figure out if this originated from the "fakeClient.Apply" func.
accessor.SetResourceVersion(oldAccessor.GetResourceVersion())
case bytes.Contains(debug.Stack(), []byte("sigs.k8s.io/controller-runtime/pkg/client/fake.(*fakeSubResourceClient).Patch")):
Member

Would you mind using a single case statement for Apply, subresource.Patch and subresource.Apply that ORs them? We don't really need the same comment three times
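One possible shape for such a consolidated check is sketched below. It is not the code that was merged; `stackContainsAny` and `isApplyOrSubResourceCall` are hypothetical helpers, and only the frame names are taken from the snippets quoted in this conversation. The stack is passed in as a parameter so that it is captured once and the check can be exercised with sample input.

```go
package main

import (
	"bytes"
	"fmt"
)

// fakePkg is the package path prefix used by the frame names quoted in this PR.
const fakePkg = "sigs.k8s.io/controller-runtime/pkg/client/fake."

// stackContainsAny reports whether the stack trace mentions at least one of
// the given frame names. It is a hypothetical helper for this sketch.
func stackContainsAny(stack []byte, frames ...string) bool {
	for _, frame := range frames {
		if bytes.Contains(stack, []byte(frame)) {
			return true
		}
	}
	return false
}

// isApplyOrSubResourceCall ORs the three origins named in the review comment
// (Apply, subresource Patch, subresource Apply) so that they can share one
// case and one explanatory comment. fakeClient.Patch is deliberately left
// out because it keeps its own case in the quoted snippet.
func isApplyOrSubResourceCall(stack []byte) bool {
	return stackContainsAny(stack,
		fakePkg+"(*fakeClient).Apply",
		fakePkg+"(*fakeSubResourceClient).Patch",
		fakePkg+"(*fakeSubResourceClient).Apply",
	)
}

func main() {
	sample := []byte(fakePkg + "(*fakeSubResourceClient).Apply(0x1)")
	fmt.Println(isApplyOrSubResourceCall(sample))
}
```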

Contributor Author

yeah it was getting too verbose now

Contributor Author

done


// This is expected to fail with the wrong rv value passed in the applied config
err := cl.Status().Apply(ctx, resourceAC, client.FieldOwner("test-owner"), client.ForceOwnership)
Expect(err).To(HaveOccurred(), "SSA apply on status should not succeed when resourceVersion is wrongly set")
Member

Please check that we got the correct error here as well and not just any error:

Expect(apierrors.IsConflict(err)).To(BeTrue())

Contributor Author

added

@alvaroaleman alvaroaleman added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Jan 29, 2026
@alvaroaleman
Member

/cherrypick release-0.23

@k8s-infra-cherrypick-robot

@alvaroaleman: once the present PR merges, I will cherry-pick it on top of release-0.23 in a new PR and assign it to you.


Signed-off-by: jose.vazquez <jose.vazquez@mongodb.com>
@josvazg josvazg force-pushed the fix/ssa-status-patch-resourceversion branch from d80043e to b0fee5c Compare January 30, 2026 08:52
Member

@alvaroaleman alvaroaleman left a comment

/hold
so @sbueringer can have a look

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 30, 2026
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 30, 2026
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: d973602ef2870ada24f9fa483171303afa1c07f9

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 30, 2026
Member

@sbueringer sbueringer left a comment

Thank you very much. Makes sense, just two nits

Co-authored-by: Stefan Büringer <4662360+sbueringer@users.noreply.github.com>
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 30, 2026
Co-authored-by: Stefan Büringer <4662360+sbueringer@users.noreply.github.com>
@sbueringer
Member

Thx again!

/lgtm
/approve
/hold cancel

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Feb 2, 2026
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 471359de1349cdea8c889622043f98f79ae1e0ba

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alvaroaleman, josvazg, sbueringer

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [alvaroaleman,sbueringer]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 8ebd0ff into kubernetes-sigs:main Feb 2, 2026
9 checks passed
@k8s-infra-cherrypick-robot

@alvaroaleman: new pull request created: #3446



Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.


Development

Successfully merging this pull request may close these issues.

Fake client on version 0.23.1 fails to SSA Apply status due to resourceVersion checks

6 participants