Test finalization for CRs#46439
Conversation
|
Hi @ash2k. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
|
/sig api-machinery |
b2fe82b to
1a3eddc
Compare
There was a problem hiding this comment.
Made me think. Do we have a test anywhere that makes sure that this precondition is actually respected? 95% sure it works, but I don't think I've explicitly checked.
There was a problem hiding this comment.
I understand this test, but its an esoteric flow. Can you add comments explaining that deleting something with a finalizer sets deletiontimestamp, but leaves the item until the finalizer list is empty and someone issues another delete?
There was a problem hiding this comment.
I think you should be able to issue another delete and have the item remain since the finalizer isn't finished. Can you confirm with a normal resource and add this check if so?
There was a problem hiding this comment.
I'm surprised the final delete wasn't instantaneous. You confirmed that you had to wait? Any idea why?
There was a problem hiding this comment.
I haven't confirmed whether it was or was not instantaneous - I'm following my past experience with deletes where a delete of an object does not remove it from api immediately and you can GET it and observe the deletion timestamp set. And then it disappears.
There was a problem hiding this comment.
Works without Polling as well. 👍
|
A few comments, but this looks very good. Glad it works for you :) |
|
Well, it works 50% of the time. It's extremely flaky. Here are some failures: This is obviously not acceptable to merge. I don't think it is a test though. Looks like some server-side issues. |
|
The first one is really weird - it is an UPDATE conflict but resource version is the same "5" before the update and after an update failure. |
|
Two new types of failures. This one is especially alarming: And this is the UPDATE conflict with mode details logged: |
|
@deads2k is better equipped to explain those errors but fwiw, the tests are passing for me everytime! :) |
There was a problem hiding this comment.
Remove the poll. It should not flake, but if it does, we'll chase it.
|
@k8s-bot ok to test |
|
I haven't managed to see it flake. Let's see what CI says. Fix the poll and squash please. |
e15f7f9 to
60cd40b
Compare
58f70c0 to
50bbb1c
Compare
|
@ash2k staging godeps want to be updated. |
50bbb1c to
b5a9ce5
Compare
|
@sttts hey, thanks for the tip - this is all new to me, I thought the failures were caused by something else, not my pr. How do I do this? I tried running Looks like |
No problem :) Can you try to run hack/godep-restore.sh? Might be that this must be called once before. |
b5a9ce5 to
4cad36f
Compare
|
So the solution was to follow https://github.com/kubernetes/community/blob/master/contributors/devel/godep.md to the letter. Just using those scripts with my existing gopath didn't work. Magic! |
|
@deads2k please take a look at the CI test failure - it is something I'm getting locally too, so cannot just be my local (incorrect?) setup: |
There was a problem hiding this comment.
You need a retry loop here which does the get-change-update pattern as long as errors.IsConflict(err) is true.
Background: the controllers in the master will update the object. If you use the your old copy in line 74 the apiserver will complain that your version is outdated. Even line 64 will modify the object (it sets the deletion timestamp).
There was a problem hiding this comment.
That explains the spurious failures. I didn't know some other controllers will/may modify the object concurrently.
However then the question is why it was passing sometimes if line 64 always modifies it before the update? I tried to troubleshoot this earlier (see last block in comment #46439 (comment)) and found that resource version was the same before and after the failed update operation so I thought there should be no conflict even though the conflict is reported. Weird. Delete does not increment the resource version but is checked on conditional update?
Anyway, I've updated the code. Thanks for the help!
4cad36f to
848147f
Compare
|
/approve |
There was a problem hiding this comment.
package moved during promotion to beta.
|
@k8s-bot test this |
848147f to
5ac9d86
Compare
|
@k8s-bot pull-kubernetes-e2e-gce-etcd3 test this |
|
Automatic merge from submit-queue (batch tested with PRs 43505, 45168, 46439, 46677, 46623) |
|
/kind cleanup |
What this PR does / why we need it:
Updates #45511 with a test for finalizers for CRDs.
Release note:
@deads2k