Graduate PodLifecycleSleepAction to GA#128046
Conversation
SergeyKanzhelev
left a comment
There was a problem hiding this comment.
just a test classification suggestion, please fix
bf786f4 to
ba66acb
Compare
55be0cc to
bbf3b3a
Compare
|
/triage accepted |
| // longer than 5 seconds (pod should sleep for 5 seconds) | ||
| // shorter than gracePeriodSeconds (default 30 seconds here) | ||
| if !validDuration(cost, 5, 30) { | ||
| // shorter than gracePeriodSeconds (60 seconds here) |
There was a problem hiding this comment.
The same reason why change to 30 in another case: short duration led to flake tests in this submit.
Let me changed back to 30 and 15 to check the flake again.
| // longer than 2 seconds (we change gracePeriodSeconds to 2 seconds here, and it's less than sleep action) | ||
| // shorter than sleep action (to make sure it doesn't take effect) | ||
| if !validDuration(cost, 2, 15) { | ||
| if !validDuration(cost, 5, 30) { |
There was a problem hiding this comment.
I think 15 makes more sense here - 30 is close enough to the original sleep that a little bit of nondeterminism could flake this test. 15 is 3x the expected value (~5) and 1/2 the requested value (30). If we hit 15 something went wrong, no?
|
Thanks! /lgtm |
|
LGTM label has been added. DetailsGit tree hash: 83f435f7069eb197628bf2f0d0835c29432a2dfb |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: AxeZhan, SergeyKanzhelev, thockin The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
Are we updating the website for this feature graduation? |
kubernetes/website#48295 is a placeholder. |
| lifecycle := &v1.Lifecycle{ | ||
| PreStop: &v1.LifecycleHandler{ | ||
| Sleep: &v1.SleepAction{Seconds: 15}, | ||
| Sleep: &v1.SleepAction{Seconds: 30}, |
There was a problem hiding this comment.
why did we change the test values during the graduation? we should avoid this things, or we modify the test first and wait some time to check its stability or we graduate and we modify the values later,
There was a problem hiding this comment.
From what I can tell, this test was not run with periodics. I don't see a dedicated job for this feature. It was added to pull-kubernetes-e2e-gce-cos-alpha-features.
I think maybe we should pull back this PR from promotion and make sure there is a periodic that verifies that this test is working.
There was a problem hiding this comment.
Sorry for the flake😢.
From what I can tell, this test was not run with periodics. I don't see a dedicated job for this feature. It was added to pull-kubernetes-e2e-gce-cos-alpha-features.
Yes, I thought pull-kubernetes-e2e-gce-cos-alpha-features will run periodicity before, seems it will only run with submit.
why did we change the test values during the graduation?
The current test parameters showed flakiness with this PR submission (it didn't occur in pull-kubernetes-e2e-gce-cos-alpha-features, which I suspect is related to the cluster used in CI runs).
I guess we'll need to modify the tests first, and we may graduate this feature to GA in next release.
I opened #128642 for testing.
But I'm not sure how to make these tests run periodicity, can I just remove the feature in SIGDescribe ? Or should I somehow add a job in test-infra?
There was a problem hiding this comment.
no worries, this things happen to all of us ...
if the feature is beta enabled by default you can just remove the features tag
|
This is flaking really badly since merge - https://storage.googleapis.com/k8s-triage/index.html?date=2024-11-06&pr=1&text=unexpected%20delay%20duration%20before%20killing%20the%20pod just hit it in another PR |
|
The PR originally changed some of the timeouts, I asked to revert those parts. Mea culpa - this flaking indicates that our ability to kill a pod "on time" could be off by more than 10 seconds. @AxeZhan do you want to bump those again (sorry!!)? |
What type of PR is this?
/kind feature
What this PR does / why we need it:
Graduate PodLifecycleSleepAction to GA
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: