vendor: bump runc to rc94 #101888
Conversation
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
/approve
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dims, giuseppe, kolyshkin, liggitt. The full list of commands accepted by this bot can be found here. The pull request process is described here.

Approvers can indicate their approval by writing `/approve` in a comment.
/retest
Sorry to be late to the game, but I didn't see this PR before it marked some of my PRs as needs-rebase... :/
With this version of runc, we need to do some refactoring to properly enforce cgroup values. Since runc changed the way it enforces limits (by no longer setting cgroup resources in Apply), this breaks our use of the systemd cgroup driver. As discussed in opencontainers/runc#2813, a fix is needed in order to support it properly.
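For readers unfamiliar with the change being discussed: in rc94, the cgroup manager's Apply no longer writes resource limits, so a caller that only invokes Apply ends up with no limits enforced. A minimal sketch of that behavioral difference, using hypothetical types rather than the real runc API:

```go
package main

import "fmt"

type resources struct{ cpuQuota int64 }

// manager is a toy stand-in for a cgroup manager.
type manager struct {
	limits *resources
}

// Apply joins a process to the cgroup. Post-rc94 behavior: no resource
// limits are written here anymore.
func (m *manager) Apply(pid int) {
	_ = pid
}

// Set writes the resource limits; drivers must now call this explicitly,
// or limits silently stay unset.
func (m *manager) Set(r *resources) { m.limits = r }

func main() {
	m := &manager{}
	m.Apply(1234)
	fmt.Println(m.limits == nil) // true: nothing enforced after Apply alone
	m.Set(&resources{cpuQuota: 100000})
	fmt.Println(m.limits.cpuQuota) // 100000
}
```

A driver written against the old contract (Apply also sets limits) keeps "working" in the sense that containers start, which is why the breakage shows up as missing enforcement rather than an immediate error.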
I have a PR to properly deal with systemd here: #98374.
I am however quite worried this PR (as in this PR, not mine) might cause regressions, since both https://testgrid.k8s.io/sig-node-kubelet#node-kubelet-serial and https://testgrid.k8s.io/sig-node-kubelet#node-kubelet-conformance have been failing for quite some time. Those test suites are the only real way to verify whether changes like this cause regressions. Did you run some of the tests locally (or on GCE) @kolyshkin?
I think we should try to merge the systemd-fix PR as soon as possible, and find some Googlers who can debug the node-kubelet-serial tests so we can start running them again. I guess reverting this until we have working tests isn't an option, or is it?
```diff
  "RSSBytes":        bounded(1*e2evolume.Kb, memoryLimit),
  "PageFaults":      bounded(0, 1000000),
- "MajorPageFaults": bounded(0, 10),
+ "MajorPageFaults": bounded(0, 1000),
```
We should revisit this, since it changes a conformance test and results in different behavior between cgroup v1 and v2. I don't have it off the top of my head, but I think this will always be 0 on cgroup v1, since the metric only counts page faults on the pod cgroup itself.
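For context on where the metric comes from: major fault counts are read from the cgroup's memory.stat file (the pgmajfault key exists on both v1 and v2, but on v1 it covers only faults charged to that cgroup itself, which is why a pod-level cgroup can sit at 0). A small sketch of such a parser (hypothetical helper, not kubelet's actual code):

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// majorFaults extracts the pgmajfault counter from the contents of a
// cgroup memory.stat file, returning 0 if the key is absent.
func majorFaults(memoryStat string) uint64 {
	sc := bufio.NewScanner(strings.NewReader(memoryStat))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 2 && fields[0] == "pgmajfault" {
			if v, err := strconv.ParseUint(fields[1], 10, 64); err == nil {
				return v
			}
		}
	}
	return 0
}

func main() {
	sample := "pgfault 1200\npgmajfault 3\n"
	fmt.Println(majorFaults(sample)) // prints 3
}
```

If the bound had to hold on both hierarchies, the test would need to read the counter from wherever faults are actually charged, not just the pod-level cgroup.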
The runc bump was referenced in a comment about a performance regression in 1.21 (#101989), so I expect there will be a desire to take this version back to the 1.21 release branch. However, changing conformant behavior is a problem, so all the implications of this PR have to be understood and resolved before any backports to a release branch can be done. @SergeyKanzhelev @bobbypage what would be your expectation here?
Since this caused CI to turn red, my inclination is option 1.
Revert open in #102094 to get CI running.
Thanks for jumping in @liggitt !
Yeah, I think that was the plan. This is, however, quite a big dependency change, containing far more code and fixes than just the issue referenced here.
Yeah, that is a big issue. As far as I remember from working with these systemd things last time, this change effectively disables all cgroup resource enforcement at the pod level. The change in the conformance test should also be done in a separate PR imo, so that all involved SIGs understand how and why.
I vote for this in the short term, in order to avoid other devs having to debug these issues while working on other unrelated things. I do think we should get https://testgrid.k8s.io/sig-node-kubelet#node-kubelet-serial up and running again, and get proper signal from that test suite in order to properly see that things work as expected. Also, the fixes required to mitigate these issues are quite complex, so I think it is a bit dangerous to rush them.

edit1: added the last paragraph.
Agreed. I asked in the issue about the performance regression how that can be resolved without taking rc94 back to release-1.21 (#101989 (comment)).
So roughly we're saying that nothing in presubmit e2e is verifying the systemd-cgroup driver? And that https://testgrid.k8s.io/sig-node-kubelet#node-kubelet-serial is the way to get this going again but is currently broken? If we can't guarantee that merges of runc don't break "resource management in kube", we definitely need that test (and others?) up and running ASAP, and potentially better ability to run those tests presubmit (as needed).
Yeah, that's right as of now... But I 100% agree that it is a bad thing. I have been working for the last two years fixing failing tests in node-kubelet-serial, but since people usually don't look at it, it tends to fail, and issues and regressions are not properly detected. I have been busy with my master's thesis for the last 7-8 months, and more and more tests have started failing. When I made runc bumps previously, I always ran those tests myself, either locally or on my own GCE account. Early in May, node-kubelet-serial started to go dark (as it is now), and I have not had time to dig into the issue. Since it usually takes a few hours to run, it isn't suited for blocking presubmit. We could (and should) create an optional presubmit job if we can get it to go green.
Yup! I have spent countless hours over the last 2 years trying to keep it somewhat healthy (and this is not my full-time job, just a side gig while at university), but I think some people with access to the GCE account need to debug it as it is now. We should also run the same test suite for cgroup v2 as well; otherwise I really don't see how we can graduate it to beta or GA with confidence that it actually works as we expect it to.
The tests that verify low-level stuff like cgroup values etc. are all part of the serial suite.

edit: Updated with a comment to address the updates from your comment.
@smarterclayton #102092. We have node e2e jobs that verify that, but they aren't blocking yet. We want to make them blocking, but we are still discussing whether we should merge those CRI-O jobs with the existing Docker job or make an independent blocking job, kubernetes/test-infra#21278 (comment).
Yeah, that test suite has some tests (a few) that verify metrics etc. All tests that deal with cgroups and low-level kubelet stuff have to be executed in serial, since they keep restarting the kubelet, and those tests are all part of the serial suite.
Revert "Merge pull request #101888 from kolyshkin/update-runc-rc94"
Thanks @odinuge for raising concerns with the runc bump and @liggitt for the quick revert. It sounds like there are two issues found so far with the runc update:
I'm not super clear on the second case. It looks like the current approach is that we call

/cc @mrunalp @kolyshkin
I'll defer to @odinuge (xref #101888 (review))
Thanks! Yeah, I think that comment describes it, but this systemd stuff is pretty complicated. This is a good resource: https://systemd.io/CGROUP_DELEGATION/
With this PR, we never set systemd properties for resources like cpu and memory; we only write the cgroup files. Unless we use delegated subtrees (ref the systemd docs), we should not write to those files: systemd will overwrite them. Without this PR, we set systemd resources only at creation of the systemd slice/scope, which is what the PR I mentioned (#98374) is fixing. Look at the issue that PR marks as "fixed" in order to understand the problem in current master. It essentially works as long as we don't change the values. To show this in real life, one can play with systemd like this:

This is certainly behavior we wouldn't want to happen... I guess I could write a book about this, but does this comment make it easier to understand the issue @liggitt @bobbypage @kolyshkin?

edit: sorry for the typos, it is getting late here
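The demo itself ("one can play with systemd like this") appears to have been lost from the comment. A sketch of the kind of experiment presumably meant (hypothetical unit and slice names; requires root on a cgroup v1 host running systemd, so treat it as illustrative rather than copy-paste):

```shell
# Start a transient unit with a CPU quota managed via systemd properties.
systemd-run --unit=demo --slice=test.slice -p CPUQuota=50% sleep 600

# Write a different quota directly into the cgroup file, bypassing systemd
# (roughly what a cgroup driver does when it skips systemd properties).
echo 100000 > /sys/fs/cgroup/cpu/test.slice/demo.service/cpu.cfs_quota_us

# Ask systemd to reapply its own view of the unit's properties...
systemctl set-property demo.service CPUQuota=50%

# ...and the directly written value can be clobbered, because this subtree
# is owned by systemd and was never delegated to us.
cat /sys/fs/cgroup/cpu/test.slice/demo.service/cpu.cfs_quota_us
```

The point of the exercise: values written behind systemd's back survive only until systemd next reconciles the unit, which matches the "works as long as we don't change the values" behavior described above.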
Spent a day on it, not sure what is causing the issue, will continue tomorrow. |
Are we talking about the same thing, or are you looking at some other issues @kolyshkin? I did some testing yesterday on my PR that fixes the systemd stuff I mentioned in #101888 (comment), and it does fix the failing test @bobbypage mentions. See #102092 (comment) for more info. Also see @harche's comment here: #98374 (comment). Here is a test run as presubmit: https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/98374/pull-kubernetes-node-crio-e2e/1394592396901093376/
What type of PR is this?
/kind bug
What this PR does / why we need it:
Which issue(s) this PR fixes:
Special notes for your reviewer:
runc v1.0.0-rc94 release notes: https://github.com/opencontainers/runc/releases/tag/v1.0.0-rc94
This also picks up a couple of commits from #100631.
Does this PR introduce a user-facing change?