Skip to content

cgroup, systemd: cleanup cgroups #2503

Merged
mrunalp merged 2 commits intoopencontainers:masterfrom
giuseppe:cgroup-fixes
Jul 6, 2020
Merged

cgroup, systemd: cleanup cgroups #2503
mrunalp merged 2 commits intoopencontainers:masterfrom
giuseppe:cgroup-fixes

Conversation

@giuseppe
Copy link
Copy Markdown
Member

@giuseppe giuseppe commented Jul 5, 2020

some hierarchies were created directly by .Apply() on top of systemd managed cgroups. systemd doesn't manage these and as a result we leak these cgroups.

Signed-off-by: Giuseppe Scrivano gscrivan@redhat.com

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
@tedyu
Copy link
Copy Markdown
Contributor

tedyu commented Jul 5, 2020

lgtm

AkihiroSuda
AkihiroSuda previously approved these changes Jul 6, 2020
Copy link
Copy Markdown
Member

@AkihiroSuda AkihiroSuda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I'd like to see integration tests if possible

AkihiroSuda
AkihiroSuda previously approved these changes Jul 6, 2020
@giuseppe
Copy link
Copy Markdown
Member Author

giuseppe commented Jul 6, 2020

@AkihiroSuda is the CI failure a flake?

@AkihiroSuda
Copy link
Copy Markdown
Member

$ sudo ssh default -t 'cd /vagrant && sudo make localunittest'
sudo: make: command not found

Could you try restarting CI? If the error persists, make would need to be explicitly dnf-installed in Vagrantfile.

mrunalp
mrunalp previously approved these changes Jul 6, 2020
@kolyshkin
Copy link
Copy Markdown
Contributor

I was too naive to think lstat(2) is a cheap syscall, and so I wrote a benchmark for IsRunningSystemd, to see how much the first commit in this PR helps:

func BenchmarkIsRunningSystemd(b *testing.B) {
	var res bool
	for i := 0; i < b.N; i++ {
		res = IsRunningSystemd()
	}
	b.Logf("IsRunningSystemd: %v", res)
}

Before this PR:

BenchmarkIsRunningSystemd-8   	 1926032	       614 ns/op

After:

BenchmarkIsRunningSystemd-8   	637496562	         1.81 ns/op

Out of pure curiosity I also tried using unix.Lstat instead of os.Lstat:

func IsRunningSystemd() bool {
        var st unix.Stat_t
        err := unix.Lstat("/run/systemd/system", &st)
        if err != nil {
                return false
        }
        return st.Mode&unix.S_IFDIR == unix.S_IFDIR
}

and the result is roughly 20% savings:

BenchmarkIsRunningSystemd-8   	 2321263	       506 ns/op

So, yes, using sync.Once totally makes sense here.

@kolyshkin
Copy link
Copy Markdown
Contributor

CI failure is a glitch (or we should not rely on github to provide umoci binary):

Step 13/24 : RUN curl -o /usr/local/bin/umoci -fsSL https://github.com/opencontainers/umoci/releases/download/v0.4.5/umoci.amd64     && chmod +x /usr/local/bin/umoci
 ---> Running in e33fd55181d1
curl: (22) The requested URL returned error: 429 too many requests
The command '/bin/sh -c curl -o /usr/local/bin/umoci -fsSL https://github.com/opencontainers/umoci/releases/download/v0.4.5/umoci.amd64     && chmod +x /usr/local/bin/umoci' returned a non-zero code: 22
Makefile:62: recipe for target 'runcimage' failed

CI restarted

@kolyshkin
Copy link
Copy Markdown
Contributor

kolyshkin commented Jul 6, 2020

Alas, the test case is not catching the bug (I have reverted the fixing part):

$ git show libcontainer/cgroups/systemd/v1.go | patch -p1 -R
patching file libcontainer/cgroups/systemd/v1.go

$ sudo make localintegration TESTPATH=/delete.bats
[sudo] password for kir: 
go build "-mod=vendor" "-buildmode=pie"  -tags "seccomp selinux apparmor" -ldflags "-X main.gitCommit="2185e9c4aa7b3e91b14e1073176ec3ee296ff665-dirty" -X main.version=1.0.0-rc91+dev " -o runc .
go build "-mod=vendor" "-buildmode=pie"  -tags "seccomp selinux apparmor" -ldflags "-X main.gitCommit="2185e9c4aa7b3e91b14e1073176ec3ee296ff665-dirty" -X main.version=1.0.0-rc91+dev " -o contrib/cmd/recvtty/recvtty ./contrib/cmd/recvtty
bats -t tests/integration/delete.bats
1..4
ok 1 runc delete
ok 2 runc delete --force
ok 3 runc delete --force ignore not exist
ok 4 runc delete --force in cgroupv2 with subcgroups # skip test requires cgroups_v2

@giuseppe
Copy link
Copy Markdown
Member Author

giuseppe commented Jul 6, 2020

Alas, the test case is not catching the bug (I have reverted the fixing part):

are you using sudo make localintegration TESTPATH=/delete.bats RUNC_USE_SYSTEMD=1 on cgroup v1?

@kolyshkin
Copy link
Copy Markdown
Contributor

are you using sudo make localintegration TESTPATH=/delete.bats RUNC_USE_SYSTEMD=1 on cgroup v1?

Oh, I forgot RUNC_USE_SYSTEMD=1, most probably due to lack of espresso :-\

Yes, the test is working

@kolyshkin
Copy link
Copy Markdown
Contributor

The regression is caused by commit bfa1b2a (PR #2331).

some hierarchies were created directly by .Apply() on top of systemd
managed cgroups.  systemd doesn't manage these and as a result we leak
these cgroups.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Copy link
Copy Markdown
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mrunalp mrunalp merged commit 30dc54a into opencontainers:master Jul 6, 2020
@kolyshkin kolyshkin mentioned this pull request Jul 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants