[release/1.6] metrics/cgroups: fix deadlock issue in Add during Collect#6801

Merged
kzys merged 1 commit into containerd:release/1.6 from fuweid:cherry-pick-6772 on Apr 11, 2022

fuweid (Member) commented Apr 11, 2022

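Collector.Collect is registered as a callback of the ns field's Collect,
which is invoked periodically while holding the namespace's internal
lock. Collector.Add, meanwhile, takes ns.Lock while already holding
Collector.Lock, so the two paths acquire the same locks in opposite
order and can easily deadlock.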

Goroutine X:

ns.Collect
  ns.Lock
    Collector.Collect
      Collector.RLock

Goroutine Y:

Collector.Add
  Collector.Lock
    ns.Lock

We should use ns.Lock without Collector.Lock in Add.

Fix: #6772

Signed-off-by: Wei Fu <fuweid89@gmail.com>
(cherry picked from commit 8a1280b)
Signed-off-by: Wei Fu <fuweid89@gmail.com>


🍒 from #6788
Also updated the Stats mock interface because of the protobuf dependency.

diff --git a/metrics/cgroups/metrics_test.go b/metrics/cgroups/metrics_test.go
index c362ea3b9..c71ea60a5 100644
--- a/metrics/cgroups/metrics_test.go
+++ b/metrics/cgroups/metrics_test.go
@@ -32,7 +32,7 @@ import (
        v2 "github.com/containerd/containerd/metrics/cgroups/v2"
        v1types "github.com/containerd/containerd/metrics/types/v1"
        v2types "github.com/containerd/containerd/metrics/types/v2"
-       "github.com/containerd/containerd/protobuf"
+       "github.com/containerd/typeurl"
        "github.com/prometheus/client_golang/prometheus"
 
        metrics "github.com/docker/go-metrics"
@@ -152,7 +152,7 @@ func (t *mockStatT) Namespace() string {
 
 func (t *mockStatT) Stats(context.Context) (*types.Any, error) {
        if t.isV1 {
-               return protobuf.MarshalAnyToProto(&v1types.Metrics{})
+               return typeurl.MarshalAny(&v1types.Metrics{})
        }
-       return protobuf.MarshalAnyToProto(&v2types.Metrics{})
+       return typeurl.MarshalAny(&v2types.Metrics{})
 }

estesp (Member) left a comment

LGTM

kzys merged commit eed3a2a into containerd:release/1.6 on Apr 11, 2022
fuweid deleted the cherry-pick-6772 branch on April 11, 2022 16:35

uthark (Contributor) commented Apr 20, 2022

When do you plan to release 1.6.3 with the fix?
