metrics/cgroups: fix deadlock issue in Add during Collect #6788

Merged
mikebrow merged 1 commit into containerd:main from fuweid:fix-issue-6772
Apr 11, 2022
@fuweid (Member) commented Apr 7, 2022

Collector.Collect is registered as the Collect callback of the ns field, which
invokes it periodically while holding its internal lock. Collector.Add, in turn,
acquires ns.Lock while holding Collector.Lock. The two paths take the same locks
in opposite order, which can easily cause a deadlock.

Goroutine X:

ns.Collect
  ns.Lock
    Collector.Collect
      Collector.RLock

Goroutine Y:

Collector.Add
  Collector.Lock
    ns.Lock

Add should acquire ns.Lock without holding Collector.Lock.
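
The lock ordering described above can be sketched in Go roughly as follows. This is a minimal illustration under stated assumptions, not the actual patch: `namespace`, `collector`, `WithLabel`, and `finishes` are hypothetical stand-ins for the real types and calls, and only the locking pattern is meant to carry over.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// namespace is a hypothetical stand-in for the ns field: it holds its own
// lock while invoking the registered collect callback.
type namespace struct {
	mu      sync.Mutex
	labels  []string
	collect func()
}

// Collect takes ns.mu and then calls back into the collector.
func (n *namespace) Collect() {
	n.mu.Lock()
	defer n.mu.Unlock()
	if n.collect != nil {
		n.collect()
	}
}

// WithLabel (hypothetical) needs ns.mu, like the ns call made from Add.
func (n *namespace) WithLabel(l string) {
	n.mu.Lock()
	defer n.mu.Unlock()
	n.labels = append(n.labels, l)
}

// collector is a hypothetical stand-in for Collector.
type collector struct {
	mu    sync.RWMutex
	ns    *namespace
	tasks map[string]struct{}
}

func newCollector() *collector {
	c := &collector{ns: &namespace{}, tasks: map[string]struct{}{}}
	c.ns.collect = c.Collect
	return c
}

// Collect runs as the namespace callback, so ns.mu is already held here.
func (c *collector) Collect() {
	c.mu.RLock()
	defer c.mu.RUnlock()
	_ = len(c.tasks)
}

// Add never holds c.mu while calling into ns, so the order
// "collector lock, then ns lock" cannot occur.
func (c *collector) Add(id string) {
	c.mu.RLock()
	_, ok := c.tasks[id]
	c.mu.RUnlock()
	if ok {
		return // adding the same task twice is a no-op
	}
	c.ns.WithLabel(id) // takes ns.mu with c.mu released
	c.mu.Lock()
	c.tasks[id] = struct{}{}
	c.mu.Unlock()
}

// finishes runs Collect and Add concurrently and reports whether both
// loops completed within five seconds.
func finishes(c *collector, rounds int) bool {
	done := make(chan struct{})
	go func() {
		var wg sync.WaitGroup
		wg.Add(2)
		go func() {
			defer wg.Done()
			for i := 0; i < rounds; i++ {
				c.ns.Collect()
			}
		}()
		go func() {
			defer wg.Done()
			for i := 0; i < rounds; i++ {
				c.Add(fmt.Sprintf("task-%d", i))
			}
		}()
		wg.Wait()
		close(done)
	}()
	select {
	case <-done:
		return true
	case <-time.After(5 * time.Second):
		return false
	}
}

func main() {
	c := newCollector()
	fmt.Println("finished:", finishes(c, 1000))
}
```

If `Add` instead called `c.ns.WithLabel` while still holding `c.mu`, the two goroutines could each end up waiting for the lock the other holds, which is the cycle shown in the Goroutine X and Goroutine Y traces.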

Fix: #6772

Signed-off-by: Wei Fu <fuweid89@gmail.com>

@fuweid fuweid force-pushed the fix-issue-6772 branch 3 times, most recently from 87c0dd6 to 72258cd, April 7, 2022 16:26
@fuweid fuweid requested a review from AkihiroSuda April 7, 2022 16:30
@theopenlab-ci (bot) commented Apr 7, 2022

Build succeeded.

@thaJeztah (Member) commented

This doesn't affect v1.5? Or same issue there?

@fuweid (Member, Author) commented Apr 8, 2022

> This doesn't affect v1.5? Or same issue there?

It was introduced by #5744 and first released in 1.6 😃

@thaJeztah (Member) commented

Thanks! (just checking if I didn't have to 🍒⛏ for 1.5 as well ☺️)

@mikebrow (Member) left a comment

LGTM

@theopenlab-ci (bot) commented Apr 10, 2022

Build succeeded.

@mikebrow (Member) left a comment

LGTM

@mikebrow mikebrow merged commit 449eb08 into containerd:main Apr 11, 2022
@fuweid fuweid deleted the fix-issue-6772 branch April 11, 2022 01:37
@alam0rt (Contributor) commented Apr 11, 2022

Thanks heaps for the quick response to this

@fuweid fuweid added the cherry-picked/1.6.x label (PR commits are cherry-picked into release/1.6 branch) and removed the cherry-pick/1.6.x label, Apr 11, 2022
@uthark (Contributor) commented Apr 14, 2022

When do you plan to release 1.6.3 with the fix?

Labels

cherry-picked/1.6.x (PR commits are cherry-picked into release/1.6 branch), priority/P1

Development

Successfully merging this pull request may close these issues.

containers stuck terminating / creating - many runc init processes

6 participants