Skip to content

Conversation

@thaJeztah
Copy link
Member

@thaJeztah thaJeztah commented Oct 10, 2023


After pr #8617, create handler of containerd-shim-runc-v2 will
call handleStarted() to record the init process and handle its exit.
Although init process wouldn't quit so early in normal circumstances.
But if this screnario occurs, handleStarted() will call
handleProcessExit(), which will cause deadlock because create() had
acquired s.mu, and handleProcessExit() will try to lock it again.

So I add handleProcessExitNoLock() function which will not lock s.mu.
It can safely be called in create handler without deadlock.

@thaJeztah
Copy link
Member Author

Looks like we may be missing something in this branch;

  Running [/home/runner/golangci-lint-1.51.1-linux-amd64/golangci-lint run --out-format=github-actions --timeout=8m] in [] ...
  Error: : # github.com/containerd/containerd/integration/failpoint/cmd/runc-fp
  Error: integration/failpoint/cmd/runc-fp/main.go:93:19: undefined: oci.ReadSpec
  Error: integration/failpoint/cmd/runc-fp/main.go:93:32: undefined: oci.ConfigFilename
  Error: integration/failpoint/cmd/runc-fp/main.go:95:55: undefined: oci.ConfigFilename (typecheck)
  
  Error: issues found
  Ran golangci-lint in 131378ms

@thaJeztah
Copy link
Member Author

slonopotamus and others added 3 commits October 10, 2023 12:06
Signed-off-by: Marat Radchenko <marat@slonopotamus.org>
(cherry picked from commit 9e34b8b)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Previous code has already called `getContainer()`, just pass it into
`s.getContainerPids` to reduce unnecessary lock and map lookup.

Signed-off-by: Chen Yiyang <cyyzero@qq.com>
(cherry picked from commit 6604ff6)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
After pr containerd#8617, create handler of containerd-shim-runc-v2 will
call handleStarted() to record the init process and handle its exit.
Init process wouldn't quit so early in normal circumstances. But if
this screnario occurs, handleStarted() will call
handleProcessExit(), which will cause deadlock because create() had
acquired s.mu, and handleProcessExit() will try to lock it again.

So, I added a parameter muLocked to handleStarted to indicate whether
or not s.mu is currently locked, and thus deciding whether or not to
lock it when calling handleProcessExit.

Fix: containerd#9103
Signed-off-by: Chen Yiyang <cyyzero@qq.com>
(cherry picked from commit 68dd47e)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
@thaJeztah thaJeztah force-pushed the 1.7_backport_fix_deadlock branch from 2febc4d to 2eee3df Compare October 10, 2023 10:07
@dmcgowan
Copy link
Member

Should the new test be skipped for io.containerd.runtime.v1.linux as well?

Signed-off-by: Wei Fu <fuweid89@gmail.com>
(cherry picked from commit 11a7751)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
@thaJeztah thaJeztah force-pushed the 1.7_backport_fix_deadlock branch from 2eee3df to d0a1fed Compare October 10, 2023 14:52
Comment on lines +1513 to +1515
if getRuntimeVersion() == "v1" {
t.Skip("skip it when using shim v1")
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this (https://github.com/containerd/containerd/compare/2eee3df18421b95ccb7e5a750d1918a26d66b022..d0a1fedb5a4828daddff330a345780d0222e47e8)

thanks @fuweid (let's see if it goes green with this patch added); I'll update the 1.6 PR as well

@thaJeztah
Copy link
Member Author

/retest

Copy link
Member

@fuweid fuweid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@fuweid fuweid merged commit 5e21abb into containerd:release/1.7 Oct 11, 2023
@thaJeztah thaJeztah deleted the 1.7_backport_fix_deadlock branch October 11, 2023 07:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants