Use non-blocking send and retry for exit events#3540
Use non-blocking send and retry for exit events#3540crosbymichael merged 4 commits intocontainerd:masterfrom
Conversation
5f1fb52 to
bcb6d8a
Compare
|
Build succeeded.
|
bcb6d8a to
dc83255
Compare
|
Build succeeded.
|
|
I've been running the repo case in the original PR and I'm unable to reproduce with these changes. I'll keep it go for now though |
dc83255 to
630b483
Compare
|
Build succeeded.
|
db8e36e to
9d0ac80
Compare
|
Build succeeded.
|
|
Build succeeded.
|
|
CRI plugin doesn't reply on So LG. |
@Random-Liu - If I could +1k I would... |
|
This looks more reasonable to send exit event 👍 |
shim.Reap and shim.Default.Wait may deadlock, use Monitor.Notify to fix this issue. Signed-off-by: Shukui Yang <keloyangsk@gmail.com>
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
36f2712 to
40bd882
Compare
|
Build succeeded.
|
40bd882 to
7151155
Compare
|
Build succeeded.
|
7151155 to
957b7a0
Compare
|
Build succeeded.
|
|
There are still some issues with this change that I am working through. Its not robust yet to merge |
|
Ok, this is solid now and ready for final reviews |
|
This change don't resolve the problem,in my centos7, It's easy to reproduce. The reason is the code call <-timer.C twice, add the following change can fix. |
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
564021d to
2763639
Compare
|
@keloyang I incorporated your changes into this. |
|
Build succeeded.
|
|
Another tiny optimization, we can change bufferSize back to 32, 2048 is a waste. |
|
Thanks @keloyang to point it out! I think we should change channel size back to 32. LGTM And I tried to change We don't need to read config every time because most of process are not init process. I think we should file patch after this pr. diff --git a/runtime/v1/shim/service.go b/runtime/v1/shim/service.go
index 37eae7f..57f7438 100644
--- a/runtime/v1/shim/service.go
+++ b/runtime/v1/shim/service.go
@@ -515,17 +515,17 @@ func (s *Service) allProcesses() []process.Process {
}
func (s *Service) checkProcesses(e runc.Exit) {
- shouldKillAll, err := shouldKillAllOnExit(s.bundle)
- if err != nil {
- log.G(s.context).WithError(err).Error("failed to check shouldKillAll")
- }
for _, p := range s.allProcesses() {
if p.Pid() == e.Pid {
+ if ip, ok := p.(*process.Init); ok {
+ // Ensure all children are killed
+ shouldKillAll, err := shouldKillAllOnExit(s.bundle)
+ if err != nil {
+ log.G(s.context).WithError(err).Error("failed to check shouldKillAll")
+ }
- if shouldKillAll {
- if ip, ok := p.(*process.Init); ok {
- // Ensure all children are killed
+ if shouldKillAll {
if err := ip.KillAll(s.context); err != nil {
log.G(s.context).WithError(err).WithField("id", ip.ID()).
Error("failed to kill init's children")After change, the perf looks better. |
|
LGTM |
|
Lets merge this one and add PRs for the rest of the optimizations |
|
Cherry-picked to |
|
mark, looks fixed the known issue |


Closes #3529
Carry of #3529