Skip to content

[release/1.1] cherry-pick: Increase reaper buffer size#2762

Merged
estesp merged 2 commits intocontainerd:release/1.1from
estesp:cherrypick-exec-exit-chan-size-pr
Nov 13, 2018
Merged

[release/1.1] cherry-pick: Increase reaper buffer size#2762
estesp merged 2 commits intocontainerd:release/1.1from
estesp:cherrypick-exec-exit-chan-size-pr

Conversation

@estesp
Copy link
Member

@estesp estesp commented Nov 2, 2018

Cherry-pick #2748
Cherry-pick #2770

This increases the buffer size for process exit subscribers. It also
implements a non-blocking send on the subscriber channel. It is better
to drop an exit even than it is to block a shim for one slow subscriber.

Signed-off-by: Michael Crosby crosbymichael@gmail.com
Signed-off-by: Lantao Liu lantaol@google.com
Signed-off-by: Phil Estes estesp@linux.vnet.ibm.com

@codecov-io
Copy link

codecov-io commented Nov 2, 2018

Codecov Report

Merging #2762 into release/1.1 will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff              @@
##           release/1.1    #2762   +/-   ##
============================================
  Coverage        48.99%   48.99%           
============================================
  Files               85       85           
  Lines             7603     7603           
============================================
  Hits              3725     3725           
  Misses            3203     3203           
  Partials           675      675
Flag Coverage Δ
#linux 48.99% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a7dff7e...966fc1a. Read the comment docs.

@crosbymichael
Copy link
Member

LGTM

@Random-Liu
Copy link
Member

Random-Liu commented Nov 6, 2018

Enlarging the channel size seems fine to me, I'm a bit uncomfortable with discarding events. Init/exec process exit events come only once, and are very important for the container lifecycle. Discarding the event feels unsafe to me, and will cause state mismatch.

I think the sender should either block, or cache the event and retry.

I think #2765 to fix the deadlock seems a better fix.

@estesp
Copy link
Member Author

estesp commented Nov 6, 2018

any opinions @crosbymichael? We already have the events change in 1.2/master, so whatever we decide, if we change anything we should remember to "fix" master as well.

@Random-Liu
Copy link
Member

Random-Liu commented Nov 6, 2018

At least for CRI plugin, if the container exit event is discarded, the container won't be stopped any more. Only when user restarts containerd, CRI can recover state by listing/inspecting all the containers.

Our current assumption in CRI is that event is reliable, and we can purely rely on it. If that is not the case, we'll probably need to add periodically relisting to sync state for each container.

@crosbymichael
Copy link
Member

We could revert the change for dropping the send and keep the larger buffer size in addition to the #2765 change

@Random-Liu
Copy link
Member

We could revert the change for dropping the send and keep the larger buffer size in addition to the #2765 change

SGTM. :) Thanks a lot!

@Random-Liu
Copy link
Member

If so, I'll send partial fix to revert the event discard change.

For 1.1, it seems that we have all changes we want.

@crosbymichael
Copy link
Member

@Random-Liu go ahead and send a patch. I'm working on a fix for the pause and state issues right now that some have been seeing, the weird bug

@estesp
Copy link
Member Author

estesp commented Nov 8, 2018

Should I update this PR with just the buffer size change?

@Random-Liu
Copy link
Member

#2770 is in, you can probably update this change @estesp

crosbymichael and others added 2 commits November 10, 2018 20:45
Fixes containerd#2709

This increases the buffer size for process exit subscribers. It also
implements a non-blocking send on the subscriber channel.  It is better
to drop an exit even than it is to block a shim for one slow subscriber.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Signed-off-by: Lantao Liu <lantaol@google.com>
Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com>
@estesp estesp force-pushed the cherrypick-exec-exit-chan-size-pr branch from e2ac712 to 966fc1a Compare November 11, 2018 01:47
@estesp estesp changed the title [release/1.1] cherry-pick: Increase reaper buffer size and non-blocking send [release/1.1] cherry-pick: Increase reaper buffer size Nov 11, 2018
@estesp
Copy link
Member Author

estesp commented Nov 11, 2018

@Random-Liu updated!

@crosbymichael
Copy link
Member

LGTM

1 similar comment
@Random-Liu
Copy link
Member

LGTM

@thaJeztah
Copy link
Member

Looks like GitHub is a bit too eager hiding comments; here it's hiding "LGTM"'s

screen shot 2018-11-12 at 19 28 41

@estesp estesp merged commit 7cc1302 into containerd:release/1.1 Nov 13, 2018
@estesp estesp deleted the cherrypick-exec-exit-chan-size-pr branch November 13, 2018 15:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants