katautils: fix shim v2 fail to work with libnetwork#1789
egernst merged 1 commit into kata-containers:master from Ace-Tang:fix-v2-cnm
Details of how kata works with libnetwork:

1. kata creates a new netns.
2. With EnterNS, kata changes its netns to the created one.
3. In the pre-start hook, kata re-execs the libnetwork process libnetwork-setkey and sends its own pid to it. libnetwork uses /proc/pid/ns/net to find the netns kata uses, and sets the veth into that netns.

The v1 and v2 shims create the network the same way. v1 succeeds because EnterNS changes the netns of both the current thread and the main thread. With the v2 shim, only the current thread's netns is changed; the main thread still uses the host netns, so it fails. It looks like v1 was just lucky to succeed.

In kata, `state.Pid` should be the tid.

Fixes: #1788

Signed-off-by: Ace-Tang <aceapril@126.com>
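The pid-versus-tid distinction described above can be sketched in Go. This is a minimal, Linux-only illustration and not kata's actual code: `netnsPath` is a hypothetical helper that mirrors the `/proc/<id>/ns/net` lookup the commit message attributes to libnetwork, and `main` prints what that lookup yields for the process id and for the calling thread's id.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"syscall"
)

// netnsPath is a hypothetical helper: it builds the path that a consumer
// of the hook state would open for the id it was handed.
func netnsPath(id int) string {
	return fmt.Sprintf("/proc/%d/ns/net", id)
}

func main() {
	// Pin this goroutine to one OS thread so the tid stays meaningful
	// for the duration of the lookups.
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	candidates := []struct {
		label string
		id    int
	}{
		{"pid (main thread)", os.Getpid()},
		{"tid (current thread)", syscall.Gettid()},
	}
	for _, c := range candidates {
		target, err := os.Readlink(netnsPath(c.id))
		if err != nil {
			fmt.Printf("%s: %v\n", c.label, err)
			continue
		}
		// The link target has the form "net:[<inode>]"; different
		// inodes mean different network namespaces.
		fmt.Printf("%s -> %s\n", c.label, target)
	}
}
```

In a process where only the current thread has entered a new netns, the two printed targets would differ, which is the situation the commit message describes for the v2 shim.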
jodh-intel
left a comment
lgtm
/cc @egernst, @sboeuf, @mcastelino, @amshinde.
/test
Codecov Report

| Coverage Diff | master | #1789 | +/- |
|---|---|---|---|
| Coverage | 54% | 54% | |
| Files | 106 | 106 | |
| Lines | 13170 | 13170 | |
| Hits | 7113 | 7113 | |
| Misses | 5210 | 5210 | |
| Partials | 847 | 847 | |
A backport to the stable branches is needed for this PR. @Ace-Tang can you please submit this against the stable-1.6 and stable-1.7 branches?
@Ace-Tang How does this fix the problem? You mentioned that for the network namespace created, the tid is different from the pid. I think with shim v1 we explicitly used LockOSThread() to make sure the netns is created on the main thread, so that the pid and tid are the same. @sboeuf had worked on this, IIRC.
@amshinde - the thread is associated with the correct netns. That path, using the thread id, will resolve correctly to the /proc/pid/task..../ns/net
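The point about path resolution can be checked with a small Go sketch. It is Linux-only and the helper names (`shortNetnsPath`, `taskNetnsPath`, `sameNamespace`) are hypothetical, introduced purely for illustration. It compares the path built from a thread id alone with the per-thread path under the owning process, on the assumption that procfs allows a thread id to be looked up directly under `/proc`.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"syscall"
)

// shortNetnsPath is the path obtained when a tid is used where a pid is expected.
func shortNetnsPath(tid int) string {
	return fmt.Sprintf("/proc/%d/ns/net", tid)
}

// taskNetnsPath is the explicit per-thread path under the owning process.
func taskNetnsPath(pid, tid int) string {
	return fmt.Sprintf("/proc/%d/task/%d/ns/net", pid, tid)
}

// sameNamespace reports whether two paths refer to the same namespace
// file, comparing device and inode via os.SameFile.
func sameNamespace(a, b string) (bool, error) {
	fa, err := os.Stat(a)
	if err != nil {
		return false, err
	}
	fb, err := os.Stat(b)
	if err != nil {
		return false, err
	}
	return os.SameFile(fa, fb), nil
}

func main() {
	// Keep the goroutine on one thread so the tid stays valid.
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	pid, tid := os.Getpid(), syscall.Gettid()
	same, err := sameNamespace(shortNetnsPath(tid), taskNetnsPath(pid, tid))
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Printf("pid=%d tid=%d same netns file: %v\n", pid, tid, same)
}
```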
@egernst Thanks for the explanation. I guess the thread entries are hidden when you do a
Now that this is merged, we should also have a test for this. I don't think we test shim v2 with CNM today.
@amshinde, as I posted in the issue, I tested this with pouch: docker does not support shim v2, but pouch does, and pouch uses libnetwork. So I simply ran it with pouch.
A goroutine within LockOSThread()/UnlockOSThread() is executed in a different OS thread, and the grpc server uses goroutines to serve requests. Therefore we cannot rely on
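The threading behaviour mentioned in this comment can be observed with a short Go sketch. It is a Linux-only illustration under stated assumptions, not code from the shim: `tidOfLockedGoroutine` is a hypothetical helper that stands in for a request handler running in its own goroutine and reports the OS thread it was pinned to.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"syscall"
)

// tidOfLockedGoroutine starts a goroutine, pins it to its OS thread with
// runtime.LockOSThread, and returns that thread's id.
func tidOfLockedGoroutine() int {
	ch := make(chan int)
	go func() {
		runtime.LockOSThread()
		defer runtime.UnlockOSThread()
		ch <- syscall.Gettid()
	}()
	return <-ch
}

func main() {
	pid := os.Getpid()
	for i := 0; i < 3; i++ {
		tid := tidOfLockedGoroutine()
		// The scheduler decides which thread runs the goroutine, so the
		// tid may or may not equal the pid on any given run.
		fmt.Printf("pid=%d tid=%d equal=%v\n", pid, tid, pid == tid)
	}
}
```

Because the scheduler picks the thread, a namespace change made inside such a goroutine affects only that thread, which is why the thread id rather than the process id identifies the right netns.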