cephadm,msg: ensure msgr address is unique when we have an init in our container#39739
liewegas merged 3 commits into ceph:master
7e1cc91 to 02ff19c
f760336 to 2d3f64f
https://pulpito.ceph.com/sage-2021-02-27_23:26:44-rados:cephadm:thrash-wip-sage3-testing-2021-02-27-1555-distro-basic-smithi/
2d3f64f to 00b3a3f
jenkins test make check
retest this please
we probably need to backport this. @travisn this might also be important for Rook!
src/msg/Messenger.cc
  {
    uint64_t nonce = getpid();
-   if (nonce == 1) {
+   if (nonce <= 10 || getenv("CEPH_CONTAINER_HAS_INIT")) {
- if (nonce <= 10 || getenv("CEPH_CONTAINER_HAS_INIT")) {
+ if (nonce <= 10 || getenv("CONTAINER_IMAGE")) {
(nit) rook and cephadm currently set CONTAINER_IMAGE ... maybe that would simplify things?
I thought about that, but (1) it seems possible that this variable is set and we're not in a container, and (2) we might be in a container and not have an init process, in which case the pid behavior is still appropriate.
We could do CEPH_USE_RANDOM_NONCE instead though!
updated. also switched it back to pid == 1
should we always set CEPH_USE_RANDOM_NONCE (regardless of whether an init is used)?
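The check the thread converges on (pid == 1, or the CEPH_USE_RANDOM_NONCE environment variable being set) can be sketched roughly as below. This is a minimal illustration, not the code merged in src/msg/Messenger.cc: the helper names `should_use_random_nonce` and `choose_nonce` are hypothetical, and the random source is an assumption.

```cpp
#include <cstdint>
#include <cstdlib>
#include <random>
#include <unistd.h>

// Hypothetical helper (not a Ceph function): decide whether the pid is
// unsuitable as a nonce. Only the presence of the variable is checked,
// mirroring the getenv(...) test in the review snippet.
bool should_use_random_nonce(uint64_t pid, const char* env_override) {
  // pid 1: we are the first process in a pid namespace (container, no init).
  // env_override set: the deployer says pids are not unique (e.g. an init
  // process took pid 1 and we got some small pid such as 7).
  return pid == 1 || env_override != nullptr;
}

// Hypothetical helper: use the pid as the nonce unless it is not unique,
// in which case fall back to a random 64-bit value.
uint64_t choose_nonce() {
  uint64_t nonce = static_cast<uint64_t>(getpid());
  if (should_use_random_nonce(nonce, std::getenv("CEPH_USE_RANDOM_NONCE"))) {
    std::random_device rd;           // assumption: any decent entropy source works
    std::mt19937_64 gen(rd());
    nonce = gen();
  }
  return nonce;
}
```

Keeping the decision in a separate pure function makes it easy to test the pid and environment cases without actually running inside a container.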
This reverts commit 9200b1e, reversing changes made to e42bbba.

For running tests to narrow down the root cause of: https://tracker.ceph.com/issues/49237

Signed-off-by: Michael Fritch <mfritch@suse.com>
If we are in a container, then we do not have a unique pid, and need to use a random nonce. We normally detect this if our pid is 1, but that doesn't work when we have an init process--we'll (probably?) have a small pid (in my tests, the OSDs were getting pid 7). To be safe, also check for an environment variable set by cephadm.

This avoids problems that arise when we don't have a unique address.

Fixes: https://tracker.ceph.com/issues/49534
Signed-off-by: Sage Weil <sage@newdream.net>
This ensures that daemon messenger nonces don't collide by using PIDs that are no longer unique for the IP address.

Signed-off-by: Sage Weil <sage@newdream.net>
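To illustrate why a repeated pid is a problem, here is a toy model of an address as (ip, port, nonce). The `Addr` struct is a simplified, hypothetical stand-in rather than Ceph's real address type, and the scenario (two daemon incarnations on the same IP and port, each seeing pid 7 inside its own container) is an assumption made for the example.

```cpp
#include <cstdint>
#include <string>
#include <tuple>

// Simplified stand-in for a messenger address; hypothetical, for illustration.
struct Addr {
  std::string ip;
  uint16_t port;
  uint64_t nonce;

  // Two addresses are the same endpoint identity only if all fields match.
  bool operator==(const Addr& o) const {
    return std::tie(ip, port, nonce) == std::tie(o.ip, o.port, o.nonce);
  }
};
```

With pid-derived nonces, `Addr{"10.0.0.5", 6800, 7}` from one container incarnation compares equal to `Addr{"10.0.0.5", 6800, 7}` from the next, so peers cannot tell them apart; a random nonce makes the two differ with overwhelming probability.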
@sebastian-philipp Rook runs the ceph daemons as PID 1, so I believe we're actually fine unless there is another implication I'm missing.
Interesting! I thought Rook uses tini. Just make sure the MGR is not accumulating zombies.
Yes, in the past we were using tini, but it was removed since the conversion to take the rook container out of the ceph daemon pods. Does ceph not handle them as pid 1? If not, which daemons would this affect? I imagine the mgr with the modules would be mostly affected, although I haven't heard of any regression in this regard since that change.
you shouldn't be able to get coredumps though.
@travisn How do you get coredumps from the daemons when they crash, without a parent init process?
Each container still has a parent from the host namespace, which is the container engine process. So we can get coredumps normally just like any daemons most of the time in
Sounds like the missing coredumps issue on the cephadm side is a "feature" of podman, then.
@sebastian-philipp That's in a rook cluster? Could you open a rook issue with any details you observed or repro steps? thanks
cephadm |
We normally detect that we're in a container by checking whether our pid is 1, but that doesn't work when we have an init process: our pid will be small (e.g., 7) but not 1. Ensure we choose a random nonce in such situations.