pacific: mon/OSDMonitor: Added extra check before mon.go_recovery_stretch_mode() #48803
Added bug reproducer for https://bugzilla.redhat.com/show_bug.cgi?id=2104207 and added more logs in MON. Signed-off-by: Kamoltat <ksirivad@redhat.com> (cherry picked from commit 62fe3cb)
Problem: There are certain scenarios in a degraded stretched cluster where the monitor will try to go into the function ``Monitor::go_recovery_stretch_mode()``, which leads to a `ceph_assert`. Solution: Make sure ``dead_mon_buckets.size() == 0`` in ``OSDMonitor::update_from_paxos()`` before going into ``Monitor::go_recovery_stretch_mode()``. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2104207 Signed-off-by: Kamoltat <ksirivad@redhat.com> (cherry picked from commit d95c41a)

jenkins test make check

@kamoltat I found a failure that looks related. Can you take a look? /a/yuriw-2022-11-30_15:10:52-rados-wip-yuri3-testing-2022-11-28-0750-pacific-distro-default-smithi/7098562

This one also might be related: /a/yuriw-2022-11-29_15:35:32-rados-wip-yuri3-testing-2022-11-28-0750-pacific-distro-default-smithi/7097000/

And this one: /a/yuriw-2022-11-29_15:35:32-rados-wip-yuri3-testing-2022-11-28-0750-pacific-distro-default-smithi/7096933 It seems like there's a problem when the mons restart in some of the jobs.

@yuriw merged this before I had a chance to investigate (my fault). I had seen some failures I thought were related in a previous test batch, and put it through another batch since I thought there were updates. But somehow this run looked fine... Here is the review: https://pulpito.ceph.com/?branch=wip-yuri2-testing-2022-12-07-0821-pacific Failures, unrelated: Details:

This PR: #47340. We are in the process of fixing it.
…tch_mode()" This commit belongs to ceph#48803 which introduced https://tracker.ceph.com/issues/58239. Therefore, we are reverting it. This reverts commit 94dc970. Signed-off-by: Kamoltat <ksirivad@redhat.com>
This commit belongs to ceph#48803 which introduced https://tracker.ceph.com/issues/58239. Therefore, we are reverting it. This reverts commit 025d3fa. Signed-off-by: Kamoltat <ksirivad@redhat.com>
…tch_mode()" This commit belongs to ceph#48803 which introduced https://tracker.ceph.com/issues/58239. Therefore, we are reverting it. This reverts commit 94dc970. Fixes: https://tracker.ceph.com/issues/58239 Signed-off-by: Kamoltat <ksirivad@redhat.com>
This commit belongs to ceph#48803 which introduced https://tracker.ceph.com/issues/58239. Therefore, we are reverting it. This reverts commit 025d3fa. Fixes: https://tracker.ceph.com/issues/58239 Signed-off-by: Kamoltat <ksirivad@redhat.com>
This commit belongs to ceph/ceph#48803 which introduced https://tracker.ceph.com/issues/58239. Therefore, we are reverting it. This reverts commit 025d3fa. Fixes: https://tracker.ceph.com/issues/58239 Signed-off-by: Kamoltat <ksirivad@redhat.com>
…tch_mode()" This commit belongs to ceph/ceph#48803 which introduced https://tracker.ceph.com/issues/58239. Therefore, we are reverting it. This reverts commit 94dc970. Fixes: https://tracker.ceph.com/issues/58239 Signed-off-by: Kamoltat <ksirivad@redhat.com>
Problem:
There are certain scenarios in a degraded stretched cluster where the monitor will try to go into the function ``Monitor::go_recovery_stretch_mode()``, which leads to a `ceph_assert`.

Solution:
Make sure ``dead_mon_buckets.size() == 0`` in ``OSDMonitor::update_from_paxos()`` before going into ``Monitor::go_recovery_stretch_mode()``.

Fixes: https://tracker.ceph.com/issues/57017

Backporting relevant commits from main PR: #47340

Signed-off-by: Kamoltat <ksirivad@redhat.com>
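
For readers unfamiliar with the code path, the guard described in the Solution can be illustrated with a small standalone sketch. This is not the actual Ceph code: `MockMonitor`, `maybe_go_recovery()`, and the `osds_healthy_enough` flag are hypothetical stand-ins, and the sketch assumes that the assertion inside `go_recovery_stretch_mode()` fires when `dead_mon_buckets` is non-empty, which is what the proposed check suggests.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for the monitor; NOT the real Ceph Monitor class.
struct MockMonitor {
  // Bucket name -> monitors in that bucket that are currently dead.
  std::map<std::string, std::set<std::string>> dead_mon_buckets;
  bool degraded_stretch_mode = true;
  bool recovering_stretch_mode = false;

  void go_recovery_stretch_mode() {
    // Assumed stand-in for the ceph_assert mentioned in the description:
    // recovery must not start while any monitor bucket is still dead.
    assert(dead_mon_buckets.empty());
    recovering_stretch_mode = true;
  }
};

// Hypothetical caller, playing the role of the check added in
// OSDMonitor::update_from_paxos(). Returns true if recovery was entered.
bool maybe_go_recovery(MockMonitor& mon, bool osds_healthy_enough) {
  if (mon.degraded_stretch_mode && osds_healthy_enough &&
      mon.dead_mon_buckets.size() == 0) {  // the extra check from this PR
    mon.go_recovery_stretch_mode();
    return true;
  }
  return false;  // skip recovery instead of tripping the assertion
}
```

With a dead bucket present, `maybe_go_recovery()` returns `false` and never reaches the assertion; once `dead_mon_buckets` is empty, it enters recovery.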