osd/OSD: osd_fast_shutdown_notify_mon not quite right #44807
yuriw merged 2 commits into ceph:master
93378e3 to 7f71ae1
Changed to also check osd_fast_shutdown, in addition to osd_fast_shutdown_notify_mon, before marking the OSD as dead.
4b675f6 to 2c5d957
jdurgin left a comment
This looks right, however it seems set_state(STOPPING); should precede the MarkMeDead message, since that's what stops us from accepting new connections and processing new messages.
Since the is_stopping_cond is waiting with a timeout, STATE_STOPPING may not be set yet at this point.
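The ordering concern raised in this review can be sketched as follows. This is a minimal, self-contained illustration and not the actual OSD code: `ToyOsd`, `State`, `handle_incoming`, and `shutdown` are hypothetical names, and "sending" a message is modeled as appending a string to a vector. It only shows that when the stopping state is set before the mark-me-dead notification goes out, anything arriving afterward is rejected.

```cpp
#include <atomic>
#include <string>
#include <vector>

// Hypothetical stand-in for the OSD state machine; not Ceph's real classes.
enum class State { Active, Stopping };

struct ToyOsd {
  std::atomic<State> state{State::Active};
  std::vector<std::string> sent;      // messages sent to the (imaginary) mon
  std::vector<std::string> accepted;  // incoming messages that were processed

  // Incoming work is dropped once the OSD is stopping.
  bool handle_incoming(const std::string& msg) {
    if (state.load() == State::Stopping) return false;
    accepted.push_back(msg);
    return true;
  }

  // Set the stopping state first, then notify the mon.
  void shutdown() {
    state.store(State::Stopping);
    sent.push_back("MarkMeDead");
  }
};
```

With this order, a message that races with the shutdown is refused rather than processed after the mon has already been told the OSD is dead.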
2c5d957 to c2892ca
I tested the functionality of this patch.
Note also @mlausch's comment in the tracker. I see two options:
@jdurgin? @neha-ojha?
@ronen-fr I'd suggest the 1st approach - waiting for the ack - to keep it simple. The speed of shutting down is mainly a concern during an upgrade. It's worse to not get marked down and affect client I/O than to shut down a little more slowly to be sure the message gets to the mon.
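The "wait for the ack" approach can be illustrated with a bounded wait on a condition variable. This is only a sketch under stated assumptions: `AckWaiter`, `notify_ack`, and `wait_for_ack` are hypothetical names that do not come from the Ceph source, and the timeout value is up to the caller. It shows the trade-off discussed here: shutdown blocks until the mon acknowledges, but never longer than the timeout.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper; not part of Ceph.
class AckWaiter {
 public:
  // Called by whoever receives the mon's acknowledgement.
  void notify_ack() {
    {
      std::lock_guard<std::mutex> lk(m_);
      acked_ = true;
    }
    cv_.notify_all();
  }

  // Block until the ack arrives or the timeout expires.
  // Returns true if the ack was seen, false on timeout.
  bool wait_for_ack(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lk(m_);
    return cv_.wait_for(lk, timeout, [this] { return acked_; });
  }

 private:
  std::mutex m_;
  std::condition_variable cv_;
  bool acked_ = false;
};
```

A caller would send the notification, call `wait_for_ack` with a short timeout, and continue shutting down either way; the return value only tells it whether the mon confirmed in time.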
@NitzanMordhai pls add
I agree that waiting for an ack mechanism is probably the better way to go. Also,
I think we should change that default - there's very little impact outside of an edge case when mons are unresponsive.
Agreed, how about we cherry-pick the commit from #44016 in this PR and test them together? @satoru-takeuchi does that sound good to you?
@neha-ojha Yes, sounds great.
Thanks @satoru-takeuchi! @NitzanMordhai please cherry-pick bf4d358 into this PR and keep the original |
c2892ca to 44761f2
44761f2 to 985a90b
Hi @NitzanMordhai. I tried out the current code. The mon skips the new message type, because it seems an OSD is not allowed to send it.
Works for me as well.
src/osd/OSD.cc
    whoami,
    osdmap->get_addrs(whoami),
    osdmap->get_epoch(),
    true // request ack
Is there a reason to not mark the osd dead during a graceful shutdown?
I don't think there is any reason not to; I couldn't find any.
Let's make it unconditional then.
@neha-ojha can you add a backport label for pacific as well?
The associated tracker has pacific and quincy in the backport field, https://tracker.ceph.com/issues/53327#note-3. I added the additional
When osd_fast_shutdown and osd_fast_shutdown_notify_mon are both set to true, the OSD is marked as Down, but it should be marked as Dead. Fixed: https://tracker.ceph.com/issues/53327 Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>
8d65ce8 to 07302d5
Failures, unrelated, tracked by:
Other unrelated failures include http://pulpito.front.sepia.ceph.com/yuriw-2022-03-24_16:44:32-rados-wip-yuri-testing-2022-03-24-0726-distro-default-smithi/6758316/, which is a rook test, and http://pulpito.front.sepia.ceph.com/yuriw-2022-03-24_16:44:32-rados-wip-yuri-testing-2022-03-24-0726-distro-default-smithi/6758319/, an upgrade test. http://pulpito.front.sepia.ceph.com/yuriw-2022-03-24_16:44:32-rados-wip-yuri-testing-2022-03-24-0726-distro-default-smithi/6758455/ has been analyzed by Sridhar and was deemed unrelated.
Modify test_activate_osd() to get the type of scheduler in use and then verify the value of osd_max_backfills. This is because mclock scheduler overrides this option to 1000 upon OSD initialization. The test earlier used to pass because the OSD daemon was killed but not marked down and upon being brought up, the wait for OSD up check was passing quickly. But the OSD still didn't have the latest config values. But now upon killing the OSD, the osd_fast_shutdown sequence notifies the mon (see PR: ceph#44807) and is marked down and dead. Upon bringing it up, the wait for OSD up check takes a longer time and this is sufficient for the config values to be updated. This results in the correct values being read from the config 'Values' map. Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
Modify test_activate_osd() to get the type of scheduler in use and then verify the value of osd_max_backfills. This is because mclock scheduler overrides this option to 1000 upon OSD initialization. The test earlier used to pass because the OSD daemon was killed but not marked down and upon being brought up, the wait for OSD up check was passing quickly. But the OSD still didn't have the latest config values. But now upon killing the OSD, the osd_fast_shutdown sequence notifies the mon (see PR: ceph#44807) and is marked down and dead. Upon bringing it up, the wait for OSD up check takes a longer time and this is sufficient for the config values to be updated. This results in the correct values being read from the config 'Values' map. Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com> (cherry picked from commit 3aa2df2)
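The check described in this commit message boils down to choosing the expected osd_max_backfills value from the scheduler in use. The snippet below is a rough C++ illustration of that decision only, not the test's real code: `expected_max_backfills` is a hypothetical helper, the scheduler name strings are assumptions, and the value 1000 is taken from the commit message.

```cpp
#include <string>

// Hypothetical helper; not taken from the Ceph test suite.
// `scheduler` is the name of the op queue scheduler in use (assumed strings),
// `configured` is the osd_max_backfills value the test set itself.
int expected_max_backfills(const std::string& scheduler, int configured) {
  // Per the commit message, the mclock scheduler overrides the option to 1000.
  if (scheduler == "mclock_scheduler") return 1000;
  // Any other scheduler is assumed to leave the configured value untouched.
  return configured;
}
```

A test would then compare the value read back from the running OSD against this expectation instead of against a single hard-coded number.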
When osd_fast_shutdown and osd_fast_shutdown_notify_mon are both set to true, the OSD is marked as Down,
but it should be marked as Dead.
Fixed: https://tracker.ceph.com/issues/53327
Signed-off-by: Nitzan Mordechai <nmordech@redhat.com>