osd: add osd_fast_shutdown_notify_mon option (default false)#38909
liewegas merged 2 commits into ceph:master from mfoliveira:osd_fast_shutdown_notify_mon
@neha-ojha @jdurgin @tchaikov The author is asking for this to be backported (all the way back to nautilus). Does it need a release note?
Is this something I can do, say, adding a note to Thanks
Yes, I think so. At least, it would make sense to me if you added a commit for that to this PR.
The osd_fast_shutdown option may cause the cluster log to receive too many entries of 'osd.X reported immediately failed by osd.Y', depending on cluster scale.

This might be an issue for LMA stacks/tools that check ceph logs for failed lines, and then require additional logic to filter on an intended OSD (fast) shutdown; might not be an option/possible, and require an admin to analyze.

So, add osd_fast_shutdown_notify_mon option for OSD to also tell the monitor it is shutting down (done in slow/non-fast shutdown) under osd_fast_shutdown.

This introduces minimal delay (the ack from the mon is required to prevent the messages), and addresses the cluster log issue.

Note: the osd_mon_shutdown_timeout option can be used to control the maximum amount of time waiting for the monitor ack to arrive.

Fixes: http://tracker.ceph.com/issues/46978
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
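The interaction of the three options described in the commit message can be sketched as below. This is only an illustration of the described behavior, not the code in `OSD::shutdown()`: the `ShutdownConfig` struct, the `ShutdownPath` enum, both functions, and the `wait_for_mon_ack` callback are hypothetical names, and the sketch assumes the slow path waits for the ack under the same timeout.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical stand-in for the three options named in the commit message.
struct ShutdownConfig {
  bool osd_fast_shutdown;
  bool osd_fast_shutdown_notify_mon;
  std::chrono::seconds osd_mon_shutdown_timeout;
};

enum class ShutdownPath {
  FastSilent,    // exit at once; peers end up reporting the OSD as failed
  FastNotified,  // tell the mon, wait (bounded) for its ack, then exit at once
  Slow           // regular shutdown, which already tells the mon
};

// Pick the path implied by the two boolean options.
ShutdownPath choose_shutdown_path(const ShutdownConfig& cfg) {
  if (!cfg.osd_fast_shutdown) {
    return ShutdownPath::Slow;
  }
  return cfg.osd_fast_shutdown_notify_mon ? ShutdownPath::FastNotified
                                          : ShutdownPath::FastSilent;
}

// wait_for_mon_ack is a hypothetical callback: it should block for at most the
// given timeout and return true if the monitor's ack arrived in time.
// Returns true when the mon acknowledged the shutdown.
bool run_shutdown(
    const ShutdownConfig& cfg,
    const std::function<bool(std::chrono::seconds)>& wait_for_mon_ack) {
  switch (choose_shutdown_path(cfg)) {
    case ShutdownPath::FastSilent:
      return false;  // the mon is never contacted on this path
    case ShutdownPath::FastNotified:
    case ShutdownPath::Slow:
      // Assumption: both notifying paths bound the wait with the same timeout.
      return wait_for_mon_ack(cfg.osd_mon_shutdown_timeout);
  }
  return false;
}
```

With `osd_fast_shutdown_notify_mon` left at `false`, the sketch never invokes the callback, which mirrors the "otherwise it's a nop" behavior the author describes for the default.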
Let's add the ``osd_fast_shutdown_notify_mon`` option to PendingReleaseNotes so it is documented. Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Done; thanks!
The failing checks don't seem to be related to the changes in this PR;
@smithfarm hey, sorry to bother -- just following up -- is the current timeframe not the best for reviews, or maybe this just has to wait longer because it's lower priority/impact? Thank you!
We are currently focusing on our upcoming release, Pacific. We'll review this PR next week. Thanks.
jenkins test make check
Understood; thanks for clarifying!
jenkins retest this please
@liewegas do you see any issues with this option (given that it is off by default)?
seems like this should be the default to me |
neha-ojha
left a comment
let's start with this as a non-default option
@@ -4154,6 +4154,8 @@ int OSD::shutdown()
 {
   if (cct->_conf->osd_fast_shutdown) {
     derr << "*** Immediate shutdown (osd_fast_shutdown=true) ***" << dendl;
A quick suggestion: log the three relevant values: osd_fast_shutdown, osd_fast_shutdown_notify_mon, and osd_mon_shutdown_timeout here. Perhaps even shift this outside of the osd_fast_shutdown clause.
When reading the logs, it would be helpful to quickly know which shutdown scenario applies. This could also help catch instances where settings were modified between the logged issue and the sosreport capture.
That said, I don't expect these settings will be adjusted frequently. There is also enough logging elsewhere to deduce which settings were applied at run-time. So, I'm just leaving this as a suggested improvement.
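A minimal sketch of the suggested log line follows. The helper `describe_shutdown_settings` and its plain-value parameters are hypothetical; in the OSD itself the values would come from the config object and be written through the existing `derr`/`dendl` logging shown in the diff.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: render the three shutdown-related settings on one line
// so a log reader can tell at a glance which shutdown scenario applied.
std::string describe_shutdown_settings(bool fast_shutdown,
                                       bool notify_mon,
                                       double mon_shutdown_timeout_secs) {
  std::ostringstream os;
  os << std::boolalpha  // print booleans as true/false rather than 1/0
     << "osd_fast_shutdown=" << fast_shutdown
     << " osd_fast_shutdown_notify_mon=" << notify_mon
     << " osd_mon_shutdown_timeout=" << mon_shutdown_timeout_secs;
  return os.str();
}
```

Emitting this string before the `osd_fast_shutdown` check, as the reviewer proposes, would record the settings for the slow path as well.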
Per
Can someone please confirm it's OK to go ahead with backports of this down to Nautilus? Thanks!
I'd like to enable this option because I found some I/O delays possibly caused by this problem. Is there any potential risk in enabling it? In other words, why is this option still false by default? It seems that notifying the mon on a voluntary OSD shutdown should be done unconditionally.
Yes, we should turn it on by default! Would you like to open a PR for it?
@neha-ojha Got it. I'll do it soon.
Please link this ticket https://tracker.ceph.com/issues/53329 in your commit for backporting purposes.
I'd already created #44016, which has a link to issue 53328. Should I close issue 53328 and change the link to issue 53329?
No need, I merged them, thanks!
The `osd_fast_shutdown` option may cause the cluster log to receive too many entries of `osd.X reported immediately failed by osd.Y`, depending on cluster scale.

This might be an issue for LMA stacks/tools that check ceph logs for failed lines, and then require additional logic to filter on an intended OSD (fast) shutdown; might not be an option/possible, and require an admin to analyze.

So, add `osd_fast_shutdown_notify_mon` option for OSD to also tell the monitor it is shutting down (done in slow/non-fast shutdown) under `osd_fast_shutdown`.

This introduces minimal delay (the ack from the mon is required to prevent the messages), and addresses the cluster log issue.

Note: the `osd_mon_shutdown_timeout` option can be used to control the maximum amount of time waiting for the monitor ack to arrive.

Fixes: http://tracker.ceph.com/issues/46978
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>

Testing

Note: for testing the option value has been changed to `true`, so to exercise the code path difference (otherwise it's a nop).

This passed `run-make-check.sh` (transient failures cleared w/ reruns: `unittest_bluefs` and `unittest_bluefs`).

And `qa/run-standalone.sh` reported 3 apparently unrelated failures (not rerun): `osd/osd-bluefs-volume-ops.sh`, `scrub/osd-recovery-scrub.sh`, `scrub/osd-scrub-snaps.sh`

Testing with `vstart.sh` (10 OSDs) shows that without the option (or `false`) the mon log gets multiple reports per OSD, and with the option (`true`) it remains quiet, as in the slow/non-fast shutdown case, as expected.

`osd_fast_shutdown_notify_mon = false`:

`osd_fast_shutdown_notify_mon = true`:
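One way to quantify the "multiple reports per OSD" observation is to count the matching lines per reported OSD, as in the sketch below. It assumes only the message text quoted in the description (`osd.X reported immediately failed by osd.Y`); the function name is hypothetical, and any timestamp or log-level prefix on a real mon log line is simply skipped rather than parsed.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Count log lines containing "osd.X reported immediately failed by osd.Y",
// keyed by the reported OSD name ("osd.X").
std::map<std::string, int> count_immediate_failure_reports(
    const std::vector<std::string>& lines) {
  static const std::string marker = " reported immediately failed by ";
  std::map<std::string, int> counts;
  for (const auto& line : lines) {
    const auto pos = line.find(marker);
    if (pos == std::string::npos || pos == 0) {
      continue;  // not a failure report, or nothing precedes the marker
    }
    // The reported OSD is the whitespace-delimited token just before the marker.
    const auto space = line.rfind(' ', pos - 1);
    const auto begin = (space == std::string::npos) ? 0 : space + 1;
    const std::string name = line.substr(begin, pos - begin);
    if (name.compare(0, 4, "osd.") == 0) {
      ++counts[name];
    }
  }
  return counts;
}
```

Under the behavior described above, a run with the option set to `false` would be expected to yield counts greater than one for each stopped OSD, and a run with `true` an empty map.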