osd: add osd_fast_shutdown_notify_mon option (default false) by mfoliveira · Pull Request #38909 · ceph/ceph

mfoliveira · 2021-01-14T17:28:06Z

The osd_fast_shutdown option may cause the cluster log to receive
too many 'osd.X reported immediately failed by osd.Y' entries,
depending on cluster scale.

This can be a problem for LMA stacks/tools that scan Ceph logs for
'failed' lines: they need additional logic to filter out an
intended (fast) OSD shutdown, which may not be possible, leaving
an admin to analyze the entries.

So, add the osd_fast_shutdown_notify_mon option so that, under
osd_fast_shutdown, the OSD also tells the monitor it is shutting
down (as is already done in the slow/non-fast shutdown).

This introduces a minimal delay (the OSD waits for the ack from
the mon, which is required to prevent the messages), and addresses
the cluster log issue.
Note: the osd_mon_shutdown_timeout option can be used to control
the maximum amount of time spent waiting for the monitor ack.

Fixes: http://tracker.ceph.com/issues/46978
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
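
As an illustration of how the option described above might be turned on for all OSDs, here is a minimal sketch. It assumes the cluster uses the centralized configuration database (`ceph config set`); the helper name `enable_fast_shutdown_notify_mon` and its optional timeout argument are hypothetical and not part of this PR.

```shell
# Sketch only: enable monitor notification for OSDs that use fast shutdown.
# Assumes the centralized config database (`ceph config set`) is in use.
# The function name and its argument are illustrative, not from the PR.
enable_fast_shutdown_notify_mon() {
    ack_timeout="${1:-}"
    # Ask OSDs to tell the monitor they are shutting down.
    ceph config set osd osd_fast_shutdown_notify_mon true || return 1
    # Optionally bound how long an OSD waits for the monitor ack (seconds).
    if [ -n "$ack_timeout" ]; then
        ceph config set osd osd_mon_shutdown_timeout "$ack_timeout" || return 1
    fi
}
```

Calling `enable_fast_shutdown_notify_mon` leaves the timeout untouched, while `enable_fast_shutdown_notify_mon 5` also sets osd_mon_shutdown_timeout (the value 5 is only an example).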

Testing

Note: for testing, the option's default value was changed to true in order to exercise the new code path (otherwise the change is a no-op).

This passed run-make-check.sh (transient failures cleared w/ reruns: unittest_bluefs and unittest_bluefs).

And qa/run-standalone.sh reported 3 apparently unrelated failures (not rerun): osd/osd-bluefs-volume-ops.sh, scrub/osd-recovery-scrub.sh, scrub/osd-scrub-snaps.sh

Testing with vstart.sh (10 OSDs) shows that without the option (or with it set to false) the mon log gets multiple failure reports per OSD, whereas with the option set to true it remains quiet, as expected and as in the slow/non-fast shutdown case.

Before / or with osd_fast_shutdown_notify_mon = false:

osd log:

2021-01-09T18:59:52.448+0000 7f937fcdc700 -1 received  signal: Terminated from -bash  (PID: 408) UID: 1000
2021-01-09T18:59:52.448+0000 7f937fcdc700 -1 osd.2 22 *** Got signal Terminated ***
2021-01-09T18:59:52.448+0000 7f937fcdc700 -1 osd.2 22 *** Immediate shutdown (osd_fast_shutdown=true) ***

mon log:

$ cat out/mon.a.log | grep '^2021-01-09T18:59:' | grep 'osd.0 reported immediately failed by osd.' | rev | cut -d: -f1 | rev | sort | uniq -c
      4  osd.0 reported immediately failed by osd.1
      4  osd.0 reported immediately failed by osd.2
      4  osd.0 reported immediately failed by osd.3
      4  osd.0 reported immediately failed by osd.4
      4  osd.0 reported immediately failed by osd.5
      4  osd.0 reported immediately failed by osd.6
      4  osd.0 reported immediately failed by osd.7
      4  osd.0 reported immediately failed by osd.8
      4  osd.0 reported immediately failed by osd.9
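
The pipeline above covers a single target OSD and a fixed timestamp prefix. As a sketch of a more general check, the helper below counts reports for every (failed OSD, reporting OSD) pair in a log file. The function name `count_failure_reports` is hypothetical, and it assumes the message text is exactly the 'osd.X reported immediately failed by osd.Y' form shown above.

```shell
# Sketch only: count 'reported immediately failed' entries per
# (failed OSD, reporting OSD) pair in a mon log file.
# Assumes the message text matches the form shown in the output above.
count_failure_reports() {
    grep -o 'osd\.[0-9][0-9]* reported immediately failed by osd\.[0-9][0-9]*' "$1" \
        | sort | uniq -c
}
```

For example, `count_failure_reports out/mon.a.log` would print one counted line per pair, in the same shape as the `uniq -c` output above.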

After / with osd_fast_shutdown_notify_mon = true:

osd log:

2021-01-14T17:21:10.825+0000 7feceded1700 -1 received  signal: Terminated from -bash  (PID: 1750) UID: 1000
2021-01-14T17:21:10.825+0000 7feceded1700 -1 osd.0 80 *** Got signal Terminated ***
2021-01-14T17:21:10.825+0000 7feceded1700 -1 osd.0 80 *** Immediate shutdown (osd_fast_shutdown=true) ***
2021-01-14T17:21:10.825+0000 7feceded1700  0 osd.0 80 prepare_to_stop telling mon we are shutting down
...
2021-01-14T17:21:11.021+0000 7fecdac0c700  0 osd.0 80 got_stop_ack starting shutdown
2021-01-14T17:21:11.021+0000 7feceded1700  0 osd.0 80 prepare_to_stop starting shutdown

mon log:

2021-01-14T17:21:10.829+0000 7f62fce61700  0 log_channel(cluster) log [INF] : osd.0 marked itself down
2021-01-14T17:21:10.885+0000 7f62ff666700  1 mon.a@0(leader).osd e80 do_prune osdmap full prune enabled
2021-01-14T17:21:10.889+0000 7f62ff666700  0 log_channel(cluster) log [WRN] : Health check failed: 1 osds down (OSD_DOWN)
2021-01-14T17:21:10.957+0000 7f62fb65e700  1 mon.a@0(leader).osd e81 e81: 10 total, 9 up, 10 in
2021-01-14T17:21:11.013+0000 7f62fb65e700  0 log_channel(cluster) log [DBG] : osdmap e81: 10 total, 9 up, 10 in

Checklist

References tracker ticket
Updates documentation if necessary
Includes tests for new functionality or reproducer for bug

smithfarm · 2021-01-18T08:41:44Z

@neha-ojha @jdurgin @tchaikov The author is asking for this to be backported (all the way back to nautilus). Does it need a release note?

mfoliveira · 2021-01-26T14:17:16Z

@smithfarm

> @neha-ojha @jdurgin @tchaikov The author is asking for this to be backported (all the way back to nautilus). Does it need a release note?

Is this something I can do, say, adding a note to PendingReleaseNotes ?

Thanks

smithfarm · 2021-01-26T15:05:21Z

> Is this something I can do, say, adding a note to PendingReleaseNotes ?

Yes, I think so. At least, it would make sense to me if you added a commit for that to this PR.

mfoliveira · 2021-01-26T16:42:37Z

@smithfarm

> > Is this something I can do, say, adding a note to PendingReleaseNotes ?
>
> Yes, I think so. At least, it would make sense to me if you added a commit for that to this PR.

Done; thanks!

mfoliveira · 2021-01-29T16:12:03Z

The failing checks don't seem to be related to the changes in this PR;
for ceph API tests the errors mention dashboard, and for make check
the log is no longer available, but the changes previously passed locally.

mfoliveira · 2021-02-05T22:01:59Z

@smithfarm hey, sorry to bother -- just following up -- is the current timeframe not the best for reviews, or maybe this just has to wait longer because it's lower priority/impact? Thank you!

neha-ojha · 2021-02-06T01:12:50Z

> @smithfarm hey, sorry to bother -- just following up -- is the current timeframe not the best for reviews, or maybe this just has to wait longer because it's lower priority/impact? Thank you!

We are currently focusing on our upcoming release, Pacific. We'll review this PR next week. Thanks.

neha-ojha · 2021-02-06T01:13:05Z

jenkins test make check

mfoliveira · 2021-02-08T13:34:20Z

@neha-ojha

> We are currently focusing on our upcoming release, Pacific. We'll review this PR next week. Thanks.

Understand; thanks for clarifying!

neha-ojha · 2021-02-12T01:43:31Z

jenkins retest this please

neha-ojha · 2021-02-12T16:55:01Z

@liewegas do you see any issues with this option (given that it is off by default)?

jdurgin · 2021-02-24T05:24:20Z

seems like this should be the default to me

neha-ojha

let's start with this as a non-default option

hillpd · 2021-02-25T18:01:42Z

src/osd/OSD.cc

@@ -4154,6 +4154,8 @@ int OSD::shutdown()
 {
  if (cct->_conf->osd_fast_shutdown) {
    derr << "*** Immediate shutdown (osd_fast_shutdown=true) ***" << dendl;


A quick suggestion: log the three relevant values: osd_fast_shutdown, osd_fast_shutdown_notify_mon, and osd_mon_shutdown_timeout here. Perhaps even shift this outside of the osd_fast_shutdown clause.

When reading the logs, it would be helpful to quickly know which shutdown scenario applies. This could also help catch instances where settings were modified between the logged issue and the sosreport capture.

That said, I don't expect these settings will be adjusted frequently. There is also enough logging elsewhere to deduce which settings were applied at run-time. So, I'm just leaving this as a suggested improvement.
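
Independently of any logging change, the three settings mentioned above can be inspected from the command line. The sketch below assumes the centralized configuration database is in use and that `ceph config get` is available; the helper name `show_shutdown_settings` is hypothetical. Note that it reads the configuration database, which may differ from the values a daemon was actually started with.

```shell
# Sketch only: print the three shutdown-related settings for one OSD.
# Assumes `ceph config get <who> <option>` is available; the function
# name is illustrative and not part of Ceph.
show_shutdown_settings() {
    osd="$1"
    for opt in osd_fast_shutdown osd_fast_shutdown_notify_mon osd_mon_shutdown_timeout; do
        printf '%s = %s\n' "$opt" "$(ceph config get "$osd" "$opt")"
    done
}
```

A call such as `show_shutdown_settings osd.0` prints one `option = value` line per setting.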

mfoliveira · 2021-03-05T18:25:29Z

Per SubmittingPatches-backports.rst:

> Once the master PR has been merged, after checking that the change really needs to be backported (...)

Can someone please confirm it's OK to go ahead with backports of this down to Nautilus? Thanks!

satoru-takeuchi · 2021-11-17T21:51:03Z

@neha-ojha @liewegas

> let's start with this as a non-default option

I'd like to enable this option because I found some I/O delays possibly caused by this problem. Is there any potential risk in enabling this option? In other words, why is this option still false by default? It looks like notifying the mon on a voluntary OSD shutdown should be done unconditionally.

neha-ojha · 2021-11-18T15:36:36Z

> @neha-ojha @liewegas
>
> > let's start with this as a non-default option
>
> I'd like to enable this option because I found some I/O delays possibly caused by this problem. Is there any potential risk in enabling this option? In other words, why is this option still false by default? It looks like notifying the mon on a voluntary OSD shutdown should be done unconditionally.

Yes, we should turn it on by default! Would you like to open a PR for it?

satoru-takeuchi · 2021-11-18T20:18:27Z

@neha-ojha Got it. I'll do it soon.

neha-ojha · 2021-11-18T23:11:44Z

> @neha-ojha Got it. I'll do it soon.

Please link this ticket https://tracker.ceph.com/issues/53329 in your commit for backporting purposes.

satoru-takeuchi · 2021-11-19T01:12:09Z

I'd already created #44016, which links to issue 53328. Should I close issue 53328 and change the link to issue 53329?

neha-ojha · 2021-11-19T01:20:00Z

> I'd already created #44016, which links to issue 53328. Should I close issue 53328 and change the link to issue 53329?

No need, I merged them, thanks!

github-actions bot added common core labels Jan 14, 2021

smithfarm requested review from jdurgin, neha-ojha and tchaikov January 18, 2021 08:41

Mauricio Faria de Oliveira added 2 commits January 26, 2021 12:56

PendingReleaseNotes: document option osd_fast_shutdown_notify_mon

7f5aaef

Let's add the ``osd_fast_shutdown_notify_mon`` option to PendingReleaseNotes so it is documented. Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>

github-actions bot added the documentation label Jan 26, 2021

neha-ojha approved these changes Feb 25, 2021

neha-ojha added the needs-qa label Feb 25, 2021

neha-ojha requested a review from liewegas February 25, 2021 00:57

hillpd reviewed Feb 25, 2021

liewegas added wip-sage2-testing wip-sage4-testing and removed wip-sage2-testing labels Mar 1, 2021

liewegas merged commit 5290ed3 into ceph:master Mar 3, 2021

mfoliveira mentioned this pull request Mar 9, 2021

pacific: osd: add osd_fast_shutdown_notify_mon option (default false) #39957

Merged

This was referenced Mar 10, 2021

octopus: osd: add osd_fast_shutdown_notify_mon option (default false) #40013

Merged

nautilus: osd: add osd_fast_shutdown_notify_mon option (default false) #40014

Merged
