osd: fix the scrubber behavior on multiple preemption attempts #39145

Merged
neha-ojha merged 2 commits into ceph:master from ronen-fr:wip-ronenf-scrub-48793
Feb 3, 2021
Conversation

@ronen-fr
Contributor

Latest scrub code creates a time window in which a specific scrub
is marked as "preempted", but future preemptions are prohibited.
Write operations handled are then blocked but not restarted on time.

Fixes: https://tracker.ceph.com/issues/48793

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
@github-actions github-actions bot added the core label Jan 28, 2021
@ronen-fr
Contributor Author

jenkins retest this please

@ronen-fr ronen-fr marked this pull request as ready for review January 28, 2021 17:38
void PgScrubber::add_delayed_scheduling()
{
  m_end = m_start; // not blocking any range now
Contributor
I don't understand this one. Perhaps this should be set when we complete a chunk?

Contributor Author
Yes to your question: this is what we would have done had we been able to complete the chunk. But as we were preempted:

There are two paths to preemption:

  1. Locally, which means we are either in state BuildMap or in WaitReplicas:
    we won't be scrubbing anything, but we still hold the object range.
    Setting 'end' to 'start' below (l. 591) frees that block. It is also solved by the other change - we no longer block repeated preemption attempts -
    but that path is slower.

  2. A replica might signal, in its returning message, that it was preempted (and will not be providing a map).
    The primary may still be working on creating its own map.
    We then notice the preemption, and while we
    (unlike the original code) do not retry selecting a range immediately (we wait for all replicas, then spend time requeuing
    via PendingTimer), at least we allow multiple preemptions.
    And - once we get to PendingTimer - there is no reason for us to maintain our hold on the old chunk.
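The failure mode being discussed can be sketched in a few lines. This is a loose, hypothetical model with made-up names - not the actual Ceph code - showing the point of the fix: a preemption request must still be accepted while an earlier one is pending, so the write op behind it is not left blocked.

```cpp
#include <atomic>
#include <cassert>

// Loose sketch of the scrubber's preemption flags. The bug was a window
// in which 'preempted' was already set but further preemption requests
// were rejected; writes that triggered those rejected requests stayed
// blocked and were not requeued on time.
struct PreemptionSketch {
  std::atomic<bool> m_preemptable{true};  // scrub stage allows preemption
  std::atomic<bool> m_preempted{false};   // a preemption is pending

  // Request preemption on behalf of a blocked write op.
  // Idempotent: repeated requests while one is pending still succeed.
  bool do_preempt() {
    if (!m_preemptable)
      return false;      // non-preemptable stage: caller must wait
    m_preempted = true;  // setting it again is harmless
    return true;
  }
};
```

Under this sketch, second and later do_preempt() calls return true instead of leaving their callers blocked.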

Contributor
> I don't understand this one. Perhaps this should be set when we complete a chunk?

m_start = m_end would be to complete a chunk

m_end = m_start when aborting/preempting the current chunk to stop blocking any range [m_start, m_end)
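The two assignments can be illustrated with a toy half-open range (illustrative names only, not the Ceph implementation):

```cpp
#include <cassert>
#include <string>

// Toy model of the blocked object range [m_start, m_end).
// An empty range (m_start == m_end) blocks nothing.
struct ScrubRangeSketch {
  std::string m_start;
  std::string m_end;

  bool blocks(const std::string& obj) const {
    return m_start <= obj && obj < m_end;
  }
  void complete_chunk() { m_start = m_end; }  // advance past the chunk
  void abort_chunk()    { m_end = m_start; }  // preempted: free the range
};
```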


// signal the preemption
preemption_data.do_preempt();
m_end = m_start; // free the range we were scrubbing
Contributor
Is this the only caller of do_preempt()?

Contributor Author
See previous comment.

@ronen-fr ronen-fr force-pushed the wip-ronenf-scrub-48793 branch from a864dc4 to 4cdb293 Compare January 29, 2021 09:52
@ronen-fr ronen-fr requested a review from athanatos January 29, 2021 10:10
@ronen-fr
Contributor Author

https://pulpito.ceph.com/rfriedma-2021-01-28_20:14:16-rados-wip-ronenf-scrub-48793-distro-basic-smithi/
A small test run (filtered with 'thrash'). The one failure does not seem related.

Waiting for longer tests.

// otherwise - write requests arriving while 'already preempted' is set
// but 'preemptable' is not - will not be allowed to continue, and will
// not be requeued on time.
return false;
Contributor
It feels dangerous to ignore the m_start -> m_end blocking range, but I assume that we are guaranteeing that the current scrub chunk will be aborted. Somewhere, m_end = m_start might happen before eventually restarting that chunk, or a smaller chunk at m_start.

Contributor Author
Will verify. Thanks.

Contributor Author
I don't see a scenario in which modifying 'end' in that direction (i.e. to be equal to 'start') causes a problem.
The current chunk will be aborted, and until that happens,
even the backend building of the local map uses its own copy of the start and end markers.
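That decoupling can be pictured with a small sketch (hypothetical names, not the real code): the in-flight map build snapshots its boundaries when the chunk starts, so resetting the primary's 'end' marker cannot affect it.

```cpp
#include <cassert>
#include <string>

// Sketch: the backend scanning a chunk keeps private copies of the
// boundaries taken when the chunk started, so the primary resetting
// its own m_end (on preemption) cannot change an in-flight scan.
struct ChunkScanSketch {
  std::string start, end;  // snapshots, not references
  ChunkScanSketch(const std::string& s, const std::string& e)
      : start(s), end(e) {}
  bool covers(const std::string& obj) const {
    return start <= obj && obj < end;
  }
};
```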

@neha-ojha
Member

neha-ojha commented Feb 3, 2021

https://pulpito.ceph.com/nojha-2021-02-01_21:31:14-rados-wip-39145-distro-basic-smithi/ - no related failures or out of order issues in this run

@neha-ojha
Member

@dzafman @athanatos This PR fixes a bug which causes multiple dead jobs in every rados run. If you are happy with this version, I'd like to merge it.

@neha-ojha neha-ojha merged commit 0dbc0b6 into ceph:master Feb 3, 2021
@dzafman (Contributor) left a comment
LGTM
