Skip to content

osd: Extracting scrub-related code from the PG & friends#38111

Merged
athanatos merged 10 commits intoceph:masterfrom
ronen-fr:wip-ronenf-scrub-v2
Dec 14, 2020
Merged

osd: Extracting scrub-related code from the PG & friends#38111
athanatos merged 10 commits intoceph:masterfrom
ronen-fr:wip-ronenf-scrub-v2

Conversation

@ronen-fr
Copy link
Contributor

Objectives:

- identify and extract scrub code, as a first step into sharing scrub code between classic
  and Crimson implementations;
- refactor the code as needed for clarity;

Main changes:

- PgScrubber (pg_scrubber.{h,cc}) implements the scrub code that resided in PG.cc. And:

- PrimaryLogScrub.{h,cc} now implement the parts of the scrub code that resided in PrimaryLogPG.
  (note that I didn't see much point in the separation, but changing that was outside the scope of this PR)

- The flags that govern scrub scheduling/operation were separated into two structures: one is maintained
  by the PG, and details the "next scrub" parameters. The other is PgScrubber-owned, and determines the
  operation options for the running scrub. This collection of flags is set (and is "frozen") at
  PG:queue_scrub(), following the logic of the existing code.

- the scrubbing is now implemented as an explicit state-machine.
- and the events that trigger state changes are now translated into separate specific messages, that
  are initiated using specific calls;
- the priorities used when queueing and handling those events were reworked, trying to match the original
  semantics.

- Scrub-related resources (the OSD-owned 'local' and 'remote' scrub resources) are managed using RAII
  wrappers, hopefully guaranteeing no leakage.

Design decisions:

- the new scrub FSM (in scrub_machine.*) does not contain PG-specific code. All code that requires knowledge
  of the PG internals (or PrimaryLogPG internals) is in PgScrubber & PrimaryLogScrub. That, to achieve:
  - easy switching from (the slow) boost::state_chart to something better (I promised the SML author to give
    it a second try, after he has incorporated some needed changes);
  - ease of adaptation to Crimson;
  - clearer separation of concerns.

This PR is a re-structured version of #37317, with the changes logically divided into separate commits for the reviewers' sake.

@dzafman
Copy link
Contributor

dzafman commented Nov 17, 2020

@ronen-fr The qa/standalone/scrub/osd-scrub-test.sh TEST_deep_scrub_abort() test fails on my build machine using this code at bc22448.

@dzafman
Copy link
Contributor

dzafman commented Nov 17, 2020

@neha-ojha @ronen-fr There is a race with setting noscrub or nodeep-scrub, see https://tracker.ceph.com/issues/47767. The PG::abort_scrub() function doesn't account for messages in-flight from scrub replicas. Wondering if you can deal with this better in the state machine version. Coincidentally, the failure above is exactly with the test setting nodeep-scrub.

@dzafman
Copy link
Contributor

dzafman commented Nov 17, 2020

@ronen-fr I started reviewing this version of the code. I'll be out tomorrow (Tue) back the next day.

@ronen-fr
Copy link
Contributor Author

There is a race with setting noscrub or nodeep-scrub, see https://tracker.ceph.com/issues/47767. The PG::abort_scrub() function doesn't account for messages in-flight from scrub replicas. Wondering if you can deal with this better in the state machine version. Coincidentally, the failure above is exactly with the test setting nodeep-scrub.

@dzafman, thanks for the comments.

Seems to me that the two issues are unrelated:

  • there is that race condition you mention. In the new code the chances of this scenario causing a crash is low:  PrimaryLogPG::do_replica_scrub_map() checks is_scrub_active() before calling PgScrubber::map_from_replica().

Now - there is still a theoretical issue if the primary terminated the scrub and started a new one while the message was in-flight. MapsCollectionStatus::mark_arriving_map() will then return 'false', causing map_from_replica()  to assert.  I would suggest removing that specific assert. I do not think it serves a positive purpose.

  • there is another consistent problem w/ ocb-scrub-test.sh:  setting 'ceph osd set $stopscrub' , for the deep-scrub case, does not prevent a regular scrub from starting immediately after the deep-scrub was aborted. The test does not cater for that. In my local tests, I have modified the deep-scrub case to also enable/disable regular scrubs. I would welcome a better suggestion.

…espace)

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
into a PrimaryLogScrub - a derivative of PgScrubber.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
Fixing Crimson compilation.

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
(following code-review comments) main changes:

- incoming scrub events are now validated by the scrubber before being sent to the
  SFM:
  - messages from a previous intervals are discarded, possibly signalling a
    scrub of the current scrub and full scrubber reset;
  - abort newly mandated by a configuration change is acted upon;
  - stale messages from previous scrub sessions are discarded;

- replica reservations are silently discarded on interval changes

- some modifications to the state diagram

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
@ronen-fr ronen-fr force-pushed the wip-ronenf-scrub-v2 branch from da2b688 to 0ac87eb Compare December 10, 2020 14:54
@tchaikov tchaikov dismissed their stale review December 10, 2020 15:17

dismissed

@tchaikov
Copy link
Contributor

@ronen-fr
Copy link
Contributor Author

Something strange is happening to our perf test...
"Started 1d 3h 30m 46s ago"

@ronen-fr
Copy link
Contributor Author

jenkins test classic perf

Copy link
Contributor

@athanatos athanatos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to go ahead and merge this version, please send a follow up PR handling the last few comments here.

// we have not seen this interval change yet.
// The remote reservations are no longer relevant.

m_last_dead_interval = current_interval;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this variable and related logic is entirely unnecessary. When we saw the map reflecting the interval change, we called PrimaryLogPG::on_change which called PGScrubber::clear_pgscrub_state which cleared the reservations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

turns out on_change() is too late in the process.
Looking now for the earliest location to hook the 'no messages' reservations clearing

}

// returns false if the message should be discarded. Handles the notification of interval
// change, if not done already. called only for active scrub? not sure.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please answer this question.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modified (separate PR)


if (epoch_to_verify < current_interval) {
// the event will not be delivered. If we have already noticed and handled
// the change of seasons, it will be silently discarded. Otherwise - we
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We must have already handled the interval change, or get_same_interval_since() wouldn't be updated yet.

@jdurgin
Copy link
Member

jdurgin commented Dec 15, 2020

1 similar comment
@jdurgin
Copy link
Member

jdurgin commented Dec 15, 2020

ronen-fr added a commit to ronen-fr/ceph that referenced this pull request Mar 25, 2022
Snap trimming that was postponed as the target PG was scrubbing
must be restarted at scrub completion.
PR ceph#38111 moved trimming restart to just before the scrub fully
terminated. Current PR fixes that.

Fixes: https://tracker.ceph.com/issues/52026

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
ronen-fr added a commit to ronen-fr/ceph that referenced this pull request Mar 25, 2022
Snap trimming that was postponed as the target PG was scrubbing
must be restarted at scrub completion.
PR ceph#38111 moved trimming restart to just before the scrub fully
terminated. The current PR fixes that.

Trimming is also restarted in those cases where scrub was
queued but aborted immediately.

Fixes: https://tracker.ceph.com/issues/52026

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
ronen-fr added a commit to ronen-fr/ceph that referenced this pull request Mar 25, 2022
Snap trimming that was postponed as the target PG was scrubbing
must be restarted at scrub completion.
PR ceph#38111 moved trimming restart to just before the scrub fully
terminated. The current PR fixes that.

Trimming is also restarted in those cases where scrub was
queued but aborted immediately.

Fixes: https://tracker.ceph.com/issues/52026

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
(cherry picked from commit 948d326)
ronen-fr added a commit to ronen-fr/ceph that referenced this pull request Apr 5, 2022
Snap trimming that was postponed as the target PG was scrubbing
must be restarted at scrub completion.
PR ceph#38111 moved trimming restart to just before the scrub fully
terminated. The current PR fixes that.

Trimming is also restarted in those cases where scrub was
queued but aborted immediately.

Fixes: https://tracker.ceph.com/issues/52026

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
(cherry picked from commit 948d326)

Conflicts:
	src/osd/pg_scrubber.cc

Conflict resolved by removing a clear_queued_or_active() call that
was dragged in.
ronen-fr added a commit to ronen-fr/ceph that referenced this pull request Apr 11, 2022
Snap trimming that was postponed as the target PG was scrubbing
must be restarted at scrub completion.
PR ceph#38111 moved trimming restart to just before the scrub fully
terminated. The current PR fixes that.

Trimming is also restarted in those cases where scrub was
queued but aborted immediately.

Fixes: https://tracker.ceph.com/issues/52026

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
(cherry picked from commit 948d326)

Conflicts:
	src/osd/pg_scrubber.cc

Conflict resolved by removing a clear_queued_or_active() call that
was dragged in.

(cherry picked from commit b010892)

Resolves: rhbz#2068531
zalsader pushed a commit to zalsader/ceph that referenced this pull request Apr 11, 2022
Snap trimming that was postponed as the target PG was scrubbing
must be restarted at scrub completion.
PR ceph#38111 moved trimming restart to just before the scrub fully
terminated. The current PR fixes that.

Trimming is also restarted in those cases where scrub was
queued but aborted immediately.

Fixes: https://tracker.ceph.com/issues/52026

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
mchangir pushed a commit to mchangir/ceph that referenced this pull request Apr 13, 2022
Snap trimming that was postponed as the target PG was scrubbing
must be restarted at scrub completion.
PR ceph#38111 moved trimming restart to just before the scrub fully
terminated. The current PR fixes that.

Trimming is also restarted in those cases where scrub was
queued but aborted immediately.

Fixes: https://tracker.ceph.com/issues/52026

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
(cherry picked from commit 948d326)

Conflicts:
	src/osd/pg_scrubber.cc

Conflict resolved by removing a clear_queued_or_active() call that
was dragged in.
dpaganel pushed a commit to dpaganel/ceph that referenced this pull request May 17, 2022
Snap trimming that was postponed as the target PG was scrubbing
must be restarted at scrub completion.
PR ceph#38111 moved trimming restart to just before the scrub fully
terminated. The current PR fixes that.

Trimming is also restarted in those cases where scrub was
queued but aborted immediately.

Fixes: https://tracker.ceph.com/issues/52026

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
mkogan1 pushed a commit to mkogan1/ceph that referenced this pull request Aug 9, 2022
Snap trimming that was postponed as the target PG was scrubbing
must be restarted at scrub completion.
PR ceph#38111 moved trimming restart to just before the scrub fully
terminated. The current PR fixes that.

Trimming is also restarted in those cases where scrub was
queued but aborted immediately.

Fixes: https://tracker.ceph.com/issues/52026

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
(cherry picked from commit 948d326)

Conflicts:
	src/osd/pg_scrubber.cc

Conflict resolved by removing a clear_queued_or_active() call that
was dragged in.

(cherry picked from commit b010892)

Resolves: rhbz#2068531
NitzanMordhai pushed a commit to NitzanMordhai/ceph that referenced this pull request Nov 19, 2023
Snap trimming that was postponed as the target PG was scrubbing
must be restarted at scrub completion.
PR ceph#38111 moved trimming restart to just before the scrub fully
terminated. The current PR fixes that.

Trimming is also restarted in those cases where scrub was
queued but aborted immediately.

Fixes: https://tracker.ceph.com/issues/52026

Signed-off-by: Ronen Friedman <rfriedma@redhat.com>
(cherry picked from commit 948d326)

Conflicts:
	src/osd/pg_scrubber.cc

Conflict resolved by removing a clear_queued_or_active() call that
was dragged in.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants