src/osd/scrubber: process MapsCompared event syncronously#51498
Merged
src/osd/scrubber: process MapsCompared event syncronously#51498
Conversation
Contributor
Author
ronen-fr
approved these changes
May 16, 2023
src/osd/scrubber/scrub_machine.cc
Outdated
| // maps_compare_n_cleanup() will arrange for MapsCompared event to be | ||
| // sent: | ||
| scrbr->maps_compare_n_cleanup(); | ||
| post_event(MapsCompared{}); |
Contributor
There was a problem hiding this comment.
Or simply transition here to WaitDigestUpdate
Contributor
Author
There was a problem hiding this comment.
Yep, good point. Updating
Summary: WaitReplica submits two asyncronous events through the OSD scheduler concurrently: DigestUpdate and MapsCompared. This causes the error described in https://tracker.ceph.com/issues/59049 because MapsCompared must be processed first, but they may be scheduled with differing priorities depending on whether there are writes blocked. This patch avoids the problem by simply processing MapsCompared syncronously. Details: Within the WaitReplicas state handler for GotReplicas, we invoke PGScrubber::maps_compare_n_cleanup: sc::result WaitReplicas::react(const GotReplicas&) { ... // maps_compare_n_cleanup() will arrange for MapsCompared event to be // sent: scrbr->maps_compare_n_cleanup(); return discard_event(); void PgScrubber::maps_compare_n_cleanup() { ... auto required_fixes = m_be->scrub_compare_maps( m_end.is_max(), m_listener->sl_get_snap_map_reader()); ... m_osds->queue_scrub_maps_compared(m_pg, Scrub::scrub_prio_t::low_priority); PgScrubber::maps_compare_n_cleanup arranges for the MapsCompared event to be submitted through the OSD schdeduler to be delivered to the scrub state machine later. It also invokes scrub_compare_maps objs_fix_list_t ScrubBackend::scrub_compare_maps( bool max_reached, SnapMapReaderI& snaps_getter) { ... scrub_snapshot_metadata(for_meta_scrub); which in turn invokes scrub_snapshot_metadata void ScrubBackend::scrub_snapshot_metadata(ScrubMap& map) { ... m_scrubber.submit_digest_fixes(this_chunk->missing_digest); which in turn invokes submit_digest_fixes void PrimaryLogScrub::submit_digest_fixes(const digests_fixes_t& fixes) { ... for (auto& [obj, dgs] : fixes) { ... PrimaryLogPG::OpContextUPtr ctx = m_pl_pg->simple_opc_create(obc); ... m_pl_pg->finish_ctx(ctx.get(), pg_log_entry_t::MODIFY); ... ctx->register_on_success([this]() { if ((num_digest_updates_pending >= 1) && (--num_digest_updates_pending == 0)) { m_osds->queue_scrub_digest_update(m_pl_pg, m_pl_pg->is_scrub_blocking_ops()); } }); m_pl_pg->simple_opc_submit(std::move(ctx)); } } which submits a sequence of repops updating objects' digests and schduling the DigestUpdate event once they have completed. These two events, DigestUpdate and MapsCompared, are therefore in flight concurrently. However, only one event ordering is actually tolerated -- we must process MapsCompared first in order to be in state WaitDigestUpdate when the DigestUpdate event arrives. This would be fine except that the priority at which we submit the DigestUpdate event may end up with a higher priority if there is currently an op blocked. There are two ways we could solve this. We could ensure that the two are always queued at the same priority and therefore will be processed in order. However, there's actually no reason to process the MapsCompared event asyncronously -- it would be far less brittle to simply process the MapsCompared syncronously and avoid this problem entirely. Fixes: https://tracker.ceph.com/issues/59049 Signed-off-by: Samuel Just <sjust@redhat.com>
da0f30d to
604860e
Compare
Contributor
Author
|
Ran a short suite on the second version: https://pulpito.ceph.com/sjust-2023-05-16_18:04:35-rados-wip-sjust-testing-2023-05-16-distro-default-smithi/, seems ok. Will wait for full suite run |
Contributor
Author
|
jenkins test docs |
Contributor
Author
|
jenkins test windows |
Contributor
Author
|
jenkins test make check arm64 |
Contributor
Author
|
jenkins render docs |
Member
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary:
WaitReplica submits two asyncronous events through the OSD scheduler concurrently: DigestUpdate and MapsCompared. This causes the error described in https://tracker.ceph.com/issues/59049 because MapsCompared must be processed first, but they may be scheduled with differing priorities depending on whether there are writes blocked. This patch avoids the problem by simply processing MapsCompared syncronously.
Details:
Within the WaitReplicas state handler for GotReplicas, we invoke PGScrubber::maps_compare_n_cleanup:
PgScrubber::maps_compare_n_cleanup arranges for the MapsCompared event to be submitted through the OSD schdeduler to be delivered to the scrub state machine later. It also invokes scrub_compare_maps
which in turn invokes scrub_snapshot_metadata
which in turn invokes submit_digest_fixes
which submits a sequence of repops updating objects' digests and schduling the DigestUpdate event once they have completed.
These two events, DigestUpdate and MapsCompared, are therefore in flight concurrently. However, only one event ordering is actually tolerated -- we must process MapsCompared first in order to be in state WaitDigestUpdate when the DigestUpdate event arrives. This would be fine except that the priority at which we submit the DigestUpdate event may end up with a higher priority if there is currently an op blocked.
There are two ways we could solve this. We could ensure that the two are always queued at the same priority and therefore will be processed in order. However, there's actually no reason to process the MapsCompared event asyncronously -- it would be far less brittle to simply process the MapsCompared syncronously and avoid this problem entirely.
Fixes: https://tracker.ceph.com/issues/59049
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard cephadmjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume toxjenkins test windows