Skip to content

rgw: Break Full and Incremental sync out into their own classes#47422

Merged
cbodley merged 31 commits intoceph:mainfrom
adamemerson:wip-coroutine-cleaver
Aug 17, 2022
Merged

rgw: Break Full and Incremental sync out into their own classes#47422
cbodley merged 31 commits intoceph:mainfrom
adamemerson:wip-coroutine-cleaver

Conversation

@adamemerson
Copy link
Contributor

@adamemerson adamemerson commented Aug 3, 2022

That way we don't end up going down the wrong fork when Full is finishing up.

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

@adamemerson adamemerson requested review from cbodley and mkogan1 August 3, 2022 00:11
@github-actions github-actions bot added the rgw label Aug 3, 2022
@adamemerson adamemerson force-pushed the wip-coroutine-cleaver branch from 38406d6 to 4056335 Compare August 3, 2022 03:55
Copy link
Contributor

@cbodley cbodley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@adamemerson
Copy link
Contributor Author

@cbodley Is there an upstream tracker for this or should I open one?

@cbodley
Copy link
Contributor

cbodley commented Aug 8, 2022

@cbodley Is there an upstream tracker for this or should I open one?

@adamemerson if you think this is worth backporting to quincy/pacific, sure

Shilpa Jagannath and others added 19 commits August 8, 2022 15:44
… of a

bucket to error repo instead of triggering a bucket sync on them.

Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
… struct.

Call remove_cr() to remove full sync obligations after writing all
shard entries in to error repo. Replace call() with spawn() and yield_spawn_window()

Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
…ndow

Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
…cShardCR

Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
…class.

Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
handle retcodes correctly.

Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
…ully.

contributed by cbodley
Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit b3662cc)
Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
(cherry picked from commit b3662cc)
… and

construct error repo using datalog_oid_for_error_repo()

Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
…ngleEntryCR

Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
…lete

before spawning shards of next generation

Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
@adamemerson adamemerson force-pushed the wip-coroutine-cleaver branch from 982709d to ff1a38f Compare August 8, 2022 20:30
Copy link
Contributor

@cbodley cbodley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed some bugs and cleaned up the locking. it's passing multisite tests

@cbodley
Copy link
Contributor

cbodley commented Aug 9, 2022

@adamemerson could you please look over the commits i added?

Copy link
Contributor Author

@adamemerson adamemerson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cbodley Looks good to me.

@mattbenjamin
Copy link
Contributor

fixed some bugs and cleaned up the locking. it's passing multisite tests

superb!

adamemerson and others added 8 commits August 9, 2022 15:29
The problem:

RGWDataSyncShardCR combines full and incremental sync in such a way
that yielding in full sync after the transition to incremental (but
before all the finishing up is done) can jump straight into
incremental.

To deal with this break full and incremental sync into their own
coroutine classes. For the shared functionality (but /not/ shared
data) between them, we have a base class into which parts of the
original class are moved.

Data that are shared between the two (shard cache, lease) are owned
and managed by RGWDataSyncShardCR, with the full and incremental base
class getting references.

Fixes: https://tracker.ceph.com/issues/57063
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
This function was formerly RGWDataSyncShardCR::full_sync. The only
functional difference is that we leave acquiring the lease to the top
level RGWDataSyncShardCR coroutine class, since the lease should be
held on the transition from full to incremental sync.

Fixes: https://tracker.ceph.com/issues/57063
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
This was formerly the function
RGWDataSyncShardCR::incremental_sync. As with full_sync, we transfer
responsibility for acquiring the lease to the top level
RGWDataSyncShardCR coroutine.

Fixes: https://tracker.ceph.com/issues/57063
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Rewrite operate() to use reenter and handle lease acquisition at the
top level, calling down into the Full and Incremental sync shard
coroutines for the actual work.

Fixes: https://tracker.ceph.com/issues/57063
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Remove no-longer-used functions and data members they depended on.

Fixes: https://tracker.ceph.com/issues/57063
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
moves the lease-related error handling out of DataFull/DataInc and into
DataSyncShardCR

Signed-off-by: Casey Bodley <cbodley@redhat.com>
@cbodley cbodley force-pushed the wip-coroutine-cleaver branch from 771c4a5 to e26b987 Compare August 9, 2022 19:32
cbodley added 4 commits August 9, 2022 15:34
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
…po on success

Signed-off-by: Casey Bodley <cbodley@redhat.com>
@cbodley
Copy link
Contributor

cbodley commented Aug 15, 2022

jenkins test make check

@cbodley
Copy link
Contributor

cbodley commented Aug 15, 2022

mostly passed qa in https://pulpito.ceph.com/cbodley-2022-08-11_19:52:36-rgw-wip-coroutine-cleaver-distro-default-smithi/, but multisite tests all failed due to #47411

@cbodley
Copy link
Contributor

cbodley commented Aug 15, 2022

jenkins test make check

@cbodley
Copy link
Contributor

cbodley commented Aug 17, 2022

verified manually with multisite tests, only failure was from test_multi.test_version_suspended_incremental_sync tracked in https://tracker.ceph.com/issues/55179 👍

@cbodley cbodley merged commit 77558ef into ceph:main Aug 17, 2022
@adamemerson adamemerson deleted the wip-coroutine-cleaver branch September 15, 2022 21:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants