
[WIP] rgw: update bucket sync status after bucket shards finishes current gen #40605

Closed
smanjara wants to merge 82 commits into ceph:wip-rgw-multisite-reshard from smanjara:wip-rgw-multisite-reshard-sync-status

@smanjara smanjara commented Apr 5, 2021

Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug


cbodley and others added 30 commits February 4, 2021 16:11
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
allows other code to spawn this coroutine without having the class
definition

Signed-off-by: Casey Bodley <cbodley@redhat.com>
RGWShardCollectCR was hard-coded to ignore ENOENT errors and print a
'failed to fetch log status' error message. this moves that logic into a
handle_result() virtual function. it also exposes the member variables
'status' and 'max_concurrent' as protected, so they can be consulted or
modified by overrides of handle_result() and spawn_next()

Signed-off-by: Casey Bodley <cbodley@redhat.com>
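
The refactoring described in this commit message can be pictured with a small sketch. It is not the Ceph code: `ShardCollect`, `LogStatusCollect`, and `run()` are hypothetical stand-ins that only show how a virtual `handle_result()` hook lets a subclass decide which errors to ignore, with `status` and `max_concurrent` exposed as protected members.

```cpp
#include <cerrno>
#include <functional>
#include <iostream>
#include <vector>

// Hypothetical, simplified stand-in for a shard-collecting coroutine.
// It runs one task per shard and filters each result through handle_result().
class ShardCollect {
 public:
  virtual ~ShardCollect() = default;

  // Runs every task; returns 0 on success or the first error kept in 'status'.
  int run(const std::vector<std::function<int()>>& tasks) {
    for (const auto& task : tasks) {
      const int r = handle_result(task());
      if (r < 0 && status == 0) {
        status = r;  // remember only the first error
      }
    }
    return status;
  }

 protected:
  // Default policy: pass every result through unchanged.
  virtual int handle_result(int r) { return r; }

  int status = 0;           // protected so overrides can consult or modify it
  int max_concurrent = 16;  // protected for the same reason (unused in this sketch)
};

// Override that reproduces the formerly hard-coded policy:
// ignore ENOENT and log any other failure.
class LogStatusCollect : public ShardCollect {
 protected:
  int handle_result(int r) override {
    if (r == -ENOENT) {
      return 0;  // a missing object is not treated as an error
    }
    if (r < 0) {
      std::cerr << "failed to fetch log status: " << r << "\n";
    }
    return r;
  }
};
```

With this shape, a caller that needs a different error policy subclasses the collector instead of editing the shared base class.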
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
a coroutine to initialize a bucket for full sync using a new bucket-wide
sync status object

Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
full sync happens at the bucket level, so the shards will always start
in StateIncrementalSync

Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
renamed ListBucketShardCR to ListRemoteBucketCR and removed the shard-id
parameter

renamed BucketFullSyncShardMarkerTrack to BucketFullSyncMarkerTrack,
which now updates the bucket-level rgw_bucket_sync_status

renamed BucketShardFullSyncCR to BucketFullSyncCR

BucketSyncCR now takes a bucket-wide lease during full sync

Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
if metadata sync hasn't finished, the 'bucket checkpoint' command may
not find its bucket info

Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
the ability to filter tests by attribute is provided by the
nose.plugins.attrib plugin, which wasn't being loaded by default

Signed-off-by: Casey Bodley <cbodley@redhat.com>
this backoff is triggered often by the per-bucket lease for full sync,
and causes tests to fail with checkpoint timeouts

Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
adds a backward-compatible binary encoding for error repo keys that can
contain a generation number along with the bucket and shard

Signed-off-by: Casey Bodley <cbodley@redhat.com>
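
One way to picture a backward-compatible key encoding is sketched below. Everything here is an assumption made for illustration, not the format used in Ceph: the `ErrorKey` struct, the `0x80` prefix byte, and the little-endian layout are hypothetical. The sketch also assumes that legacy string keys never begin with the prefix byte, which is what lets a decoder tell the two forms apart.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical key: a "bucket:shard" string plus an optional generation.
struct ErrorKey {
  std::string bucket_shard;
  std::optional<uint64_t> gen;
};

// Assumed marker byte that no legacy string key starts with.
constexpr char kBinaryPrefix = static_cast<char>(0x80);

// Keys without a generation keep the legacy plain-string form.
// Keys with a generation become: prefix byte, 8-byte little-endian gen, string.
std::string encode_key(const ErrorKey& k) {
  if (!k.gen) {
    return k.bucket_shard;
  }
  std::string out(1, kBinaryPrefix);
  for (int i = 0; i < 8; ++i) {
    out.push_back(static_cast<char>((*k.gen >> (8 * i)) & 0xff));
  }
  out += k.bucket_shard;
  return out;
}

// Accepts both forms; returns nullopt for a truncated binary key.
std::optional<ErrorKey> decode_key(const std::string& raw) {
  if (raw.empty() || raw[0] != kBinaryPrefix) {
    return ErrorKey{raw, std::nullopt};  // legacy key, no generation
  }
  if (raw.size() < 9) {
    return std::nullopt;  // prefix present but generation bytes missing
  }
  uint64_t gen = 0;
  for (int i = 0; i < 8; ++i) {
    gen |= static_cast<uint64_t>(static_cast<unsigned char>(raw[1 + i])) << (8 * i);
  }
  return ErrorKey{raw.substr(9), gen};
}
```

Because old keys decode unchanged, entries written before the upgrade stay readable while new entries can carry a generation.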
Signed-off-by: Casey Bodley <cbodley@redhat.com>
@smanjara smanjara force-pushed the wip-rgw-multisite-reshard-sync-status branch 2 times, most recently from e3f6d92 to 221a8d7 April 21, 2021 17:48
@smanjara

@cbodley addressed all comments. Please review. Thanks.

@smanjara smanjara force-pushed the wip-rgw-multisite-reshard-sync-status branch from 4a0ae3e to e916a63 April 26, 2021 12:34
@adamemerson adamemerson self-assigned this Apr 29, 2021
@mattbenjamin

@smanjara @adamemerson what's the current plan for this one? Is this merging shortly?


smanjara commented Apr 30, 2021

> @smanjara @adamemerson what's the current plan for this one? Is this merging shortly?

It needs another round of review by @cbodley


@cbodley cbodley left a comment

looks great!

if the old index is still referenced by an InIndex log layout, we can't
call clean_index() to remove the index objects yet. log trimming will do
that later, once the bilogs are no longer needed

Signed-off-by: Casey Bodley <cbodley@redhat.com>
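
The condition in this commit message can be sketched as a small predicate. The types below (`LogType`, `LogLayout`, `can_clean_index`) are hypothetical and only illustrate the rule: the old index objects may be removed only when no InIndex log layout still refers to that index generation.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical log types; only InIndex logs live inside the bucket index.
enum class LogType { InIndex, Other };

// Hypothetical log layout entry that names the index generation it reads from.
struct LogLayout {
  LogType type;
  uint64_t index_gen;
};

// Returns true only when no InIndex log layout still references the old index
// generation. Otherwise removal is left to log trimming, once the bilogs are
// no longer needed.
bool can_clean_index(uint64_t old_index_gen, const std::vector<LogLayout>& logs) {
  for (const auto& log : logs) {
    if (log.type == LogType::InIndex && log.index_gen == old_index_gen) {
      return false;
    }
  }
  return true;
}
```

A reshard path would consult such a check before removing the old index objects and skip the cleanup when it returns false.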
@smanjara smanjara force-pushed the wip-rgw-multisite-reshard-sync-status branch from e916a63 to 5bff3af May 3, 2021 11:50

cbodley commented May 3, 2021

now that we're flagging the shards as done, we should also prevent another incremental sync from starting on done shards. can you add a check for that in RGWSyncBucketCR before it spawns RGWSyncBucketShardCR?


smanjara commented May 4, 2021

> now that we're flagging the shards as done, we should also prevent another incremental sync from starting on done shards. can you add a check for that in RGWSyncBucketCR before it spawns RGWSyncBucketShardCR?

done. Since the check for current gen is already present in #39396, I have just added a check to see whether the shard is done before spawning RGWSyncBucketShardCR.
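
A minimal sketch of the kind of guard discussed here, under stated assumptions: `BucketSyncStatus`, its `shards_done` flags, and `maybe_spawn_shard_sync()` are hypothetical names, and the `spawn` callback stands in for spawning RGWSyncBucketShardCR. It shows only the "skip shards already flagged as done" check, not the current-generation check mentioned above.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical bucket-wide status holding one "done" flag per shard.
struct BucketSyncStatus {
  std::vector<bool> shards_done;
};

// Calls 'spawn' for the shard unless it is already flagged as done.
// Returns true when incremental sync was started.
bool maybe_spawn_shard_sync(const BucketSyncStatus& status,
                            std::size_t shard_id,
                            const std::function<void(std::size_t)>& spawn) {
  const bool done = shard_id < status.shards_done.size() &&
                    status.shards_done[shard_id];
  if (done) {
    return false;  // nothing left to sync for this shard
  }
  spawn(shard_id);
  return true;
}
```

Treating a shard with no recorded flag as not done is a choice made for this sketch; real code may prefer to reject an out-of-range shard id.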

@smanjara smanjara force-pushed the wip-rgw-multisite-reshard-sync-status branch from f61a8e0 to 99ba8fc May 4, 2021 14:15
sync_pipe, bucket_status.state,
tn, progress));
if (retcode < 0) {
return set_cr_error(retcode);
Contributor

it looks like the indentation is off for this whole block

Contributor Author

yes, it should have been under the incremental sync block. thanks


cbodley commented May 4, 2021

once you fix the indentation, can you please squash the commits?

@smanjara smanjara force-pushed the wip-rgw-multisite-reshard-sync-status branch 2 times, most recently from b6bca55 to 733eb58 May 5, 2021 07:08
Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>
@smanjara smanjara force-pushed the wip-rgw-multisite-reshard-sync-status branch from 733eb58 to 6d7bb1c May 5, 2021 07:14

smanjara commented May 5, 2021

> once you fix the indentation, can you please squash the commits?

done

@adamemerson adamemerson force-pushed the wip-rgw-multisite-reshard branch from 36c2959 to 95355ab May 12, 2021 16:05
@github-actions

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved


cbodley commented May 14, 2021

rebased (fixed some dpp conflicts) and merged into wip-rgw-multisite-reshard. thanks!

@cbodley cbodley closed this May 14, 2021
4 participants