rgw/multisite: add cls versioning for data sync per shard status object #47614

Closed
smanjara wants to merge 1 commit into ceph:main from smanjara:cls-version-data-sync

@smanjara

Signed-off-by: Shilpa Jagannath <smanjara@redhat.com>

@github-actions github-actions bot added the rgw label Aug 15, 2022
@smanjara smanjara requested a review from cbodley August 15, 2022 20:20
@cbodley cbodley left a comment

this is a good start! we found all the places that need version tracking

as a next step, we need to arrange for each datalog shard to have its own instance of RGWObjVersionTracker, so it can use that same instance for every step of the process (Read, Init, List, and Sync). this way, we can detect racing writes at any point of the process, instead of just in between the new reads and their writes
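To make the racing-write point concrete, here is a minimal sketch of a version-checked write. It is not the Ceph API: `StatusObject`, `VersionTracker`, `read_status`, and `write_status` are hypothetical stand-ins for the per-shard status object and for RGWObjVersionTracker, and returning `-ECANCELED` on a version mismatch is an assumption made for illustration.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a versioned per-shard status object.
struct StatusObject {
  uint64_t version = 0;
  std::string marker;
};

// Hypothetical stand-in for RGWObjVersionTracker: remembers the object
// version observed by the last read or successful write.
struct VersionTracker {
  uint64_t read_version = 0;
};

// Read the marker and record the version that was observed.
void read_status(const StatusObject& obj, VersionTracker& objv,
                 std::string& marker_out) {
  marker_out = obj.marker;
  objv.read_version = obj.version;
}

// Write only if nobody else has written since this tracker last saw the
// object; otherwise report the race (assumed here to be -ECANCELED).
int write_status(StatusObject& obj, VersionTracker& objv,
                 const std::string& marker) {
  if (obj.version != objv.read_version) {
    return -ECANCELED;
  }
  obj.marker = marker;
  ++obj.version;
  objv.read_version = obj.version;  // keep the tracker current for later steps
  return 0;
}
```

If one tracker is carried from the initial read through every later write, a write by another party at any point in between is rejected. A tracker that is refreshed by a new read immediately before each write would only catch races inside that narrow window.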

RGWDataSyncCR has a member variable rgw_data_sync_status sync_status that stores a map of the per-shard sync markers, which gets passed to each of these coroutines RGWReadDataSyncStatusMarkersCR, RGWInitDataSyncStatusCoroutine, and RGWListBucketIndexesCR. each RGWDataSyncShardCR also gets a pointer to its own shard marker

we'll need to do something similar for the storage of these RGWObjVersionTrackers. for example, by adding a std::vector<RGWObjVersionTracker> to RGWDataSyncCR that gets passed along with the sync status and shard markers. RGWDataSyncShardCR should get its RGWObjVersionTracker& by reference, and share that with its RGWDataSyncShardMarkerTrack
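The ownership layout suggested above could look roughly like the following sketch. All types here (`ObjVersionTracker`, `ShardMarkerTrack`, `ShardSync`, `DataSync`) are simplified, hypothetical stand-ins for RGWObjVersionTracker, RGWDataSyncShardMarkerTrack, RGWDataSyncShardCR, and RGWDataSyncCR; the real classes are coroutines with many more members, and the constructor signatures are assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for RGWObjVersionTracker.
struct ObjVersionTracker {
  uint64_t read_version = 0;
};

// Hypothetical stand-in for RGWDataSyncShardMarkerTrack: shares the
// shard's tracker instead of owning a separate one.
struct ShardMarkerTrack {
  ObjVersionTracker& objv;
  explicit ShardMarkerTrack(ObjVersionTracker& objv) : objv(objv) {}
};

// Hypothetical stand-in for RGWDataSyncShardCR: receives its tracker by
// reference and passes the same instance on to its marker tracker.
struct ShardSync {
  uint32_t shard_id;
  ObjVersionTracker& objv;
  ShardMarkerTrack marker_tracker;
  ShardSync(uint32_t shard_id, ObjVersionTracker& objv)
    : shard_id(shard_id), objv(objv), marker_tracker(objv) {}
};

// Hypothetical stand-in for RGWDataSyncCR: owns one tracker per shard.
struct DataSync {
  // Sized once up front; it must not be resized afterwards, because the
  // shards hold references into it.
  std::vector<ObjVersionTracker> objvs;
  std::vector<std::unique_ptr<ShardSync>> shards;

  explicit DataSync(uint32_t num_shards) : objvs(num_shards) {
    for (uint32_t i = 0; i < num_shards; ++i) {
      shards.push_back(std::make_unique<ShardSync>(i, objvs[i]));
    }
  }
};
```

Because every step for a given shard sees the same tracker instance, a version recorded during the read step is the one checked by later writes from both the shard and its marker tracker.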

@cbodley
cbodley commented Aug 16, 2022

p.s. just be aware of conflicts with @adamemerson's work in #47422 and #47566

@cbodley cbodley requested a review from adamemerson August 16, 2022 19:41
@github-actions
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@smanjara
Addressing this in a rebased version #47682

@smanjara
Closing in favor of #47682

@smanjara smanjara closed this Aug 18, 2022