src/test: Test case for bilog trim across reshard event#44758

Closed
TRYTOBE8TME wants to merge 150 commits into ceph:wip-rgw-multisite-reshard from TRYTOBE8TME:wip-rgw-bilog-tests-add
Conversation

@TRYTOBE8TME
Resharding a bucket and then performing the bilog trim

Signed-off-by: Kalpesh Pandya <kapandya@redhat.com>

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

cbodley and others added 27 commits January 31, 2022 14:07
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
allows other code to spawn this coroutine without having the class
definition

Signed-off-by: Casey Bodley <cbodley@redhat.com>
RGWShardCollectCR was hard-coded to ignore ENOENT errors and print a
'failed to fetch log status' error message. this moves that logic into a
handle_result() virtual function. it also exposes the member variables
'status' and 'max_concurrent' as protected, so they can be consulted or
modified by overrides of handle_result() and spawn_next()

Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
a coroutine to initialize a bucket for full sync using a new bucket-wide
sync status object

Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
full sync happens at the bucket level, so the shards will always start
in StateIncrementalSync

Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
renamed ListBucketShardCR to ListRemoteBucketCR and removed the shard-id
parameter

renamed BucketFullSyncShardMarkerTrack to BucketFullSyncMarkerTrack,
which now updates the bucket-level rgw_bucket_sync_status

renamed BucketShardFullSyncCR to BucketFullSyncCR

BucketSyncCR now takes a bucket-wide lease during full sync

Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
if metadata sync hasn't finished, the 'bucket checkpoint' commands may
not find the bucket info

Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
the ability to filter tests by attribute is provided by the
nose.plugins.attrib plugin, which wasn't being loaded by default

Signed-off-by: Casey Bodley <cbodley@redhat.com>
this backoff is triggered often by the per-bucket lease for full sync,
and causes tests to fail with checkpoint timeouts

Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
ivancich and others added 7 commits February 1, 2022 18:50
Determining whether a bucket is indexless starting with an
RGWBucketInfo object requires traversing multiple data structures and
"inside knowledge" blurring the line between interface and
implementation. The same applies for retrieving the current index for
non-indexless buckets.

This commit adds to the RGWBucketInfo interface to make this
information readily accessible.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
The code for bucket stats was recently updated to check for an
indexless bucket before proceeding. The interface on RGWBucketInfo was
recently expanded to support these types of checks, so it is now used.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
The "bucket radoslist" sub-command of radosgw-admin is supposed to
list all rados objects tied to one or all directories and thereby
provide a way to determine orphaned rados objects.

But indexless buckets don't provide an index to employ for this
purpose. So warnings or errors should be provided depending on the
circumstances.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
With the new resharding code, some bucket metadata that is stored as
xattrs (e.g., ACLs, life-cycle policies) was not sent with the
updated bucket instance data when resharding completed. As a result,
resharding has a regression where that metadata is lost after a
successful reshard.

This commit restores the variable in the RGWBucketReshard class that
maintains the bucket attributes, so they can be saved when the bucket
instance object is updated.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
There appears to be a long-standing bug in RGW such that when
resharding is cancelled and the bucket instance is updated to reflect
the new resharding status, the xattrs were lost. The xattrs are used
to store metadata such as ACLs and LifeCycle policies.

This commit makes sure that all call paths that lead to a cancelled
reshard provide the xattrs, so they can be included when the bucket
instance info is updated.

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
use an API that does not check for cache inconsistency;
hence, the "WARNING: The bucket info cache is inconsistent" warning is removed from reshard

Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>
@github-actions
github-actions bot commented Feb 2, 2022

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

Resharding a bucket and then performing the bilog trim

Signed-off-by: Kalpesh Pandya <kapandya@redhat.com>
@cbodley
Contributor

cbodley commented Feb 8, 2022

i think that will require radosgw-admin metadata get on the bucket instance metadata, and inspecting its array of layout.logs

i added a new radosgw-admin bucket layout --bucket=name command in #44947 to simplify this part
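For reference, inspecting the array of layout.logs from the bucket instance metadata could be sketched like this in Python; the `data.layout.logs` JSON path is an assumption based on this discussion, so verify it against the actual `radosgw-admin metadata get` output:

```python
import json

def layout_logs(metadata_json):
    """Extract the list of bilog generations from the output of
    'radosgw-admin metadata get bucket.instance:<bucket>:<id>'.
    The 'data.layout.logs' path is assumed; check it against your build."""
    md = json.loads(metadata_json)
    return md['data']['layout']['logs']

# e.g. after two reshards one would expect three log generations:
sample = '{"data": {"layout": {"logs": [{"gen": 0}, {"gen": 1}, {"gen": 2}]}}}'
assert len(layout_logs(sample)) == 3
```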

@adamemerson adamemerson force-pushed the wip-rgw-multisite-reshard branch from d0f01cf to fe5ea5e Compare February 9, 2022 16:35
Signed-off-by: Kalpesh Pandya <kapandya@redhat.com>
@TRYTOBE8TME
Author

trim also has new logic to delete the old log generations entirely, so i'd like to verify that part too. i think that will require radosgw-admin metadata get on the bucket instance metadata, and inspecting its array of layout.logs

@cbodley Can you please elaborate on this part, or maybe give a hint on how this can be done?

@cbodley
Contributor

cbodley commented Feb 14, 2022

hey @TRYTOBE8TME, i'd start by creating a bucket with some objects in it, then playing around with radosgw-admin bucket layout, and see how it changes after you run radosgw-admin bucket reshard. each time you reshard, you should see an extra entry in the logs, until you hit the maximum number of logs at 4

then, once the other zone is all caught up with sync, you can run radosgw-admin bilog autotrim and see how that changes the output of radosgw-admin bucket layout. each time bilog autotrim runs, you should see one less entry in the list of logs, until there's only one entry left

this is exactly what we want to write a test for; for example, do 2 reshards and verify that bucket layout shows 3 logs. then do a bucket checkpoint to wait for sync to catch up. then bilog autotrim and verify that bucket layout removes 1 of the logs, then that another bilog autotrim removes another. finally, verify that bilog autotrim doesn't remove the last log
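The expected behavior described above can be sketched as a toy Python model (not Ceph code; the cap of 4 log generations is taken from the comment above, and the helper names are hypothetical):

```python
MAX_LOGS = 4  # maximum number of log generations, per the comment above


def reshard(logs):
    """Each reshard adds a new log generation, up to the cap."""
    if len(logs) < MAX_LOGS:
        return logs + [logs[-1] + 1]
    return logs


def bilog_autotrim(logs):
    """Each autotrim pass deletes the oldest generation,
    but never removes the last remaining log."""
    if len(logs) > 1:
        return logs[1:]
    return logs


# encode the test plan: 2 reshards -> 3 logs, checkpoint,
# then trim repeatedly until only the last log remains
logs = [0]
logs = reshard(reshard(logs))
assert len(logs) == 3
logs = bilog_autotrim(logs)
assert len(logs) == 2
logs = bilog_autotrim(logs)
assert len(logs) == 1
logs = bilog_autotrim(logs)  # the last log is never removed
assert logs == [2]
```

An actual test would drive the same sequence through radosgw-admin and check the log count via radosgw-admin bucket layout after each step.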

@adamemerson
Contributor

@TRYTOBE8TME Before you get on to anything else, can you rebase this?

@TRYTOBE8TME
Author

TRYTOBE8TME commented Feb 23, 2022

@TRYTOBE8TME Before you get on to anything else, can you rebase this?

@adamemerson sorry, I've created a new alias PR for this, #45053, and will be closing this one

@github-actions

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@cbodley cbodley closed this Apr 14, 2022
7 participants