rgw: resolve empty ordered bucket listing results w/ CLS filtering #42125
Merged
cbodley merged 3 commits into ceph:master on Aug 5, 2021
cbodley reviewed Aug 2, 2021
When using asynchronous (concurrent) IO for bucket index requests, two integer ids are in play and need to be kept separate: the shard id and the request id. In many cases they are the same (shard 0 gets request 0, and so forth). But in preparation for re-requests, the ids can diverge, so that, for example, request 13 maps to shard 2. The existing code maintained the OID that went with each request. This PR maintains the shard id as well. Documentation has been expanded to help future developers navigate this.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
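To make the distinction between the two ids concrete, here is a minimal sketch of a table that hands out request ids and remembers which shard and OID each one targets. `RequestTable`, `RequestTarget`, and the OID strings are hypothetical names chosen for illustration; they are not the types used in the Ceph source.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical illustration only; not Ceph's actual data structures.
struct RequestTarget {
  int shard_id;     // which bucket index shard the request is for
  std::string oid;  // object id of that shard's index object
};

class RequestTable {
  int next_request_id = 0;
  std::map<int, RequestTarget> in_flight;  // request id -> target

public:
  // Registers a request against a shard and returns a fresh request id.
  // Request ids only ever grow, so a re-request gets a new id even
  // though it targets a shard that was already asked once.
  int issue(int shard_id, const std::string& oid) {
    const int request_id = next_request_id++;
    in_flight.emplace(request_id, RequestTarget{shard_id, oid});
    return request_id;
  }

  // Looks up the target of a request; the request must be in flight.
  const RequestTarget& target_of(int request_id) const {
    auto it = in_flight.find(request_id);
    assert(it != in_flight.end());
    return it->second;
  }

  // Forgets a request once its completion has been handled.
  void complete(int request_id) { in_flight.erase(request_id); }
};
```

With three shards, the first round yields request ids 0, 1, 2 for shards 0, 1, 2. If shard 2 is then asked again, the new request gets id 3 while still resolving to shard 2, which is the divergence described above.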
When doing an asynchronous/concurrent bucket index operation against multiple bucket index shards, a special error code is set aside to indicate that an "advancing" retry of one or more shards is necessary. In that case the client (i.e., CLSRGWConcurrentIO) makes another asynchronous call on the indicated shard(s). It is up to the subclass of CLSRGWConcurrentIO to handle the retry such that it "advances" and does not simply get stuck, looping forever. The retry functionality only works when the "need_multiple_rounds" functionality is not in use.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
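The retry contract can be sketched roughly as follows. This is a simplified, synchronous stand-in, not the actual CLSRGWConcurrentIO code: the class names, the `issue_op`/`advance` hooks, and the value of `kAdvanceAndRetry` are all assumptions made for the example.

```cpp
#include <map>
#include <utility>
#include <vector>

// Hypothetical reserved code meaning "advance and retry this shard".
constexpr int kAdvanceAndRetry = -1000;

// Hypothetical, synchronous stand-in for a concurrent-IO base class.
class ConcurrentShardIO {
protected:
  // Issues one request against a shard. Returns 0 on success,
  // kAdvanceAndRetry to ask for a retry, or another negative error.
  virtual int issue_op(int shard_id) = 0;

  // Called before a retry. The subclass must change its per-shard state
  // so the next request makes progress; returns false if it cannot.
  virtual bool advance(int shard_id) = 0;

public:
  virtual ~ConcurrentShardIO() = default;

  // Runs the operation on every shard, re-issuing only the shards that
  // asked for an advancing retry, until none are left.
  int run(const std::vector<int>& shard_ids) {
    std::vector<int> pending = shard_ids;
    while (!pending.empty()) {
      std::vector<int> retry;
      for (int shard : pending) {
        const int r = issue_op(shard);
        if (r == kAdvanceAndRetry) {
          if (!advance(shard)) return -1;  // refuse to loop forever
          retry.push_back(shard);
        } else if (r < 0) {
          return r;  // any other error aborts the whole operation
        }
      }
      pending = std::move(retry);
    }
    return 0;
  }
};

// Demo subclass: each shard asks for a fixed number of retries.
class DemoIO : public ConcurrentShardIO {
  std::map<int, int> retries_left;  // shard -> retries still to request

public:
  std::map<int, int> calls;  // shard -> number of requests issued

  explicit DemoIO(std::map<int, int> r) : retries_left(std::move(r)) {}

protected:
  int issue_op(int shard) override {
    ++calls[shard];
    return retries_left[shard] > 0 ? kAdvanceAndRetry : 0;
  }

  bool advance(int shard) override {
    --retries_left[shard];
    return true;
  }
};
```

In this sketch the base class owns the loop and the subclass owns the notion of progress, which mirrors the division of responsibility described in the commit message. A shard that asks for two retries is issued three requests in total, while its neighbours are left alone after their first success.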
A previous PR moved much of the filtering that's part of bucket listing to the CLS layer. One unanticipated result was that a call can now return 0 entries. In such a case we want to retry the call with the marker moved forward (i.e., advanced), repeatedly if necessary, in order to either retrieve some entries or hit the end of the entries. This PR adds that functionality.
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
cbodley approved these changes Aug 4, 2021
This was referenced Feb 18, 2022
rgw: resolve empty ordered bucket listing results w/ CLS filtering
Bucket listing filtering was moved from the RGW layer to the CLS layer to improve efficiency. However, if enough entries are filtered out, the CLS call may currently return zero entries to RGW. Since we did not record how far the call got, calling it again yields the same result, so the listing gets stuck and fails.
This solution adds a marker to the object that the CLS call returns to RGW. That allows the next call to pick up where the previous one left off.
Because the bucket index is spread across many shards, the CLSRGWConcurrentIO class is used to coordinate these asynchronous calls. Functionality is added to this class to handle re-issuing the call with the new marker to make sure at least one entry is returned.
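The marker-based fix can be illustrated on a single shard with a small sketch. Everything here is hypothetical (`ListResult`, `cls_list`, `list_nonempty`, the `not_hidden` filter, and the rule that keys starting with `_` are filtered out); it only models the idea that the callee reports how far it scanned so the caller can advance past a fully filtered batch.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type; not the actual CLS return structure.
struct ListResult {
  std::vector<std::string> entries;  // keys that passed the filter
  std::string marker;                // last key examined, kept or not
  bool is_truncated;                 // true if keys remain after marker
};

// Example filter (an assumption): keys starting with '_' are hidden.
bool not_hidden(const std::string& key) {
  return key.empty() || key[0] != '_';
}

// Stand-in for the CLS-side call: examines at most max_scan keys that
// sort after `marker` and returns only those the filter accepts.
ListResult cls_list(const std::vector<std::string>& sorted_keys,
                    const std::string& marker, std::size_t max_scan,
                    bool (*visible)(const std::string&)) {
  ListResult res{{}, marker, false};
  auto it = std::upper_bound(sorted_keys.begin(), sorted_keys.end(), marker);
  for (std::size_t scanned = 0;
       it != sorted_keys.end() && scanned < max_scan; ++it, ++scanned) {
    if (visible(*it)) res.entries.push_back(*it);
    res.marker = *it;  // record progress even when the key is filtered
  }
  res.is_truncated = (it != sorted_keys.end());
  return res;
}

// Stand-in for the caller: re-issues the call with the returned marker
// until it gets at least one entry or reaches the end of the index.
ListResult list_nonempty(const std::vector<std::string>& sorted_keys,
                         const std::string& marker, std::size_t max_scan,
                         bool (*visible)(const std::string&)) {
  ListResult res = cls_list(sorted_keys, marker, max_scan, visible);
  while (res.entries.empty() && res.is_truncated) {
    res = cls_list(sorted_keys, res.marker, max_scan, visible);
  }
  return res;
}
```

Without the returned marker, a caller that receives an empty batch has nothing new to pass to the next call and would repeat the same scan. With it, each empty batch still moves the scan forward, so the loop ends either with entries or with `is_truncated` set to false.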
Fixes: https://tracker.ceph.com/issues/51462
Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>