
rgw: prevent spurious/lost notifications in the index completion thread #45131

Closed
yuvalif wants to merge 1 commit into ceph:wip-rgw-multisite-reshard from yuvalif:wip-yuval-completion-dpp

Conversation

@yuvalif
Contributor

@yuvalif yuvalif commented Feb 23, 2022

this was happening when async completions occurred during a reshard.
more information about testing:
https://gist.github.com/yuvalif/d526c0a3a4c5b245b9e951a6c5a10517

to trace possible issues, a dpp was added to the completion thread.
this should allow finding unhandled completions due to reshards; for example:
https://0x0.st/oKQ1.log

Signed-off-by: Yuval Lifshitz ylifshit@redhat.com

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Available Jenkins commands:
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

@yuvalif yuvalif added the rgw label Feb 23, 2022
@yuvalif yuvalif changed the title from "rgw: add dpp to the completion thread and manager" to "rgw: prevent spurious/lost notifications in the index completion thread" on Feb 24, 2022
entry->bilog_op = bilog_op;
std::ostringstream oss;
dpp->gen_prefix(oss);
entry->original_prefix = oss.str();
Contributor

i suggested doing this for debugging, but i'd prefer not to merge it

this adds overhead (allocations and a global lock for iostream locale support) to every index operation, but is only relevant to a small subset of those (writes that prepared before reshard started, but completed after)

Contributor Author

i added debug logs to handle_completion() even for the non-reshard cases.
since these are async callbacks where the original dpp context is not lost (they are synchronous with the op),
i can probably pass in the original dpp without copying the prefix string


void RGWIndexCompletionManager::process()
{
auto dpp = std::make_unique<DoutPrefix>(store->ctx(), dout_subsys, "rgw index completion thread: ");
Contributor

can't this live on the stack?

const int num_shards;
ceph::containers::tiny_vector<ceph::mutex> locks;
std::vector<set<complete_op_data*>> completions;
std::vector<complete_op_data*> async_completions;
Contributor

the async_ piece is a bit misleading here? it's completions that were submitted asynchronously with aio_operate(). the completions in this vector failed with ERR_BUSY_RESHARDING and need to be retried synchronously. maybe call this retry_completions?

what was the motivation for collapsing RGWIndexCompletionThread into RGWIndexCompletionManager? i think it's better to keep the Thread class and encapsulate its state/locking there. that way we don't mix completions/locks and async_completions/async_completions_lock together in one place

Contributor Author

the async_ piece is a bit misleading here? it's completions that were submitted asynchronously with aio_operate(). the completions in this vector failed with ERR_BUSY_RESHARDING and need to be retried synchronously. maybe call this retry_completions?

sure, will rename

what was the motivation for collapsing RGWIndexCompletionThread into RGWIndexCompletionManager? i think it's better to keep the Thread class and encapsulate its state/locking there. that way we don't mix completions/locks and async_completions/async_completions_lock together in one place

there is no thread class. it is just a function called from the manager class inside its own thread.
also, the async_completions_lock is used inside functions called from different thread contexts, so it belongs to the manager class

Comment on lines +936 to +941
std::unique_lock l{async_completions_lock};
cond.wait(l, [this](){return _stop || !async_completions.empty();});
if (_stop) {
return;
}
async_completions.swap(comps);
Contributor

👍

@yuvalif
Contributor Author

yuvalif commented Feb 28, 2022

  • when testing with 2 zones (1 rgw in each zone), and uploading objects only to primary i do not see any difference between the zones
  • when testing with 2 zones (1 rgw in each zone), and uploading objects only to the same bucket in both zones, there is no difference between the objects; however, the secondary zone's sync status is "behind". see:
    https://paste.sh/aGzWOiJ8#UyGOOVvdILeFknWgd-nioBXd

@yuvalif yuvalif force-pushed the wip-yuval-completion-dpp branch from 2918b9b to 359db94 on February 28, 2022 18:32
Comment on lines +1401 to +1407
int cls_obj_complete_op(const DoutPrefixProvider *dpp, BucketShard& bs, const rgw_obj& obj, RGWModifyOp op, std::string& tag, int64_t pool, uint64_t epoch,
rgw_bucket_dir_entry& ent, RGWObjCategory category, std::list<rgw_obj_index_key> *remove_objs, uint16_t bilog_flags, rgw_zone_set *zones_trace = nullptr);
int cls_obj_complete_add(const DoutPrefixProvider *dpp, BucketShard& bs, const rgw_obj& obj, std::string& tag, int64_t pool, uint64_t epoch, rgw_bucket_dir_entry& ent,
RGWObjCategory category, std::list<rgw_obj_index_key> *remove_objs, uint16_t bilog_flags, rgw_zone_set *zones_trace = nullptr);
int cls_obj_complete_del(const DoutPrefixProvider *dpp, BucketShard& bs, std::string& tag, int64_t pool, uint64_t epoch, rgw_obj& obj,
ceph::real_time& removed_mtime, std::list<rgw_obj_index_key> *remove_objs, uint16_t bilog_flags, rgw_zone_set *zones_trace = nullptr);
int cls_obj_complete_cancel(const DoutPrefixProvider *dpp, BucketShard& bs, std::string& tag, rgw_obj& obj,
Contributor

this adds a dpp to lots of functions, apparently for RGWIndexCompletionManager::create_completion(), but the new dpp argument is unused there. can we safely remove these now? that would simplify backports

Contributor Author

ok, removing the dpp, as it is too expensive to cache it inside the callback args

this was happening when async completions happened during reshard.
more information about testing:
https://gist.github.com/yuvalif/d526c0a3a4c5b245b9e951a6c5a10517

we also add more logs to the completion manager.
should allow finding unhandled completions due to reshards.

Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>
@yuvalif yuvalif force-pushed the wip-yuval-completion-dpp branch from 359db94 to 1cc5fdf on February 28, 2022 19:31
@cbodley
Contributor

cbodley commented Feb 28, 2022

merged, thanks @yuvalif!
