Skip to content

rgw: Avoid spamming OSD requests on startup#38230

Closed
adamemerson wants to merge 11 commits intoceph:masterfrom
adamemerson:wip-mark-logbacking
Closed

rgw: Avoid spamming OSD requests on startup#38230
adamemerson wants to merge 11 commits intoceph:masterfrom
adamemerson:wip-mark-logbacking

Conversation

@adamemerson
Copy link
Contributor

Currently we send multiple OSD requests on startup to initialize the datalog. This is too many. To reduce this we mark a log (the set of shards) as usable with an omap entry on shard zero and take this as true unless something conflicts with it.

Secondly, initialize FIFOs lazily, so if we aren't doing anything relevant to the data log, don't pay to set them up.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Provide quick and lengthy checks for the backing store of sharded
logs, including a marker on shard 0 to speed things up.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Also don't use svc_cls.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
LazyFIFO opens the FIFO on first access.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Because everybody's sick of not having a container for
non-default-constructible, immovable objects.

Or at least I am.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
That way we don't start sending ops to open a FIFO until we need it.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
To manually complete an asynchronous librados call.

Signed-off-by: Adam C. Emerson <aemerson@redhat.com>
Copy link
Contributor

@mattbenjamin mattbenjamin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't know if I can legitimately approve, but it looks sane

int r_out = 0;
op.omap_get_vals_by_keys({ logmark_omap_key }, &values, &r_out);

auto oid = get_oid(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_oid(0)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We keep the 'check' value on the zeroth shard, so after checking all the shards once, we only have to look at one shard after that.

{
auto cct = static_cast<CephContext*>(ioctx.cct());
bool omap = false;
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you doing this just to clear this off the stack after the check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.


cls_log_entry e;
cls.timelog.prepare_entry(e, ut, {}, key, entry);
cls_log_add_prepare_entry(e, utime_t(ut), {}, key, entry);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't ut already a utime_t (hence the name)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's a real_time.

(I could change the name to make that clearer.)


class RGWDataChangesOmap final : public RGWDataChangesBE {
using centries = std::list<cls_log_entry>;
RGWSI_Cls& cls;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to clarify, is this just to delayer things?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not exactly. Since RGWDataChangesLog has to have an IoCtx to pass to log_acquire_backing and friends, it seemed silly for the backends to have to make their own instead of it just being passed into them. Especially since svc_cls just created and dropped an IoCtx with every call, just having one and using it seemed preferable.

class RGWDataChangesFIFO final : public RGWDataChangesBE {
using centries = std::vector<ceph::buffer::list>;
std::vector<std::unique_ptr<rgw::cls::fifo::FIFO>> fifos;
dynarray<LazyFIFO> fifos;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice improvement

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. dynarray is something I've been wishing for for a long time to solve the 'dynamically sized array of things containing mutexes' problem.

@adamemerson
Copy link
Contributor Author

Copy link
Contributor

@mattbenjamin mattbenjamin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, good run

@smanjara
Copy link
Contributor

smanjara commented Dec 7, 2020

Looks good to me.


namespace ceph {
template<typename T, typename Allocator = std::allocator<T>>
class dynarray {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have you tried using ceph::containers::tiny_vector here? it was added for a similar purpose in #28987

ext_mime_map = nullptr;
}

void rgw_complete_aio_completion(librados::AioCompletion* c, int r) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like this was mostly copied from CB_AioCompleteAndSafe in AioCompletionImpl.h. i worry that if librados changes, this copy could be missed and lead to bugs. could we use CB_AioCompleteAndSafe directly instead?

librados::CB_AioCompleteAndSafe cb(c->pc);
cb(r);

@github-actions
Copy link

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@adamemerson adamemerson deleted the wip-mark-logbacking branch January 26, 2021 19:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants