Skip to content

librbd: Zipkin tracing [GSOC 2016]#10637

Closed
vears91 wants to merge 33 commits intoceph:masterfrom
vears91:wip-rebase-gsoc
Closed

librbd: Zipkin tracing [GSOC 2016]#10637
vears91 wants to merge 33 commits intoceph:masterfrom
vears91:wip-rebase-gsoc

Conversation

@vears91
Copy link

@vears91 vears91 commented Aug 9, 2016

Pass trace information from librbd to librados. This is part of the end-to-end performance visualization Google Summer of Code project, to initialize traces in fio rbd engine and pass them to librados. It is built on the wip-blkin-osdc branch that added Zipkin tracing support to Messenger and Objecter.

Trace information is passed to librbd through new rbd_aio_write_traced and rbd_aio_read_traced functions, that take trace information as one of the parameters.

Signed-off-by: Victor Araujo ve.ar91@gmail.com

agshew and others added 19 commits August 5, 2016 02:31
 * Adds --with-blkin to autoconf (default without)
 * Adds BLKIN_LIBS to necessary automake files
 * Adds documentation for testing BlkKin tracing

Signed-off-by: Andrew Shewmaker <agshew@gmail.com>
Signed-off-by: Marios-Evaggelos Kogias <marioskogias@gmail.com>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
zipkin_trace.h is a wrapper around ztracer.hpp, which provides a stub
implementation when WITH_BLKIN is not defined

Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
when set, Message::decode_trace() will always create a trace for
incoming messages, even if they didn't pass trace information

Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Victor Araujo <ve.ar91@gmail.com>
A rbd_aio_write_traced function was created in the RBD API to be used from fio rbd engine.

A  blkin_trace_infomation* field was added to AioImageRequest, which passes the trace information to the cache requests. A similar field was added to the Bufferhead to be able to pass the trace information when an AioObjectRequest is created.

Signed-off-by: Victor Araujo <ve.ar91@gmail.com>
vh4x and others added 5 commits August 10, 2016 11:26
Signed-off-by: Victor Araujo <ve.ar91@gmail.com>
These changes add an optional blkin_trace_info* parameter for aio_operate and initializes a trace if the trace information is present in librados::IoCtxImpl::aio_operate.

Signed-off-by: Victor Araujo <ve.ar91@gmail.com>
A rbd_aio_read_traced function was added to librbd be able to pass trace information from clients to rbd. The trace information is passed through librbd to librados.

Signed-off-by: Victor Araujo <ve.ar91@gmail.com>
The template changes in AioObjectRequests are incorporated and trace information is passed along.

Signed-off-by: Victor Araujo <ve.ar91@gmail.com>
Signed-off-by: Victor Araujo <ve.ar91@gmail.com>
Signed-off-by: Victor Araujo <ve.ar91@gmail.com>
@alimaredia alimaredia self-assigned this Aug 10, 2016
@alimaredia
Copy link
Contributor

@dillaman

@alimaredia
Copy link
Contributor

5b5d443 might be worth removing. The only part of it that could be useful is the .rst file that gets created, everything else should be removed.

#define CEPH_FEATURE_OSD_HITSET_GMT (1ULL<<54)
#define CEPH_FEATURE_HAMMER_0_94_4 (1ULL<<55)
#define CEPH_FEATURE_NEW_OSDOP_ENCODING (1ULL<<56) /* New, v7 encoding */
#define CEPH_FEATURE_BLKIN_TRACING (1ULL<<57) // enabled by ifdef, don't overlap
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was intended to have its own feature bit, rather than share a bit with other flags. the idea was to allow endpoints to interoperate if one was built with blkin support, and the other was not. but i'm not sure that's actually useful - if we're including blkin as a submodule, we might as well enable and package it by default

so i think we should get rid of the conditional CEPH_FEATURES_BLKIN stuff below, and just update the value of this bit. i'm guessing that we want to overlap with the KRAKEN bit (can @athanatos confirm?):

#define CEPH_FEATURE_BLKIN_TRACING CEPH_FEATURE_SERVER_KRAKEN

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can include this with the kraken feature bit if blkin will always be enabled. However, if we make blkin optional in the build there are potential issues with upgrades when some endpoints have blkin compiled in and others do not. E.g. a post-kraken osd without blkin and a kraken osd with blkin - one sends no tracing info, the other assumes tracing info is there based on the KRAKEN bit and can't decode the non-blkin message.

Keeping it as a separate feature bit that's only turned on in connections where it's compiled in, like this diff shows, seems to avoid this kind of problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jdurgin okay, thanks!

@vears91 could you please update the value of CEPH_FEATURE_BLKIN_TRACING here to the first unused bit you can find?

@cbodley
Copy link
Contributor

cbodley commented Aug 10, 2016

5b5d443 might be worth removing

i'd prefer to keep it, at least in some form, so the original author/sign-offs remain in the git history

url = https://github.com/ceph/isa-l
[submodule "src/blkin"]
path = src/blkin
url = https://github.com/linuxbox2/blkin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we move this under the ceph github org? I can create the repo there

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vear91 added 2 commits August 11, 2016 16:07
…ibrbd.cc

The tests follow the same strategy as rbd_aio_write and rbd_aio_read. These tests check that the call doesn't fail but tracing functionalities would need to be checked with LTTng.

Signed-off-by: Victor Araujo <ve.ar91@gmail.com>
Signed-off-by: Victor Araujo <ve.ar91@gmail.com>
int aio_operate(const std::string& oid, AioCompletion *c,
ObjectWriteOperation *op, snap_t seq,
std::vector<snap_t>& snaps);
std::vector<snap_t>& snaps, const blkin_trace_info *trace_info = nullptr);
Copy link

@dillaman dillaman Aug 15, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaks the current API/ABI (e.g. an older app linked against this would not have put a null pointer on the stack). Same comment below.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could it be fixed by adding an aio_operate_traced function in librados, and calling it from AioObjectRequest if there is trace information, or else defaulting to the regular aio_operate?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since aio_operate is already overloaded many times over, you could just add a new overload which librbd would always call.

@dillaman
Copy link

I am having trouble understanding the life cycle of blkin_trace_info. I am assuming it's the caller's responsibility to allocate one of these objects, but when would it get freed? In the librbd case with the client-side cache enabled, you cannot free it when the read/write's associated AioCompletion fires since it might still be held in ObjectCacher. You most likely would need to implement reference counting.

@cbodley
Copy link
Contributor

cbodley commented Aug 15, 2016

@dillaman struct blkin_trace_info is just 3 integers, and should be stack allocated and copied, rather than allocated on the heap. the same goes for struct blkin_trace and class Trace

@dillaman
Copy link

dillaman commented Aug 15, 2016

@cbodley (1) right now it is being passed around as a pointer and (2) if it should be passed by value would we ever need to worry about the blkin_trace_info structure changing since it will now be part of the librados/librbd API as pass by value?


if (osd_op) {
osd_op->mark_event("commit_queued_for_journal_write");
osd_op->journal_trace.init("journal", &trace_endpoint, &osd_op->store_trace);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this journal_trace.init() should be conditional on if (osd_op->store_trace) instead of just if (osd_op), or we'll create separate traces that aren't attached to a parent

(the call to osd_op->mark_event() should remain outside of if (osd_op->store_trace))

vear91 and others added 5 commits August 15, 2016 17:42
… info

This avoids breaking the API for older apps that are linked against it.

Signed-off-by: Victor Araujo <ve.ar91@gmail.com>
the blkin trace information needs to be present after initial
call to decode_payload()

Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Victor Araujo <ve.ar91@gmail.com>
Signed-off-by: Victor Araujo <ve.ar91@gmail.com>
Calling aio_operate without trace information caused problems in the rbd
unit tests.

Signed-off-by: Victor Araujo <ve.ar91@gmail.com>
Context *m_completion;
Extents m_parent_extents;
bool m_hide_enoent;
const blkin_trace_info *m_trace_info;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as @dillaman pointed out, it's not safe to hold on to the trace info pointers like this

take a look at how Objecter and Op work for an example; they pass optional traces around by ZTracer::Trace*, and store them by value as ZTracer::Trace. we should only be using blkin_trace_info in our public interface (anything in src/include/rados/), then turning it into a Trace object as soon as we can

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wondered why ZTracer::Trace wasn't being used but I was told to pass the trace info to send_write_op. I will try to change it, can you please explain more about the issue with the trace info pointers? What problems can it cause?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking at your aio_write_test_data_traced() test as an example, it's starting the rbd write and waiting for the completion at the same time:

void aio_write_test_data_traced(...) {
  struct blkin_trace_info trace_info;
  rbd_aio_write_traced(image, off, len, test_data, comp, &trace_info);
  rbd_aio_wait_for_complete(comp);

we're passing a pointer to a blkin_trace_info that's on the stack here. but it's not a problem, because the stack stays valid until the operation completes - this operation is basically synchronous. but consider an asynchronous case, where we start the operation in one function, and wait for it later in another:

void start_write(...) {
  struct blkin_trace_info trace_info;
  rbd_aio_write_traced(image, off, len, test_data, comp, &trace_info);
}
void wait_for_write(...) {
  rbd_aio_wait_for_complete(comp);
}

again we pass a pointer to blkin_trace_info on the stack, but this time it goes out of scope right after it starts the write. so if rbd holds on to this pointer and tries it dereference it later, it won't be pointing to valid memory. that's why we need to store a copy of the trace info instead of just the pointer

A ZTracer::Trace is generated and stored by value in AioImageRequest.
ZTracer::Trace* is passed around instead of blkin_trace_info*

Signed-off-by: Victor Araujo <ve.ar91@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants