Skip to content

rgw/tracing: end to end trace for multipart upload #47370

Closed
zenomri wants to merge 7 commits intoceph:mainfrom
zenomri:wip-omri-full-multipart-trace
Closed

rgw/tracing: end to end trace for multipart upload #47370
zenomri wants to merge 7 commits intoceph:mainfrom
zenomri:wip-omri-full-multipart-trace

Conversation

@zenomri
Copy link

@zenomri zenomri commented Jul 31, 2022

The goal of this PR is to unify the trace created for MP upload in the RGW and the subsequent spans created for the same operation in OSDs

Librados already had the blkin_trace_info in some places, but this tracing system isn't used either in RGW or OSD, so we can replace it with open-telemetry trace information.

The blkin trace needs to be removed from all over the place. It needs to be kept for another PR to simplify the current changes made for end-to-end MP tracing.

Screenshot of how the MP trace looks like after this change:

image

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

Omri Zeneva added 6 commits July 31, 2022 03:35
new o-tel trace added for the message header, and encode/decode it for MSOD op

Signed-off-by: Omri Zeneva <ozeneva@redhat.com>
in order to propagate the trace info I added the otel-trace as an extra param. in some places, there already was a blkin trace info, and since it is not used in other places we can safely change it to o-tel trace info.

Signed-off-by: Omri Zeneva <ozeneva@redhat.com>
… aio_opearte

in order to make it possible, I saved the trace info inside the sal::Object, so we can use it later when writing the object to rados
it could be used also later for read ops

Signed-off-by: Omri Zeneva <ozeneva@redhat.com>
Signed-off-by: Omri Zeneva <ozeneva@redhat.com>
… of the span

Signed-off-by: Omri Zeneva <ozeneva@redhat.com>
Signed-off-by: Omri Zeneva <ozeneva@redhat.com>
@github-actions
Copy link

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved


s->trace->SetAttribute(tracing::rgw::UPLOAD_ID, multipart_upload_id);
multipart_trace = tracing::rgw::tracer.add_span(name(), upload->get_trace());
s->object->set_trace(multipart_trace->GetContext());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets start from regular object trace. please move out of the "if multipart" block, and set the regular trace as its parent

reqid.name._num, reqid.tid, reqid.inc);
}
op->osd_parent_span = tracing::osd::tracer.start_trace("op-request-created");
op->osd_parent_span = tracing::osd::tracer.add_span("op-request-created", m->otel_trace);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please keep current behavior. if m->otel_trace is null, just start a new trace

::ObjectOperation *o, AioCompletionImpl *c,
const SnapContext& snap_context, int flags,
const blkin_trace_info *trace_info)
const jspan_context *trace_info)
Copy link
Contributor

@yuvalif yuvalif Jul 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please keep blkin until we deprecate it completely

int aio_operate(const object_t& oid, ::ObjectOperation *o,
AioCompletionImpl *c, const SnapContext& snap_context,
int flags, const blkin_trace_info *trace_info = nullptr);
int flags, const jspan_context *trace_info = nullptr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please keep blkin for now

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assume you mean we can add the jspan_context, but leave blkin for later removal? I don't want us to have to do an unsafe cast to/from a blkin pointer type

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removing blkin separately somehow feels cleaner, but at the same time, I've confirmed with core that blkin can be unceremoniously removed, just saying

Copy link
Contributor

@yuvalif yuvalif Aug 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assume you mean we can add the jspan_context, but leave blkin for later removal? I don't want us to have to do an unsafe cast to/from a blkin pointer type

agree. lets add another function (for now)

removing blkin separately somehow feels cleaner, but at the same time, I've confirmed with core that blkin can be unceremoniously removed, just saying

yes, buts lets remove that in a separate commit, including the removal of all functions that use it, remove it from the git submodules or OS dependencies, make etc.
the only place we should keep it (or a place holder of it) is the wire protocol, so we won't crash if someone sends a message with a blkin trace.

…putobj operations

this commit also include changes for the librados interfaces to support the otel trace info

Signed-off-by: Omri Zeneva <ozeneva@redhat.com>

rgw_placement_rule *pdest_placement = &s->dest_placement;

s->object->set_trace(s->trace->GetContext());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if s->trace is null?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this scenario is not possible. s->trace is initialized unconditionally before the op execution(in rgw_process.cc)

in case we compile with WITH_JAEGER=OFF then s->trace won't be a ptr, but a stub struct (with -> overloading)

@zenomri
Copy link
Author

zenomri commented Aug 4, 2022

thank for the review. I will open a new PR with the right new branch name (trace for Put op instead of Multipart)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants