rgw/tracing: end to end trace for put obj op#47457
Conversation
|
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
e0fcbc0 to
b665f19
Compare
|
jenkins retest this please |
fb40ddc to
166a146
Compare
| } | ||
| op->osd_parent_span = tracing::osd::tracer.start_trace("op-request-created"); | ||
|
|
||
| if (m->otel_trace.IsValid()) { |
There was a problem hiding this comment.
don't you need to check that m->otel_trace is not null?
There was a problem hiding this comment.
don't you need to check that
m->otel_traceis not null?
no. the otel_trace in the msg is not a pointer. it's the trace info which initialized to noop span context. in case we recieve an otel trace info in the msg protocol, we decode it to that span context.
src/rgw/rgw_sal_store.h
Outdated
| virtual bool have_instance(void) override { return state.obj.key.have_instance(); } | ||
| virtual void clear_instance() override { state.obj.key.instance.clear(); } | ||
| virtual jspan_context& get_trace() { return trace_ctx; } | ||
| virtual void set_trace (jspan_context&& _trace_ctx) { trace_ctx = _trace_ctx; } |
There was a problem hiding this comment.
why do you do "copy" here and "move" in: src/rgw/rgw_sal_filter.h
There was a problem hiding this comment.
This is a move call (&&), so it shouldn't need std::move. I don't know about the filter version, though, because it's calling through.
There was a problem hiding this comment.
the filter also accept jspan_context&& as a parameter
|
jenkins retest this please |
|
jenkins test make check |
5a07bc5 to
359b25f
Compare
|
jenkins test make check |
07f3bc0 to
2ffd073
Compare
|
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
ebc6065 to
e7506af
Compare
…telemetry * build: add opentelemetry to cmake system crimson targets that uses Message.cc/h are built before opentelemetry (o-tel), so we need to build o-tel eralier so we also add the library to the include path earlier this shoud work for WITH_JAEGER flag both the ON/OFF cases, and for librados where the compilation flag is ignored * msg/tracer: add o-tel trace to Messages with decode/encode function in tracer.h some files that uses Message.cc/h just need the encode/decode functions and not all others functions. some crimson targets does not link with ceph_context (common) which is required for tracer.cc file. so we just need to include that functions * librados: Add opentelemtry trace param for aio_operate and operate methods in order to propagate the trace info I added the otel-trace as an extra param. in some places, there already was a blkin trace info, and since it is not used in other places we can safely change it to o-tel trace info. this will be done in another commit, so the cleanup of blkin trace will be in a dedicated commit * osd: use the o-tel trace of the msg as a parent span of the osd trace if there is a valid span in the msg, we will add this op to the request trace, otherwise it will start a new trace for the OSD op * rgw: pass put obj trace info to librados in order to make it possible, I saved the trace info inside the sal::Object, so we can use it later when writing the object to rados it could be used also later for read ops. note the trace field of req_state is initalized only in rgw_process, so it's also required in librgw request flow Signed-off-by: Omri Zeneva <ozeneva@redhat.com>
e7506af to
5d4c5e6
Compare
|
Hey @yuvalif, it's in the yuri11 batch! |
|
Rados suite review: Failures, unrelated: Details: |
|
@yuvalif all looks good on the rados side! |
|
jenkins test windows |
|
jenkins test make check arm64 |
|
BTW, this PR caused the cephfs upgrade client test failures https://tracker.ceph.com/issues/59534. |
|
This PR will also make the latest clients won't be compatible with the |
what are the compatibility requirements between the cephfs clients and the OSDs? |
The osds in old ceph will always drop all the osd reqs. With this PR the That means in the osd side the osd requests will always be something like More detail please see the tracker: https://tracker.ceph.com/issues/59534 |
In cephfs we have already make this case to work smoothly. In the new client side we will check the features bits supported or version first and then encode the requests accordingly. Maybe here can do it like cephfs ? |
| class MOSDOp final : public MOSDFastDispatchOp { | ||
| private: | ||
| static constexpr int HEAD_VERSION = 8; | ||
| static constexpr int HEAD_VERSION = 9; |
There was a problem hiding this comment.
@yuvalif @rzarzynski - This change seems to be causing failures in some fs suite runs where a new(-er) cephfs client talks to old cluster daemons. The client (ceph-fuse) is upgraded to reef (main branch) while the other ceph daemons are nautilus. Details are captured in this tracker: https://tracker.ceph.com/issues/59534
The upgraded client sends an osd_op with (version) v9. The OSD knows to decode upto v8 and would silently drop the request. This seems to be the case with pacific too: https://github.com/ceph/ceph/blob/pacific/src/messages/MOSDOp.h#L402
There was a problem hiding this comment.
i don't see this change in reef. this is currently only on main. see HEAD_VERSION = 8 there https://github.com/ceph/ceph/blob/reef/src/messages/MOSDOp.h#L39
so, my suggestion is to wait for PR: #48971, and revert the version back to 8 - by repurposing the blkin trace entry to send otel traces
There was a problem hiding this comment.
by repurposing the blkin trace entry to send otel traces
doesn't that still change the format of the encoding? i don't think we can reuse version 8 if the contents are different
There was a problem hiding this comment.
there could be a workaround that by using some magic number to distinguish the 2 types of payloads, and decode and throw away if its not there.
on a different note, @vshankar, what happened in the past when we had to change the message format?
There was a problem hiding this comment.
This is on ceph/main and not on ceph/reef, correct? This is a RADOS change, it's your call whether to revert first. Either way, let's find a coherent and testable change that can replace this. This feature has been in progress for a couple of years, if I'm not mistaken.
There was a problem hiding this comment.
reef isn't affected:
- rgw/tracing: end to end trace for put obj op #47457 (comment) and also
- https://github.com/ceph/ceph/blame/reef/src/messages/MOSDOp.h#L39.
Let's revert to not hot-patch-in-a-hurry an encoding problem, then get into the weeds and arrange a meeting / workshop on the encoding stuff to finally merge the redo of this PR.
There was a problem hiding this comment.
No, that doesn't work. You need to guarantee compatibility of the messages at the protocol level, not via a config option that can be done wrong or that can have different values in different places. The RADOS team does this regularly and has a lot of models they should be able to share.
Given how broken the protocol changes here are, I really think we need to revert the whole PR until they can be hashed out properly — doing a quick fix is only going to come back and bite us. :( Not sure how you missed this @rzarzynski the first time around, but can you quickly provide some pointers?
I think I might have misunderstood the earlier conversation -- a feature bit would be a way to resolve this and might be the right one (I'm hoping we don't need to burn a bit that way, but if we need to we will). A config option would be very bad. :)
There was a problem hiding this comment.
Right - I was pointing out for using a feature bit and then choose the encoding versions accordingly.
Anyway, +1 for reverting which cannot be done via the github UI for this change. I guess @yuvalif is on it and the revert will be merged soon...
| encode(flags, payload); | ||
| encode(reqid, payload); | ||
| encode_trace(payload, features); | ||
| encode_otel_trace(payload, features); |
There was a problem hiding this comment.
There's the conversation about the encoding breaking things, and this here is the fundamental issue — you just shoved a new data member into the middle of an encoding, which is an absolutely incompatible change, and didn't do anything to make sure all the various endpoints involved were able to keep communicating.
I'm guessing this was modeled after 6159027, but that looks to have been riding along on an existing check for the SERVER_LUMINOUS flag to keep things okay (or else it was broken and fixed up elsewhere), which this PR doesn't get to take advantage of.
There was a problem hiding this comment.
yes. will post a similar fix that checks on the SERVER_REEF flag.
however, in commit 6159027 the feature bit used in several files, but not in: MOSDOp.h . what is the reason for that? where should the fix be added?
There was a problem hiding this comment.
created a fix here: #51288
if fix is acceptable, will squash into one commit
|
I'd like to revert this until there's a testable fix to merge. |
done here: #51293 |
Following #47370
The goal of this PR is to unify the trace created for put obj operation in the RGW and the subsequent spans created for the same operation in OSDs
Librados already had the blkin_trace_info in some places, but this tracing system isn't used either in RGW or OSD, so we can replace it with open-telemetry trace information.
for now, I just added an extra argument for the relevant methods. cleanup for blkin stuff will be covered in another commit. since it involves changes to many different source files, so we want to make this PR cleaner and leave the cleanup to another PR.
Screenshot of how the putobj trace looks like after this change:
Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard cephadmjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume toxjenkins test windows