tracer/osd/librados/build/rgw: rgw and osd end2end tracing using opentelemetry#52114
tracer/osd/librados/build/rgw: rgw and osd end2end tracing using opentelemetry#52114rzarzynski merged 2 commits intoceph:mainfrom
Conversation
|
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
src/common/tracer.h
Outdated
| // shoudl be called between ENCODE_START, ENCODE_FINISH pair | ||
| inline void nested_encode(const jspan_context& span_ctx, bufferlist& bl, uint64_t f = 0) { | ||
| auto is_valid = span_ctx.IsValid(); | ||
| ceph::encode(is_valid, bl); |
There was a problem hiding this comment.
Doesn't this change break the existing encoding?
There was a problem hiding this comment.
@athanatos this specific change is part of an optimization to avoid parsing a struct when encoding is disabled (by taking the boolean out of the struct). see: ea8f7fb
but regardless of that optimization (that we can remove), adding the new trace struct in the middle of the message, breaks existing encoding.
to solve this issue I'm wating for "squid" feature bit to be merged: #53191
|
This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days. |
|
unstale please |
|
@yuvalif we need this for squid right? |
yes |
cc4823b to
a7e36b8
Compare
|
|
jenkins test make check |
|
jenkins test windows |
|
|
jenkins test make check |
|
jenkins test windows |
|
jenkins test windows |
|
@yuvalif @ljflores This was tested and approved |
|
jenkins test windows |
|
Yet another one: |
|
OK, going to backport just after cutting squid. |
|
The windows check failed due to timeout: I don't see a rebase-to-fresh-main happening before building the binaries. I'm going go through the check with a rebased commit. |
|
Hi, I'm currently preparing a build with this patch and I'll take a closer look. For what is worth, Windows builds use |
|
I was right, it's a message decoding issue, similar to the one that we had before. We're getting this on the OSD side: It looks like this patch (re)introduces a regression that affects compatibility between builds that have telemetry enabled and the ones that don't. The previous PR got reverted: #51331 |
|
In my understanding:
#ifdef HAVE_JAEGER
// ...
void encode(const jspan_context& span, ceph::buffer::list& bl, uint64_t f = 0);
void decode(jspan_context& span_ctx, ceph::buffer::list::const_iterator& bl);
// ...
#else // !HAVE_JAEGER
// ...
inline void encode(const jspan_context& span, bufferlist& bl, uint64_t f=0) {}
inline void decode(jspan_context& span_ctx, ceph::buffer::list::const_iterator& bl) {}
}
#endif // !HAVE_JAEGER
inline void encode(const jspan_context& span_ctx, bufferlist& bl, uint64_t f = 0) {
ENCODE_START(1, 1, bl);
// jaeger is missing, set "is_valid" to false.
bool is_valid = false;
encode(is_valid, bl);
ENCODE_FINISH(bl);
}What is important here is that
-#include "include/buffer.h"
+#include "include/encoding.h"
As this PR also includes |
We aren't currently using jaeger tracing on Windows. The issue is that Windows hosts (or any other host that doesn't use jaeger) are experiencing message decoding failures after a recent change [1]. This change updates the tracer encoding so that messages from non-jaeger hosts may be decoded by services that use jaeger. [1] ceph#47457 Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com> This commit rebrings 3701ffa which got reverted due to an implicit dependency with other revert. Please see ceph#52114 (comment). Conflicts: src/common/tracer.h formatting conflict with 7179ac0
|
thanks @rzarzynski and team for all the help and support! |
We aren't currently using jaeger tracing on Windows. The issue is that Windows hosts (or any other host that doesn't use jaeger) are experiencing message decoding failures after a recent change [1]. This change updates the tracer encoding so that messages from non-jaeger hosts may be decoded by services that use jaeger. [1] ceph#47457 Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com> This commit rebrings 3701ffa which got reverted due to an implicit dependency with other revert. Please see ceph#52114 (comment). Conflicts: src/common/tracer.h formatting conflict with 7179ac0 (cherry picked from commit e145264)
|
The squid backport is: #55625. |
| auto async_operate(ExecutionContext& ctx, IoCtx& io, const std::string& oid, | ||
| ObjectReadOperation *read_op, int flags, | ||
| CompletionToken&& token) | ||
| CompletionToken&& token, const jspan_context* trace_ctx = nullptr) |
There was a problem hiding this comment.
noticed while rebasing #55592 that this argument is unused
no ObjectReadOperation overload of aio_operate() takes a trace pointer. the jspan_context overloads were only added for ObjectWriteOperations
We aren't currently using jaeger tracing on Windows. The issue is that Windows hosts (or any other host that doesn't use jaeger) are experiencing message decoding failures after a recent change [1]. This change updates the tracer encoding so that messages from non-jaeger hosts may be decoded by services that use jaeger. [1] ceph#47457 Signed-off-by: Lucian Petrut <lpetrut@cloudbasesolutions.com> This commit rebrings 3701ffa which got reverted due to an implicit dependency with other revert. Please see ceph#52114 (comment). Conflicts: src/common/tracer.h formatting conflict with 7179ac0
manual testing for end2end tracing is described here: https://gist.github.com/yuvalif/c013710c537e96ba5d301b1bce512240
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "pacific"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard cephadmjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume toxjenkins test windows