tracing: fix OSD message breakage with backward compatibility#51288
tracing: fix OSD message breakage with backward compatibility#51288
Conversation
this is fixing backward compatibility issue that was introduced in commit: 5d4c5e6 clients will encode the open-telemetry traces only if server is REEF Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>
|
could you please let me know which test suites to run, so that cephfs and other client failures could be detected |
gregsfortytwo
left a comment
There was a problem hiding this comment.
This isn't complete for several reasons; please do the revert first so we aren't in a hurry trying to fix it and can do things properly.
(Reason 1: this doesn't fix the interplay of encoding versions, compat_version, etc.
Reason 2: Is this going into reef? Because it's not on that branch and it's getting late for breaking changes.
Reason 3: needs more discussion about performance implications and why this is going into the front of the payload.)
|
The patch above didn't go into reef, right? Using the reef flag doesn't seem right. I'm with @gregsfortytwo, please revert #47457 . |
|
Yep, sorry about that -- unfortunately it seems like we lost some institutional memory and now it got more (needed) attention. |
|
And where is the performance analysis? I didn't see it when scrolling through comments. |
|
@yuvalif The reef dev freeze was some time ago. The release is supposed to be quite soon. Changes to the client<->osd messaging layer are (to me, at least) very clearly too risky to casually backport this late in the testing process. The encoding issue uncovered here is a really good example of why. From the history on the PR, it looks like core should probably have taken a look much earlier and given advice on which bits needed to be fixed and how to test it. We'll have to work on that going forward. |
sorry. it is in a separate PR (where tracing was compiled in but disabled in the OSD code): comment with final 70K benchmark results on OSD only (not end2end): #44684 (comment) when tracing is disabled (the default) the difference between PR #44684 and PR#47457 (as replied to a similar question from @rzarzynski) is in boolean deserilization. please let me know if other perf tests are needed for the end2end case? |
|
removing the commits would create a huge change to the log, so i removed the commits using will keep this PR open (with DNM flag) for discussion on the correct way to fix the client-osd compatibility issue |
Please always create a PR with a revert inside; just removing a commit from the history (and force push...) is a no-no.
Yes, the PR was in the queue for a very long time and now we're going to add even more. I truly apologize for that. I believe there is a good reason. As Greg said, we lost the memory on making (and reviewing!) encoding changes. We need to reestablish it. I want to deeply understand the weeds to the last tiny detail and document them / create guidelines / checklists-for-review to prevent such problems in the future.
Because it tends to be risky. I recall that even the |
@gregsfortytwo: I think I know the reason. As this is all about op tracing, the new data are needed early, at least on If so, this requirement knocks out the new-extending-only-version trick. @yuvalif: please correct me if I'm wrong. |
i just copied whatever was done for the blkin tracing - which was added mid messages. |
I looked at the history, and the blkin tracing was added in "luminus", and was controlled by a feature flag when added. |
|
The comment is: encode_trace(payload, features);
// -- above decoded up front; below decoded post-dispatch thread --
encode(client_inc, payload); |
Hmm this probably happened when cleaning up the inter-server compatibility checks and the luminous feature flag, but we may want to bring it back for client compatibility reasons. IIRC there’s a mechanism for that (involving the MASK bits of encoding.h). |
Looks we'll need to dig deeper.
class MOSDOp : public MOSDFastDispatchOp {
static const int HEAD_VERSION = 8;
...
void encode_payload(uint64_t features) override {
...
} else {
// latest v8 encoding with hobject_t hash separate from pgid, no
// reassert version
header.version = HEAD_VERSION;
::encode(pgid, payload);
::encode(hobj.get_hash(), payload);
::encode(osdmap_epoch, payload);
::encode(flags, payload);
::encode(reqid, payload);
+ encode_trace(payload, features);
+
+ // -- above decoded up front; below decoded post-dispatch thread --
+
+void Message::encode_trace(bufferlist &bl, uint64_t features) const
+{
+#ifdef WITH_BLKIN
+ if (features & CEPH_FEATURE_BLKIN_TRACING)
+ ::encode(*trace.get_info(), bl);
+#endif
+}
void Message::encode_trace(bufferlist &bl, uint64_t features) const
{
-#ifdef WITH_BLKIN
- if (features & CEPH_FEATURE_BLKIN_TRACING)
- ::encode(*trace.get_info(), bl);
-#endif
+ auto p = trace.get_info();
+ static const blkin_trace_info empty = { 0, 0, 0 };
+ if (!p) {
+ p = ∅
+ }
+ ::encode(*p, bl);
}
|
i think that at some point we ran out of feature bits, so we changed to use "release bits" like "LUMINUS_SERVER", and repurposed some of the bits taken earlier. |
|
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
After looking into the weeds this seems a bit more complex:
I haven't profiled the code, so I don't know whether this is a real problem. From my gut feelings: I doubt. However, if we want to be paranoid, we could move the is tracing valid? decision to |
|
We really can't use the previous blkin tracing as a model for how feature bits should be used — it was always broken. As @jdurgin noted in June 2017, commit 2525609:
Because CEPH_FEATURE_BLKIN_TRACING was never added to CEPH_FEATURES_ALL — it got a CEPH_FEATURES_BLKIN which was #define'd as 0. So this was always a problem but it apparently got shielded by other feature bits and the struct encoding at the same time. :/ In terms of feature bit re-use, I never really grokked it properly but there's stuff around CEPH_FEATUREMASK and the HAVE_FEATURE MACRO that is supposed to handle the full check and make compatibility work across those bit deprecations etc. |
so, please let me know what you think would be best: |
will give it a try |
|
waiting for this commit: 1049d3e |
|
@athanatos, @gregsfortytwo, @vshankar, @cbodley |
|
closing in favor of: #52114 |
this is fixing backward compatibility issue that was introduced in commit: 5d4c5e6 clients will encode the open-telemetry traces only if server is REEF
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "pacific"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard cephadmjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume toxjenkins test windows