*: don't mix Tracers in tests #73453
Merged
craig[bot] merged 7 commits intocockroachdb:masterfrom Dec 7, 2021
Merged
Conversation
Member
1bed4fc to
0c70d2f
Compare
0c70d2f to
a19fdfe
Compare
abarganier
approved these changes
Dec 7, 2021
Contributor
abarganier
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15)
Different tests are mixing Tracers within the same trace in various ways - meaning that a child span is being created with a different Tracer than the parent. This is not permitted - the Tracer catches it and panics trying to create the child. The tests were getting away with it because tracing is usually disabled (i.e. creating no-op spans doesn't complain about this condition). But if tracing is turned on (like I'm intending to do generally), these tests freak out. This patch fixes the tests by ensuring that Tracers are not mixed like this. One of the common ways in which tests mixing Tracers was by initiating an operation using a Tracer that the test created (e.g. through a Stopper), and then calling into a TestServer using that ctx). In such cases, the solution is to use the Server's Tracer when initiating the operation. There is a discussion to be had about whether this practice of mixing Tracers could be tolerated. The argument for not allowing it goes like this: - child spans have links to the parent and vice-versa, and these links are used to move spans in and out of the active spans registry when they Finish(). For example, a parent will insert all its still-alive children into the registry on Finish(). The registries are per-Tracer, and things get confusing if a span is created by Tracer A but ends up in Tracer B's registry. So actually mixing Tracers in a trace is probably indeed a bad idea. - We could avoid mixing Tracers even if the code attempts to use two of them. For example, if one does: p := trA.StarSpan() c := trB.StartSpan(WithParent(p)) we could imagine redirecting c's creation from trB to trA transparently. So far I've resisted doing so because, even though it'd technically work, it would become quite surprising/ambiguous which registry a span belongs to (e.g. one would reasonably expect c to be found in trB's registry, but it wouldn't). So far, I've attempted to avoid these surprises and disallow this pattern. In production there is one Tracer per Server, so these issues are moot (well, the issues come up in a DemoCluster, where there's multiple Servers, but I think I've fixed those). For tests, I recognize that they generally don't care about tracing at all because we haven't really made background tracing useful for tests (but maybe one day...). So I'll keep my ears open for pain and, if after this round of cleanup, the issue keeps popping up, I'll consider adding support for mixing. Release note: None
Different tests are mixing Tracers within the same trace in various ways - meaning that a child span is being created with a different Tracer than the parent. This is not permitted - the Tracer catches it and panics trying to create the child. The tests were getting away with it because tracing is usually disabled (i.e. creating no-op spans doesn't complain about this condition). But if tracing is turned on (like I'm intending to do generally), these tests freak out. See the first commit in the PR for more context. Release note: None
A couple of tests were using a cluster's stopper to operate on a particular node. This is pretty funky; the node has its own stopper that should be used to guard operations on the node. At the very least, this mix of stoppers was causing two different tracers to be involved in the respective operation, which might not be allowed in the future. Release note: None
Different tests are mixing Tracers within the same trace in various ways - meaning that a child span is being created with a different Tracer than the parent. This is not permitted - the Tracer catches it and panics trying to create the child. The tests were getting away with it because tracing is usually disabled (i.e. creating no-op spans doesn't complain about this condition). But if tracing is turned on (like I'm intending to do generally), these tests freak out. See the first commit in the PR for more context. Release note: None
Different tests are mixing Tracers within the same trace in various ways - meaning that a child span is being created with a different Tracer than the parent. This is not permitted - the Tracer catches it and panics trying to create the child. The tests were getting away with it because tracing is usually disabled (i.e. creating no-op spans doesn't complain about this condition). But if tracing is turned on (like I'm intending to do generally), these tests freak out. See the first commit in the PR for more context. Release note: None
This test was creating a tracing span for its operations for no good reason. I think years ago that span was used for debugging by setting up a tracing integration for this test. I bet it hasn't been used since. In fact, if tracing is enabled for this test, the test currently crashes because that span causes two different Tracers to be used within the same operation, which is currently not allowed. This patch gets rid of this problematic span. Release note: None
Before this patch, attempting to create a child with a different Tracer than the parent would panic is the parent was a "real span", but it wouldn't if the parent was a no-op. This happenstence was allowing tests to get away with mixing Tracers, as tests always run with tracing turned off. Now that all the tests that were mixing Tracers were hopefully fixed, this patch makes it illegal (panic) to mix Tracers even when tracingt is not enabled, for consistency of behavior. Release note: None
a19fdfe to
bdc36d4
Compare
andreimatei
commented
Dec 7, 2021
Contributor
Author
andreimatei
left a comment
There was a problem hiding this comment.
TFTR!
bors r+
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @aayushshah15)
Contributor
|
Build succeeded: |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Different tests are mixing Tracers within the same trace in various
ways - meaning that a child span is being created with a different
Tracer than the parent. This is not permitted - the Tracer catches it
and panics trying to create the child. The tests were getting away with
it because tracing is usually disabled (i.e. creating no-op spans
doesn't complain about this condition). But if tracing is turned on
(like I'm intending to do generally), these tests freak out.
This patch fixes the tests by ensuring that Tracers are not mixed like
this. One of the common ways in which tests mixing Tracers was by
initiating an operation using a Tracer that the test created (e.g.
through a Stopper), and then calling into a TestServer using that ctx).
In such cases, the solution is to use the Server's Tracer when
initiating the operation.
There is a discussion to be had about whether this practice of mixing
Tracers could be tolerated. The argument for not allowing it goes like
this:
child spans have links to the parent and vice-versa, and these links
are used to move spans in and out of the active spans registry when
they Finish(). For example, a parent will insert all its still-alive
children into the registry on Finish(). The registries are per-Tracer,
and things get confusing if a span is created by Tracer A but ends up in
Tracer B's registry. So actually mixing Tracers in a trace is probably
indeed a bad idea.
We could avoid mixing Tracers even if the code attempts to use two of
them. For example, if one does:
p := trA.StarSpan()
c := trB.StartSpan(WithParent(p))
we could imagine redirecting c's creation from trB to trA
transparently. So far I've resisted doing so because, even though it'd
technically work, it would become quite surprising/ambiguous which
registry a span belongs to (e.g. one would reasonably expect c to be
found in trB's registry, but it wouldn't).
So far, I've attempted to avoid these surprises and disallow this
pattern.
In production there is one Tracer per Server, so these issues are moot
(well, the issues come up in a DemoCluster, where there's multiple
Servers, but I think I've fixed those). For tests, I recognize that they
generally don't care about tracing at all because we haven't really made
background tracing useful for tests (but maybe one day...). So I'll keep
my ears open for pain and, if after this round of cleanup, the issue
keeps popping up, I'll consider adding support for mixing.
Release note: None