tech-notes: enhance the context tech note#73940
tech-notes: enhance the context tech note#73940knz wants to merge 2 commits intocockroachdb:masterfrom
Conversation
8c29d04 to
8e3d534
Compare
andreimatei
left a comment
There was a problem hiding this comment.
I'm glad this doc is getting love.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @knz, @nvanbenschoten, and @tbg)
docs/tech-notes/contexts.md, line 3 at r2 (raw file):
# `context.Context` and how it relates to logging and tracing in CockroachDB Original author: Andrei Matei
want to add yourself here?
docs/tech-notes/contexts.md, line 5 at r2 (raw file):
Original author: Andrei Matei <!-- markdown-toc start - Don't edit this section. Run M-x markdown-toc-refresh-toc -->
did you want this emacsism in here?
docs/tech-notes/contexts.md, line 95 at r2 (raw file):
- to identify components in logs and traces. - to palliate a shortcoming in the context abstraction.
good word
docs/tech-notes/contexts.md, line 121 at r2 (raw file):
This is achieved via the `AmbientContext` abstraction: an ambient context, or AC for short, is a data structure that carries the common information that we want to share across all asynchronous goroutines
I wouldn't insist too much on the "asynchronous" aspect of any tasks here. It's less common, but "synchronous" can also also want to be uncancelable. You've already said something about "spawning async tasks" above and I think that's fine cause it's suggestive, but if I were writing this, I wouldn't insist on it.
As you'll see below, I question this whole section - so don't do fixes eagerly.
docs/tech-notes/contexts.md, line 132 at r2 (raw file):
When a component needs to use a “fresh” `context.Context`, without being encumbered by the cancellation policy of another context, it can
nit:s/cancellation policy/cancellation - be precise; the ctx cancellation means something clear that people understand, the "policy" seems nebulous
docs/tech-notes/contexts.md, line 133 at r2 (raw file):
When a component needs to use a “fresh” `context.Context`, without being encumbered by the cancellation policy of another context, it can use `context.Context()` then request the nearest AC instance to *annotate* this context.
you mean context.Background() ?
docs/tech-notes/contexts.md, line 133 at r2 (raw file):
When a component needs to use a “fresh” `context.Context`, without being encumbered by the cancellation policy of another context, it can use `context.Context()` then request the nearest AC instance to *annotate* this context.
I don't really agree with this paragraph, and more generally to the idea that the AmbientCtx is particularly related to terminating a cancellation signal. When you want to terminate a cancellation, there's a better way to do it:
uncanceledCtx = logtags.WithTags(ctx, logtags.FromContext(ctx))
instead of
uncanceledCtx = ambientCtx.Annotate(context.Background())
The former will get you better tags, and also not require you to have an AmbientCtx handy.
In fact, I think there's an even better way - a way to copy everything except the cancellation, implemented in 6441270
Tobi doesn't like it, and we've been arguing about it in various skirmishes (last one here #73554 (comment)).
In any case, I don't think the AmbientCtx is generally relevant in discussion about cancellation.
docs/tech-notes/contexts.md, line 139 at r2 (raw file):
In any case, it is very important to use ambient contexts as much as
possible, to ensure that the common information is effectively
What is this "effectively shared" referring two? The sharring of a logtags.Buffer? I wouldn't particularly talk about such implementation details.
I also wouldn't use big words like "very important" :) . I think we should tell people that AmbientContexts are a popular way of getting log tags, and that the node id tag in particular is important in tests (and beyond, due to cockroach debug merge-logs), and leave it at that.
If I were to nitpick further, this paragraph also seems out of place to me. This section is supposed to be all terminating cancellation, right? And this paragraph is about the importance of log tags. But log tags are generally discussed elsewhere.
docs/tech-notes/contexts.md, line 392 at r2 (raw file):
For this, we associate that component instance with its own `AmbientContext`, and pre-populate it with the tags that each goroutine spawned by that operation should get in its `Context`.
I think perhaps there's some confusion here: AmbientCtx is not particularly aboutn async work of "spawned goroutines". If anything, it is more about synchronous operations flowing through a component (for example, a store). The store wants to add the storeid to anything passing through the store, not just to goroutines spawned by the store.
docs/tech-notes/contexts.md, line 535 at r2 (raw file):
log tag: ctx = logtags.AddTag(ctx, "range-lookup", nil)
s/nil/nil /*value */ - to give a hint to people what that is
docs/tech-notes/contexts.md, line 662 at r2 (raw file):
Note that when invoking `context.Background()`, it's likely that the first thing you'll want is to *annotate* it with the nearest `AmbientContext` to link it to the nearest tracer instance
I personally wouldn't talk about the Tracer here. AnnotateCtxWithSpan() is used much less then AnnotateCtx, probably for good reason. So, at the very least, don't put the reference to the tracer first, and qualify it to be clear that the "association with the Tracer" only happens if you ask for a Span. I would also just talk about spans, and not about the Tracer.
docs/tech-notes/contexts.md, line 713 at r2 (raw file):
| (1) just tag | Yes | Not guaranteed (only if incoming ctx was already annotated) | No | | (2) annotate | No | Yes | No | | (3) sub-span | No | No | Yes |
well, if you combine "logs/traces" like you do in this column, then the answer to (3) sub-span - Identifies the current operation in logs/traces should be yes. Creating a span is the biggest way to identify an operation in a trace.
docs/tech-notes/contexts.md, line 729 at r2 (raw file):
(NB: annotation without creating a tracing span is idempotent.) An exception to this is a network handler function inside a
nit: I wouldn't call this "an exception"; it's consistent with the previous paragraph.
docs/tech-notes/contexts.md, line 758 at r2 (raw file):
behavior, so that we can easily distinguish each iteration in traces. This also explains why we have shortcuts for (1)+(2) and (2)+(3):
what is the shortcut for 1+2 that you're referring to?
andreimatei
left a comment
There was a problem hiding this comment.
btw, one area where I think we've been inconsistent (and I personally have gone back and forth), and the note doesn't really address is whether the tags in AmbientContexts should be overlapping or not. Like, should the store's AmbientCtx have the node id in it? That's not necessary when the store annotates contexts coming from the node - it probably leads to duplicate tags that are de-duped somewhere. But it is necessary if the store wants to annotate context.Background(). I'd be curious what y'all think the pattern for creating AmbientContexts should be.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @knz, @nvanbenschoten, and @tbg)
Release note: None
knz
left a comment
There was a problem hiding this comment.
whether the tags in AmbientContexts should be overlapping or not.
I thought that did not matter, because ...
it probably leads to duplicate tags that are de-duped somewhere
... the logtags package thoroughly de-duplicates, all the time, and we rely on this logic pretty much everywhere (not just in AmbientContext).
I'd be curious what y'all think the pattern for creating AmbientContexts should be.
Personally, I see the AmbientContext in Store as something derived from the one in Node, with more tags.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @nvanbenschoten, and @tbg)
docs/tech-notes/contexts.md, line 3 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
want to add yourself here?
sure
docs/tech-notes/contexts.md, line 5 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
did you want this emacsism in here?
doesn't hurt
docs/tech-notes/contexts.md, line 95 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
good word
do you want me to change it?
docs/tech-notes/contexts.md, line 121 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
I wouldn't insist too much on the "asynchronous" aspect of any tasks here. It's less common, but "synchronous" can also also want to be uncancelable. You've already said something about "spawning async tasks" above and I think that's fine cause it's suggestive, but if I were writing this, I wouldn't insist on it.
As you'll see below, I question this whole section - so don't do fixes eagerly.
I understand your viewpoint now. I excised most of the phrasing from this section, to keep only the most evident points.
docs/tech-notes/contexts.md, line 132 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
nit:s/cancellation policy/cancellation - be precise; the ctx cancellation means something clear that people understand, the "policy" seems nebulous
Removed.
docs/tech-notes/contexts.md, line 133 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
you mean
context.Background()?
Removed.
docs/tech-notes/contexts.md, line 133 at r2 (raw file):
> uncanceledCtx = logtags.WithTags(ctx, logtags.FromContext(ctx))
This mistakenly removes other context.Values we might care about. I think we need an AmbientContext in any case.
But I agree this has nothing to do with cancellation, so it's best to keep that bit silent.
docs/tech-notes/contexts.md, line 139 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
In any case, it is very important to use ambient contexts as much as
possible, to ensure that the common information is effectivelyWhat is this "effectively shared" referring two? The sharring of a
logtags.Buffer? I wouldn't particularly talk about such implementation details.
I also wouldn't use big words like "very important" :) . I think we should tell people thatAmbientContextsare a popular way of getting log tags, and that the node id tag in particular is important in tests (and beyond, due tocockroach debug merge-logs), and leave it at that.If I were to nitpick further, this paragraph also seems out of place to me. This section is supposed to be all terminating cancellation, right? And this paragraph is about the importance of log tags. But log tags are generally discussed elsewhere.
Removed references to cancellation. Added some words to suggest that the AmbientContext is responsible for more than just logging tags.
docs/tech-notes/contexts.md, line 392 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
I think perhaps there's some confusion here:
AmbientCtxis not particularly aboutn async work of "spawned goroutines". If anything, it is more about synchronous operations flowing through a component (for example, a store). The store wants to add the storeid to anything passing through the store, not just to goroutines spawned by the store.
Understood. Removed the reference to "spawned".
docs/tech-notes/contexts.md, line 535 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
s/nil/nil /*value */ - to give a hint to people what that is
Done.
docs/tech-notes/contexts.md, line 662 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
I personally wouldn't talk about the
Tracerhere.AnnotateCtxWithSpan()is used much less thenAnnotateCtx, probably for good reason. So, at the very least, don't put the reference to the tracer first, and qualify it to be clear that the "association with the Tracer" only happens if you ask for aSpan. I would also just talk about spans, and not about theTracer.
Understood. Changed the verbiage to refer to "annotations" and let the responsibility of AmbientContext opaque in this paragraph, which doesn't need details.
docs/tech-notes/contexts.md, line 713 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
well, if you combine "logs/traces" like you do in this column, then the answer to
(3) sub-span-Identifies the current operation in logs/tracesshould beyes. Creating a span is the biggest way to identify an operation in a trace.
Ah, right. Updated.
docs/tech-notes/contexts.md, line 729 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
nit: I wouldn't call this "an exception"; it's consistent with the previous paragraph.
Good point. Changed.
docs/tech-notes/contexts.md, line 758 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
what is the shortcut for
1+2that you're referring to?
That was poor phrasing. I wanted to highlight it's a common pattern in the code. Rephrased.
Release note: None
8e3d534 to
058ab55
Compare
andreimatei
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @knz, @nvanbenschoten, and @tbg)
docs/tech-notes/contexts.md, line 667 at r4 (raw file):
some notes about this section that you seemed to vibe with on slack
So, the way I see it, one should think about the AmbientCtx as a simple holder of some static tags. If you want to put those tags in a ctx, you can use. When do you want to add static tags to a context (as opposed to dynamic tags). I think in two situations:
- At the entry points of some select modules (server,node,replica,queue), you want every operation that passes through your component to get the component's static tag.
- When you're spawning off background work (work that's not tied to any other operation / ctx; usually something done periodically, on a timer), then normally you'd use context.Background(). But instead of context.Background(), you should do everybody a favor and make sure that the server id (node tag at least) is in there. So, AmbientCtx can do that for you.
|
nb: I plan to revisit after the dust settles on #72912 |
No description provided.