Wrap code in new tracing contexts where required#88920
Wrap code in new tracing contexts where required#88920pugnascotia merged 9 commits intoelastic:mainfrom
Conversation
Part of elastic#84369. Split out from elastic#88443. This PR wraps parts of the code either in a new tracing context. This is necessary so that a tracing implementation can use the thread context to propagate tracing headers, but without the code attempting to set the same key twice in the thread context, which is illegal. Note that in some places we actually clear the tracing context completely. This is done where the operation to be performed should have no association with the current trace context. For example, when creating a new index via a REST request, the resulting background tasks for the index should not be associated with the REST request in perpetuity.
|
Pinging @elastic/es-distributed (Team:Distributed) |
|
Pinging @elastic/es-core-infra (Team:Core/Infra) |
|
I'll hold on merging this until @original-brownbear has had a change to review. |
original-brownbear
left a comment
There was a problem hiding this comment.
Commented inline, we have to do this a little differently.
| public void onAfter() { | ||
| request.decRef(); | ||
| } | ||
| public void onFailure(Exception e) {} |
There was a problem hiding this comment.
This is not ok, the executor might directly call onFailure when it rejects the runnable because its queue is full. We have to go through the normal error handling mechanism here in this callback if we are using AbstractRunnable.
| public void onFailure(Exception e) { | ||
| handleSendToLocalException(channel, e, action); | ||
| } | ||
| public void onFailure(Exception e) {} |
There was a problem hiding this comment.
Same here, can't do this. We need to use the callback when using the exeuctor with an AbstractRunnable
|
|
||
| @Override | ||
| public void onAfter() { | ||
| request.decRef(); |
There was a problem hiding this comment.
Where did this go, regardless of the fact that we need to do this differently, it seems we lost the decRef here?
original-brownbear
left a comment
There was a problem hiding this comment.
I think I'd prefer less magic in terms of how we create the ContextPreservingAbstractRunnable and just do it outright as commented on inline. But other than that, this looks like it should work fine.
| /** | ||
| * Should the runnable start a new tracing context before it executes? | ||
| */ | ||
| public boolean useNewTraceContext() { |
There was a problem hiding this comment.
Maybe it makes more sense to just create a utility method on ContextPreservingAbstractRunnable that wraps an AbstractRunnable in a ContextPreservingAbstractRunnable with new trace context true flag set to true and use that?
That way we don't have to leak the high level tracing into low level AbstractRunnable. It's also probably less verbose than overriding to return true in multiple spots.
There was a problem hiding this comment.
I’m struggling to see how this would work. ContextPreservingAbstractRunnable is private and only created in ThreadContext.preserveContext(Runnable), but I need to communicate the need for a new trace context from where the runnable is created in the first place, for example in InboundHandler.
I thought about creating a named sub-class of AbstractRunnable e.g. TracingAbstractRunnable or something like that, but I couldn't see how that would avoid the need to changing AbstractRunnable, at least not without duplicating much of ContextPreservingAbstractRunnable.
I don't think there's a simpler answer for where to insert this logic, because there's a lot of parts that are collaborating here. In the case of InboundHandler, we see:
InboundHandlercreates anAbstractRunnableinstance and schedules it- The scheduler asks the thread context to wrap the runnable
- The thread context potentially wraps the runnable
- The scheduler executes the runnable
AbstractRunnable#runorchestratesdoRun,onFailureetc
We could change the visibility of ContextPreservingAbstractRunnable maybe, and sometimes create one explicitly? e.g. ContextPreservingAbstractRunnable.preservingContext(AbstractRunnable)?
There was a problem hiding this comment.
I think you could just add a method to ThreadContext similar to org.elasticsearch.common.util.concurrent.ThreadContext#preserveContext that takes an AbstractRunnable and does the wrapping for you without even having to expose ContextPreservingAbstractRunnable?
You have the thread context available via the threadPool and the threadpool will just pass through a ContextPreservingAbstractRunnable unchanged when executing => less code in the callsites and no need to change anything about AbstractRunnable right?
There was a problem hiding this comment.
@original-brownbear ah, you're right. I had another go, please take another look.
| public void onResponse(NodesStatsResponse nodesStatsResponse) { | ||
| logger.trace("received node stats response"); | ||
| try (var ignored = threadPool.getThreadContext().clearTraceContext()) { | ||
| client.admin().cluster().nodesStats(nodesStatsRequest, ActionListener.runAfter(new ActionListener<>() { |
There was a problem hiding this comment.
This change is fairly extreme in terms of how noisy it is line wise.
Maybe, also in the sense of future readability of this code, could we extract what's inside the try-with-resources into a new method here and in the other spots that got indented so extremely now? That way the change-set stays small because the code doesn't move in terms of indent level and things don't get much more complicated.
There was a problem hiding this comment.
I refactored most of these sites with the extra wrapping to extract methods.
original-brownbear
left a comment
There was a problem hiding this comment.
LGTM thanks for the extra iterations Rory!
Part of #84369. Split out from #88443. This PR wraps parts of the code
in a new tracing context. This is necessary so that a tracing
implementation can use the thread context to propagate tracing headers,
but without the code attempting to set the same key twice in the thread
context, which is illegal.
Note that in some places we actually clear the tracing context
completely. This is done where the operation to be performed should have
no association with the current trace context. For example, when
creating a new index via a REST request, the resulting background tasks
for the index should not be associated with the REST request in
perpetuity.