Initial integration of W3C protocol for incoming request#776
Conversation
littleaj
left a comment
There was a problem hiding this comment.
Overall this looks great. I posted a few questions/comments.
This would be good to have a smoke test.
|
|
||
| try { | ||
| if (request == null) { | ||
| InternalLogger.INSTANCE.error("Failed to resolve correlation. request is null."); |
There was a problem hiding this comment.
I'm on the fence if these should be error or warning. Since correllation is not necessarliy in every request, I'm leaning towards warning.
There was a problem hiding this comment.
I am not fully convinced. I think if the websdk is enabled than correlation should be there by default in every request. Further, if request object is somehow null I think it's justified to have error level log.
| */ | ||
| private static void addTracestateInResponseHeader(HttpServletResponse response) { | ||
|
|
||
| if (response.containsHeader(CORRELATION_COMPANION_HEADER_NAME)) { |
There was a problem hiding this comment.
What's the rationale here? Are we opting not to overwrite if something else added the correlation header?
If that's the case, we may want to log something or it could be difficult to determine why correlation isn't working when this evals to true.
There was a problem hiding this comment.
That is correct. However looking here, I don't see any case when this would happen. As this is the only method which sets the Header in the response. I think it is safe to remove.
| public static String retriveTracestate() { | ||
| //check if context is null - no correlation will happen | ||
| if (ThreadContext.getRequestTelemetryContext() == null) { | ||
| InternalLogger.INSTANCE.warn("No correlation wil happen, Thread context is null"); |
There was a problem hiding this comment.
typo: "wil" -> "will"
Also, shouldn't it read "RequestTelemetryContext" instead of "Thread context"?
There was a problem hiding this comment.
so the ThreadContext is the wrapper class which stores the RequestTelemetry in InherritableThreadLocal. When the threads change we want to retrive the context from ThreadLocal.
There was a problem hiding this comment.
I know, but the ThreadContext (call the static instance T) provides the RequestTelemetryContext (call the instance R)... T.get -> R and this checks if R is null but the messages says T is null.
| HttpServletResponse response = (HttpServletResponse) res; | ||
| TelemetryCorrelationUtils.resolveCorrelation(request, response, telemetry); | ||
| if (isW3CEnabled) { | ||
| TraceContextCorrelation.resolveCorrelation(request, response, telemetry); |
There was a problem hiding this comment.
we should be backward compatible and if w3c headers are not present, check for legacy headers
| outGoingTraceParent = new Traceparent(); | ||
|
|
||
| // represents the id of the current request. | ||
| requestTelemetry.setId(outGoingTraceParent.getTraceId() + "-" + outGoingTraceParent.getSpanId()); |
There was a problem hiding this comment.
|outGoingTraceParent.getTraceId().outGoingTraceParent.getSpanId().
There was a problem hiding this comment.
I know these are backward compatibility comments. Understand it's importance. The question I have is, should we make changes in UX to understand correct format too or should SDK set it as UX expects?
My understanding is that the protocol is what defines the contract between UX and SDK.
There was a problem hiding this comment.
if we want to make it work right away - do |outGoingTraceParent.getTraceId().outGoingTraceParent.getSpanId().
there is no UX work planned now to support another contract
| null, incomingTraceparent.getTraceFlags()); | ||
|
|
||
| // set id of this request | ||
| requestTelemetry.setId(outGoingTraceParent.getTraceId() + "-" + outGoingTraceParent.getSpanId()); |
There was a problem hiding this comment.
|outGoingTraceParent.getTraceId().outGoingTraceParent.getSpanId().
| requestTelemetry.getContext().getOperation().setId(outGoingTraceParent.getTraceId()); | ||
|
|
||
| // represents the parent-id of this request which is combination of traceId and incoming spanId | ||
| requestTelemetry.getContext().getOperation().setParentId(outGoingTraceParent.getTraceId() + "-" + |
There was a problem hiding this comment.
|outGoingTraceParent.getTraceId().incomingTraceparent.getSpanId().
| ThreadContext.getRequestTelemetryContext().setTracestate(outboundTracestate); | ||
|
|
||
| // Let the callee know the caller's AppId | ||
| addTracestateInResponseHeader(response); |
There was a problem hiding this comment.
we should not set tracestate in response header
There was a problem hiding this comment.
@lmolkova, what should we pass in the response header to achieve the same functionality? Should we just use the previous header? And why not tracestate?
There was a problem hiding this comment.
W3C does not define response headers. we should continue responding with Request-Context to everything
There was a problem hiding this comment.
That makes sense. I will update this one. The question comes in mind then what happens when one wants to use OpenCensus SDKs and just change the end-point to go to AI endpoint. This wouldn't work definitely. Do we want to introduce this legacy in the brand new implementation.
I am aware .NET has done this! @reyang what are your thoughts here?
PS : The work is minimal, but it's like once we ship something it's difficult to revert back.
| // set parentId as null because this is the is the originating request | ||
| requestTelemetry.getContext().getOperation().setParentId(null); | ||
| } else { | ||
| Traceparent incomingTraceparent = Traceparent.fromString(traceId); |
There was a problem hiding this comment.
depending on the way you chose to validate - you might need to check incomingTraceparent for null and probably catch exceptions.
If you follow comments I've left in the previous PR (never throw, return brand new traceid and spanid on the invalid parent), this is the right usage, but Traceparent.fromString needs to be fixed to use it like that
There was a problem hiding this comment.
Yes, this makes sense. incomingTraceparent can be null and in that case we need to create new TraceId and SpanId. I think it's place resides in Traceparent.fromString(). As I said, there are tests I need to implement :)
There was a problem hiding this comment.
- RequestTelemtery.Id and ParentId format need to be set in the same way as .NET does it
- Backward compatibility: if w3c is configured, but there are not w3c headers, we still need to check for legacy ones (this could be done as separate PR)
- do not return tracestate in response
LocalForwarder does. |
Ok, however I don't think most customers even use Local Forwarder as of today (based on telemetry). The direct channel should be as compatible as Local Forwarder. We can do whatever we want to do and support the UX as of now. However my $0.02 would be, please lets release this as experimental work. UX needs to adhere to the common protocol. That is how essentially systems work. In the final version I would like to see a clean implementation of this protocol which can be treated as a reference. |
|
@reyang @lmolkova @littleaj I have implemented most unit tests to cover W3C integration for Incoming request. I have marked some TODO's where I have questions on what should be desired behavior. |
Initial implementation for integration of W3C protocol for Incoming Request.
Related #775 #716
Please note this is not a complete PR yet. There are some tests which are missing and I will add it. Just pushing the code so that, people can take a look and start commenting on approach.
@lmolkova @reyang please review.
For significant contributions please make sure you have completed the following items: