tracing: Add trace sampling configuration to the route, to override the route level#6986
Conversation
4cb46dd to
2f4e82b
Compare
|
/retest |
|
🔨 rebuilding |
|
@dio Coverage now fixed. |
|
@dio @mattklein123 Is this ok to merge? |
|
@dio can you take a first pass? |
dio
left a comment
There was a problem hiding this comment.
Looks good overall, I can see this is following the pattern of the decorator. Just one question and do you mind to add the release note to the version_history as an update?
1be06e8 to
1ca93dd
Compare
|
@dio Added version_history entry. |
|
@objectiser can you fix format and I will take a look? /wait |
mattklein123
left a comment
There was a problem hiding this comment.
Thanks generally looks great. Flushing out a few comments.
/wait
api/envoy/api/v2/route/route.proto
Outdated
There was a problem hiding this comment.
Since we have an opportunity here to "redo" the API can we use FractionalPercent? It performs better and also allows for smaller percentages. Also, I think having the defaults be 100% was a mistake in the original tracing code. I guess we should be consistent, but do you think we should consider changing them to 0 potentially? WDYT?
There was a problem hiding this comment.
+1 to FractionalPercent.
Regarding changing the sampling - this was done in Istio a while ago, from 100% to 1%, but users still raise issues regularly about not being able to see tracing data - and have to be directed to the FAQ on setting the sampling rate. So there are pros and cons with any setting - but possibly 0 is good for envoy as it essentially means tracing is disabled by default and has to be explicitly configured with a suitable sampling rate.
include/envoy/router/router.h
Outdated
There was a problem hiding this comment.
Per above these should return fractional percent objects.
There was a problem hiding this comment.
Sorry why is this new check necessary? Can you add a comment?
There was a problem hiding this comment.
Sure can add a comment - essentially previously the call to mutateTracingRequestHeader was performed inside mutateRequestHeaders which was subject to the same condition just above. So thought, even though the call has been pulled out from mutateRequestHeaders, it should still be subject to the same condition? Let me know if not correct.
Is this condition only true for the initial processing of a request at the first proxy? In which case this would prevent the request id being modified mid way through a mesh I assume.
source/common/router/config_impl.h
Outdated
…he settings defined on the filter Signed-off-by: Gary Brown <gary@brownuk.com>
Signed-off-by: Gary Brown <gary@brownuk.com>
Signed-off-by: Gary Brown <gary@brownuk.com>
Signed-off-by: Gary Brown <gary@brownuk.com>
Signed-off-by: Gary Brown <gary@brownuk.com>
Signed-off-by: Gary Brown <gary@brownuk.com>
d3ccb69 to
69a0441
Compare
mattklein123
left a comment
There was a problem hiding this comment.
Thanks, looks great. A few small comments.
/wait
| return; | ||
| } | ||
|
|
||
| envoy::type::FractionalPercent client_sampling = config.tracingConfig()->client_sampling_; |
There was a problem hiding this comment.
can we use a const pointer for these variables to avoid the copy?
source/common/router/config_impl.cc
Outdated
|
|
||
| const std::string& DecoratorImpl::getOperation() const { return operation_; } | ||
|
|
||
| RouteTracingImpl::RouteTracingImpl(const envoy::api::v2::route::Tracing& tracing) |
There was a problem hiding this comment.
I don't think you are defaulting these to 100%? Can you double check this and add tests if that is what we want to do for defaults? (Or change the docs).
There was a problem hiding this comment.
Based on your previous comment, I was wondering whether a default of 0% would be appropriate here, as they are specific to the route - however the overall_sampling default of 0% does not seem right.
So think it may be better to use 100% for consistency now, and then if the decision is made to change to 0% at a later date, that can be done at both levels.
… change route tracing sampling defaults to 100% Signed-off-by: Gary Brown <gary@brownuk.com>
mattklein123
left a comment
There was a problem hiding this comment.
Thanks, looks great. 2 small comments. @dio any further comments?
/wait
source/common/router/config_impl.cc
Outdated
| } | ||
| if (!tracing.has_random_sampling()) { | ||
| random_sampling_.set_numerator(10000); | ||
| random_sampling_.set_denominator(envoy::type::FractionalPercent::TEN_THOUSAND); |
There was a problem hiding this comment.
nit: I don't think there is any reason to actually set the denominator here to 10,000, right? I think just making it 100/100 like the others is sufficient? Mostly just trying to avoid questions later about why this is 10,000 here.
| client_sampling.set_numerator( | ||
| tracing_config.has_client_sampling() ? tracing_config.client_sampling().value() : 100); | ||
| envoy::type::FractionalPercent random_sampling; | ||
| random_sampling.set_numerator( |
There was a problem hiding this comment.
nit: Can you add a small TODO/comment here that random sampling historically was an integer and default to out of 10,000 but we should deprecate that and move to a straight fractional percent config? The way we have this now is pretty confusing for historical reasons.
|
This is nice. Thanks! @mattklein123 no more comments from me. |
Signed-off-by: Gary Brown <gary@brownuk.com>
|
@mattklein123 Done |
|
🤷♀️ nothing to rebuild. |
|
/azp run |
Description: Enable trace sampling to be overridden at the route level.
Risk Level: low
Testing: added unit tests and manually updated examples/jaeger-tracing to try it out.
Docs Changes: Assume model docs are autogenerated. Not sure if any other docs need to be changed.
Release Notes:
Sampling associated with individual routes can now be overridden to define route specific values.
Fixes #6915