Skip to content

tracing: Add trace sampling configuration to the route, to override the route level#6986

Merged
mattklein123 merged 9 commits intoenvoyproxy:masterfrom
objectiser:routetraceconfig
May 28, 2019
Merged

tracing: Add trace sampling configuration to the route, to override the route level#6986
mattklein123 merged 9 commits intoenvoyproxy:masterfrom
objectiser:routetraceconfig

Conversation

@objectiser
Copy link
Copy Markdown
Contributor

@objectiser objectiser commented May 17, 2019

Description: Enable trace sampling to be overridden at the route level.

Risk Level: low
Testing: added unit tests and manually updated examples/jaeger-tracing to try it out.
Docs Changes: Assume model docs are autogenerated. Not sure if any other docs need to be changed.
Release Notes:
Sampling associated with individual routes can now be overridden to define route specific values.

Fixes #6915

@objectiser objectiser force-pushed the routetraceconfig branch 2 times, most recently from 4cb46dd to 2f4e82b Compare May 17, 2019 18:02
@dio
Copy link
Copy Markdown
Member

dio commented May 18, 2019

/retest

@repokitteh-read-only
Copy link
Copy Markdown

🔨 rebuilding ci/circleci: coverage (failed build)

🐱

Caused by: a #6986 (comment) was created by @dio.

see: more, trace.

@objectiser
Copy link
Copy Markdown
Contributor Author

@dio Coverage now fixed.

@objectiser
Copy link
Copy Markdown
Contributor Author

@dio @mattklein123 Is this ok to merge?

@mattklein123 mattklein123 self-assigned this May 22, 2019
@mattklein123
Copy link
Copy Markdown
Member

@dio can you take a first pass?

@dio dio changed the title tracing: Add trace sampling configuration to the route, to override t… tracing: Add trace sampling configuration to the route, to override the route level May 22, 2019
Copy link
Copy Markdown
Member

@dio dio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, I can see this is following the pattern of the decorator. Just one question and do you mind to add the release note to the version_history as an update?

@objectiser
Copy link
Copy Markdown
Contributor Author

@dio Added version_history entry.

@mattklein123
Copy link
Copy Markdown
Member

@objectiser can you fix format and I will take a look?

/wait

Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks generally looks great. Flushing out a few comments.

/wait

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we have an opportunity here to "redo" the API can we use FractionalPercent? It performs better and also allows for smaller percentages. Also, I think having the defaults be 100% was a mistake in the original tracing code. I guess we should be consistent, but do you think we should consider changing them to 0 potentially? WDYT?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to FractionalPercent.

Regarding changing the sampling - this was done in Istio a while ago, from 100% to 1%, but users still raise issues regularly about not being able to see tracing data - and have to be directed to the FAQ on setting the sampling rate. So there are pros and cons with any setting - but possibly 0 is good for envoy as it essentially means tracing is disabled by default and has to be explicitly configured with a suitable sampling rate.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per above these should return fractional percent objects.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry why is this new check necessary? Can you add a comment?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure can add a comment - essentially previously the call to mutateTracingRequestHeader was performed inside mutateRequestHeaders which was subject to the same condition just above. So thought, even though the call has been pulled out from mutateRequestHeaders, it should still be subject to the same condition? Let me know if not correct.

Is this condition only true for the initial processing of a request at the first proxy? In which case this would prevent the request id being modified mid way through a mesh I assume.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: route_tracing_

…he settings defined on the filter

Signed-off-by: Gary Brown <gary@brownuk.com>
Signed-off-by: Gary Brown <gary@brownuk.com>
Signed-off-by: Gary Brown <gary@brownuk.com>
Signed-off-by: Gary Brown <gary@brownuk.com>
Signed-off-by: Gary Brown <gary@brownuk.com>
Signed-off-by: Gary Brown <gary@brownuk.com>
Signed-off-by: Gary Brown <gary@brownuk.com>
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks great. A few small comments.

/wait

return;
}

envoy::type::FractionalPercent client_sampling = config.tracingConfig()->client_sampling_;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use a const pointer for these variables to avoid the copy?


const std::string& DecoratorImpl::getOperation() const { return operation_; }

RouteTracingImpl::RouteTracingImpl(const envoy::api::v2::route::Tracing& tracing)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you are defaulting these to 100%? Can you double check this and add tests if that is what we want to do for defaults? (Or change the docs).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on your previous comment, I was wondering whether a default of 0% would be appropriate here, as they are specific to the route - however the overall_sampling default of 0% does not seem right.

So think it may be better to use 100% for consistency now, and then if the decision is made to change to 0% at a later date, that can be done at both levels.

… change route tracing sampling defaults to 100%

Signed-off-by: Gary Brown <gary@brownuk.com>
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks great. 2 small comments. @dio any further comments?

/wait

}
if (!tracing.has_random_sampling()) {
random_sampling_.set_numerator(10000);
random_sampling_.set_denominator(envoy::type::FractionalPercent::TEN_THOUSAND);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I don't think there is any reason to actually set the denominator here to 10,000, right? I think just making it 100/100 like the others is sufficient? Mostly just trying to avoid questions later about why this is 10,000 here.

client_sampling.set_numerator(
tracing_config.has_client_sampling() ? tracing_config.client_sampling().value() : 100);
envoy::type::FractionalPercent random_sampling;
random_sampling.set_numerator(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can you add a small TODO/comment here that random sampling historically was an integer and default to out of 10,000 but we should deprecate that and move to a straight fractional percent config? The way we have this now is pretty confusing for historical reasons.

@dio
Copy link
Copy Markdown
Member

dio commented May 27, 2019

This is nice. Thanks! @mattklein123 no more comments from me.

dio
dio previously approved these changes May 27, 2019
Copy link
Copy Markdown
Member

@dio dio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modulo Matt's.

Signed-off-by: Gary Brown <gary@brownuk.com>
@objectiser
Copy link
Copy Markdown
Contributor Author

@mattklein123 Done

Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

/retest

@repokitteh-read-only
Copy link
Copy Markdown

🤷‍♀️ nothing to rebuild.

🐱

Caused by: a #6986 (review) was submitted by @mattklein123.

see: more, trace.

@mattklein123 mattklein123 merged commit e353807 into envoyproxy:master May 28, 2019
@dio
Copy link
Copy Markdown
Member

dio commented May 28, 2019

/azp run

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Overriding trace sampling configuration on a per route basis

3 participants