Problem description
Instrumentations are encouraged to provide as many attributes as possible when starting a span, so that the sampler can make a better-informed decision. However, collecting these attributes can itself be expensive. There is one prominent case in which the sampler won't even look at them: when it simply respects the parent's sampling decision.
Proposal
I think the spec should contain a mechanism for instrumentations to ask the sampler earlier, before even starting to collect expensive attributes. Two ideas:
- Languages that use two-step span creation (via a SpanBuilder) could provide an "isRecordable" method on the SpanBuilder itself that calls the sampler with the information collected so far. One problem with this is that the span name may also be expensive to compute, but that is fixable by making the name optional and providing a setName on the SpanBuilder.
- All languages could provide an isRecordable method on the tracer that takes some arguments, probably only the parent SpanContext.
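
To make the two ideas concrete, here is a minimal Python sketch. None of these names come from an existing OpenTelemetry API: `SpanContext`, `ParentRespectingSampler`, `SpanBuilder`, `Tracer`, `is_recordable`, and `instrument_query` are all hypothetical stand-ins, and the sampler is reduced to one that only follows the parent's decision.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class SpanContext:
    trace_id: int
    sampled: bool


class ParentRespectingSampler:
    """Hypothetical sampler that only looks at the parent's decision."""

    def should_sample(self, parent, name, attributes):
        # Root spans are sampled; children follow the parent.
        return True if parent is None else parent.sampled


@dataclass
class SpanBuilder:
    sampler: ParentRespectingSampler
    parent: Optional[SpanContext] = None
    name: Optional[str] = None  # optional: the name may be expensive too
    attributes: dict = field(default_factory=dict)

    def set_name(self, name):
        self.name = name
        return self

    def set_attribute(self, key, value):
        self.attributes[key] = value
        return self

    def is_recordable(self):
        # Idea 1: ask the sampler with whatever has been collected so far.
        return self.sampler.should_sample(self.parent, self.name, dict(self.attributes))


class Tracer:
    def __init__(self, sampler):
        self.sampler = sampler

    def is_recordable(self, parent):
        # Idea 2: ask with only the parent SpanContext.
        return self.sampler.should_sample(parent, None, {})

    def span_builder(self, parent=None):
        return SpanBuilder(self.sampler, parent)


def instrument_query(tracer, parent, work_log):
    """Instrumentation that skips costly attribute collection when unsampled."""
    builder = tracer.span_builder(parent)
    if builder.is_recordable():
        work_log.append("collected")  # stands in for the expensive work
        builder.set_name("db.query").set_attribute("db.statement", "SELECT 1")
    return builder
```

With an unsampled parent, `instrument_query` never touches the expensive path; with a sampled parent (or no parent) it collects the name and attributes as usual.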
Discussion
Both ideas differ from the current calls to the sampler in that the new calls are (and should be!) made before a span ID is computed. Whether the trace ID should be filled in from the parent context in cases where there is one is debatable. Nevertheless, I think the existing sampler interface is enough to support these use cases if samplers are written a bit defensively. The probability sampler, which does look at the trace ID, might be an exception: it would have to return true for invalid trace IDs, since it cannot know the decision yet.
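As an illustration of what "a bit defensively" could mean for a trace-ID-based sampler, here is a rough Python sketch. The class name, the use of the low 64 bits of the trace ID, and the threshold comparison are assumptions made for the example, not the behaviour of any particular SDK; the only fixed point is that an all-zero trace ID is treated as invalid.

```python
INVALID_TRACE_ID = 0  # an all-zero trace ID is treated as "not known yet"


class DefensiveProbabilitySampler:
    """Hypothetical probability sampler that tolerates early calls."""

    def __init__(self, probability):
        # Assumption: compare the low 64 bits of the trace ID to a bound.
        self._bound = int(probability * (1 << 64))

    def should_sample(self, trace_id):
        if trace_id == INVALID_TRACE_ID:
            # Called before a trace ID exists: the real decision cannot be
            # made yet, so answer "maybe" by returning True.
            return True
        return (trace_id & ((1 << 64) - 1)) < self._bound
```

The early answer is deliberately optimistic: a later call with the real trace ID can still reject the span, whereas an early `False` could never be undone.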
EDIT: One issue with calling Sampler.shouldSample twice is that it may cost performance in the sampled case, especially if the sampler adds or computes attributes for its decision. This could be fixed by adding a new method to the sampler for the early check, so that it is clear that any attributes won't be used. This can also be done in a backwards-compatible way, as the method can easily be substituted with "return true" or "return shouldSample().isSampled()" for samplers that don't support it.
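A sketch of how such a backwards-compatible method might look, in Python. The base class, `SamplingResult`, and the method name `is_recordable` are hypothetical; `parent_sampled` is a simplification that stands in for the parent SpanContext (`None` meaning there is no parent).

```python
class SamplingResult:
    def __init__(self, sampled, attributes=None):
        self._sampled = sampled
        self.attributes = attributes or {}

    def is_sampled(self):
        return self._sampled


class Sampler:
    def should_sample(self, parent_sampled, name, attributes):
        raise NotImplementedError

    def is_recordable(self, parent_sampled):
        # Default for samplers that don't support the new method:
        # equivalent to "return shouldSample().isSampled()".
        return self.should_sample(parent_sampled, None, {}).is_sampled()


class LegacySampler(Sampler):
    """Knows nothing about is_recordable and inherits the fallback."""

    def __init__(self):
        self.calls = 0

    def should_sample(self, parent_sampled, name, attributes):
        self.calls += 1
        # Pretend the decision involves computing extra attributes.
        return SamplingResult(True, {"sampler.note": "computed"})


class ParentAwareSampler(Sampler):
    """Overrides is_recordable so the early check does no attribute work."""

    def should_sample(self, parent_sampled, name, attributes):
        return SamplingResult(self.is_recordable(parent_sampled))

    def is_recordable(self, parent_sampled):
        return True if parent_sampled is None else parent_sampled
```

The legacy sampler still pays for a full `should_sample` call on the early check, which is the cost described above; a sampler that overrides the new method avoids it.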
Another advantage of these methods would be that they could also support explicit tracing suppression (as per #530).