Traces
Distributed tracing with span trees, sampling, and cross-service trace propagation.
This document covers how SDKs should add support for Performance Monitoring with Distributed Tracing.
This should give an overview of the APIs that SDKs need to implement, without mandating internal implementation details.
This document uses standard interval notation, where [ and ] indicate closed intervals, which include the endpoints of the interval, while ( and ) indicate open intervals, which exclude the endpoints. An interval [x, y) covers all values starting from x up to but excluding y.
This section describes the options SDKs should expose to configure tracing and performance monitoring.
Tracing is enabled by setting either a tracesSampleRate or tracesSampler. If not set, these options default to undefined or null, making tracing opt-in.
This option is deprecated and should be removed from all SDKs.
tracesSampleRate should be a floating-point number in the range [0, 1]. It represents the probability that any given transaction will be sent to Sentry. So, barring outside influence, 0.0 is a guaranteed 0% chance (none will be sent) and 1.0 is a guaranteed 100% chance (all will be sent). This rate applies equally to all transactions; in other words, each transaction has an equal chance of being marked as sampled = true, based on the tracesSampleRate.
See more about how sampling should be performed below.
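A minimal sketch of a static-rate sampling decision, assuming the SDK compares the rate against a uniform random value in [0, 1). The function names, the injectable rand parameter, and the policy of treating an invalid rate as "do not sample" are illustrative assumptions, not part of this specification.

```python
import random
from typing import Callable, Optional


def is_valid_sample_rate(rate: object) -> bool:
    """Return True if rate is a real number in the closed interval [0, 1]."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(rate, bool) or not isinstance(rate, (int, float)):
        return False
    return 0.0 <= rate <= 1.0  # NaN fails both comparisons and is rejected


def sample_with_rate(
    traces_sample_rate: Optional[float],
    rand: Callable[[], float] = random.random,
) -> bool:
    """Decide the sampled flag for one transaction from a static rate."""
    if traces_sample_rate is None:
        # Option unset: tracing is opt-in, so nothing is sampled.
        return False
    if not is_valid_sample_rate(traces_sample_rate):
        # Hypothetical policy: an invalid rate means "do not sample".
        return False
    # rand() is in [0, 1), so 0.0 never samples and 1.0 always samples.
    return rand() < traces_sample_rate
```

Because the random value lies in the half-open interval [0, 1), the strict comparison gives the two guaranteed outcomes at the endpoints without special-casing them.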
tracesSampler should be a callback function, triggered when a transaction is started. It should be given a samplingContext object and should return a sample rate in the range [0, 1] for the transaction in question. This sample rate should behave the same way as the tracesSampleRate above. The only difference is that it applies only to the newly created transaction, so different transactions can be sampled at different rates. Returning 0.0 should force the transaction to be dropped (set to sampled = false) and returning 1.0 should force the transaction to be sent (set to sampled = true).
Historically, the tracesSampler callback could also return a boolean to force a sampling decision (with false equivalent to 0.0 and true equivalent to 1.0). This behavior is now deprecated and should be removed from all SDKs.
See more about how sampling should be performed below.
See Trace Propagation: tracePropagationTargets.
See Trace Propagation: strictTraceContinuation.
This should be a boolean value that defaults to false. When set to true, transactions should be created for HTTP OPTIONS requests. When set to false, no transactions should be created for HTTP OPTIONS requests. This option is most valuable in backend server SDKs. If it does not make sense for an SDK, it can be omitted.
Because transaction payloads have a maximum size enforced on the ingestion side, SDKs should limit the number of spans that are attached to a transaction. This is similar to how breadcrumbs and other arbitrarily sized lists are limited to prevent accidental misuse. If new spans are added after the maximum is reached, the SDK should drop them and ideally log the drop through its internal logger to aid debugging.
The maxSpans limit should be implemented as an internal, non-configurable constant that defaults to 1000. It may become configurable if there is justification for that in a given platform.
The maxSpans limit may also help avoid transactions that never finish (in platforms that keep a transaction open for as long as spans are open), prevent OOM errors, and generally avoid degraded application performance.
See Trace Propagation: propagateTraceparent.
This SHOULD be a collection of integers, denoting HTTP status codes. If suitable for the platform, the collection MAY also admit pairs of integers, denoting inclusive HTTP status code ranges.
The option applies exclusively to incoming requests, and therefore MUST only be implemented in server SDKs.
The SDK MUST honor this option by inspecting the http.response.status_code attribute on each transaction/root span before it's finished. If the value of this attribute matches one of the status codes in traceIgnoreStatusCodes, the SDK MUST set the transaction's sampling decision to not sampled.
Note that a prerequisite to implement this option is that every HTTP server integration MUST record the http.response.status_code attribute as defined in the OTEL spec.
The SDK MUST emit a debug log denoting why the transaction was dropped. If the SDK implements client reports, it MUST record the dropped transaction with the event_processor discard reason.
This option MUST default to an empty collection if it's introduced in a release with a minor SemVer bump. At the earliest release with a major SemVer bump following its introduction, SDKs SHOULD set the default for this option to the following value (or equivalent if the implementation doesn't admit pairs of integers):
[[301, 303], [305, 399], [401, 404]]
The rationale for this option and default is to not consume a user's span quota to trace requests that are useless for debugging purposes (and can often be triggered by scanning bots).
Examples:
- [403, 404]: don't sample transactions corresponding to requests with status code 403 or 404
- [[300, 399], [401, 404]]: don't sample transactions corresponding to requests with status codes between 300 and 399 (inclusive) or between 401 and 404 (inclusive)
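The matching rule for traceIgnoreStatusCodes can be sketched as follows. This is an illustration under stated assumptions: the function names, the logger name, and the representation of ranges as two-element sequences are hypothetical, and a real SDK would also record a client report where supported.

```python
import logging
from typing import Iterable, Mapping, Sequence, Union

logger = logging.getLogger("sdk.tracing")  # hypothetical logger name

StatusEntry = Union[int, Sequence[int]]


def is_ignored_status(status_code: int, ignore: Iterable[StatusEntry]) -> bool:
    """Return True if status_code matches a single code or an inclusive range."""
    for entry in ignore:
        if isinstance(entry, int):
            if status_code == entry:
                return True
        else:
            low, high = entry  # pair of integers, both bounds inclusive
            if low <= status_code <= high:
                return True
    return False


def apply_trace_ignore_status_codes(
    attributes: Mapping[str, object],
    sampled: bool,
    ignore: Iterable[StatusEntry],
) -> bool:
    """Return the sampling decision to use when the root span finishes."""
    status = attributes.get("http.response.status_code")
    if sampled and isinstance(status, int) and is_ignored_status(status, ignore):
        logger.debug("Dropping transaction: status code %s is ignored", status)
        return False
    return sampled
```

With the empty default collection the decision is returned unchanged, so introducing the option in a minor release does not alter behavior.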
This MUST be a boolean value that defaults to true.
The option controls if attributes with code source information are set on database query spans when the query duration exceeds a given threshold.
The following attributes, or a subset thereof, SHOULD be set on database query spans if the threshold is exceeded. Values for some of the attributes below may be unavailable in some situations for the SDK, and in these cases a subset MAY be provided.
The attributes are described in Sentry's Span Convention Documentation.
- code.file.path
- code.function.name
- code.line.number
A threshold duration, which MUST be a floating point or integer value. The value specifies, in milliseconds, the duration of a database query before code source information is added.
The default value is platform-dependent and SHOULD balance the overhead of adding the information with its utility for queries that exceed the threshold duration for users of the SDK. In Python and PHP Laravel, the default threshold is 100 milliseconds.
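A minimal sketch of the threshold check for database query spans. The function and parameter names are hypothetical, the 100 ms default mirrors the Python and PHP Laravel value mentioned above, and the sketch simply uses the direct caller's frame, whereas a real SDK would typically walk the stack to find the relevant application frame.

```python
import inspect
from typing import Dict


def add_query_source(
    span_attributes: Dict[str, object],
    duration_ms: float,
    enabled: bool = True,         # hypothetical stand-in for the boolean option
    threshold_ms: float = 100.0,  # hypothetical stand-in for the threshold option
) -> None:
    """Attach code source attributes when a query exceeds the threshold."""
    if not enabled or duration_ms <= threshold_ms:
        return
    current = inspect.currentframe()
    caller = current.f_back if current is not None else None
    if caller is None:
        # Source information is unavailable: providing a subset (here, none) is allowed.
        return
    span_attributes["code.file.path"] = caller.f_code.co_filename
    span_attributes["code.function.name"] = caller.f_code.co_name
    span_attributes["code.line.number"] = caller.f_lineno
```

Checking the threshold before touching the stack keeps the overhead close to zero for fast queries, which is the trade-off the default value is meant to balance.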
This MUST be a boolean value that defaults to true.
The option controls whether attributes with code source information are set on outgoing HTTP request spans. When enabled, the attributes SHOULD be attached only when the time between sending the request and receiving the response exceeds a given threshold.
The following attributes, or a subset thereof, SHOULD be set if the threshold is exceeded. Values for some of the attributes below may be unavailable in some situations for the SDK, and in these cases a subset MAY be provided.
The attributes are described in Sentry's Span Convention Documentation.
- code.file.path
- code.function.name
- code.line.number
A threshold duration, which MUST be a floating point or integer value. The value specifies, in milliseconds, the time between sending an HTTP request and receiving its response, after which code source information is added.
The default value is platform-dependent and SHOULD balance the overhead of adding the information with its utility for request-response cycles that exceed the threshold duration for users of the SDK.
As of writing, transactions are implemented as an extension of the Event model.
The distinctive feature of a Transaction is type: "transaction".
Apart from that, the Event gets new fields: spans, contexts.TraceContext.
In memory, spans build up a conceptual tree of timed operations. We call the whole span tree a transaction. Sometimes we use the term "transaction" to refer to a span tree as a whole tree, sometimes to refer specifically to the root span of the tree.
Over the wire, transactions are serialized to JSON as an augmented Event, and sent as envelopes. The different envelope types are for optimizing ingestion (so we can route "transaction events" differently than other events, mostly "error events").
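As a rough illustration of the serialized shape, the sketch below builds a transaction event as a plain dictionary. Only type: "transaction", spans, the trace context, and the timestamp field for the end time come from this document; the remaining key names (transaction, start_timestamp, and the keys inside the trace context) and the helper name are assumptions and should be checked against the event payload documentation.

```python
import json
from typing import Any, Dict, List


def build_transaction_event(
    name: str,
    start_timestamp: float,
    end_timestamp: float,
    trace_id: str,
    span_id: str,
    spans: List[Dict[str, Any]],
) -> Dict[str, Any]:
    """Build an Event-shaped dict that is marked as a transaction."""
    return {
        "type": "transaction",               # distinguishes it from error events
        "transaction": name,                 # assumed key for the transaction name
        "start_timestamp": start_timestamp,  # assumed key for the start time
        "timestamp": end_timestamp,          # end time of the root span
        "contexts": {
            "trace": {"trace_id": trace_id, "span_id": span_id},
        },
        "spans": spans,                      # flat list of child spans
    }


event = build_transaction_event("GET /users", 1.0, 2.5, "a" * 32, "b" * 16, [])
payload = json.dumps(event)  # JSON body that would be placed in an envelope item
```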
In the Sentry UI, you can use Discover to look at all events regardless of type, and the Issues and Performance sections to dive into errors and transactions, respectively. The user-facing tracing documentation explains more of the concepts on the product level.
The Span class stores each individual span in a trace.
The Transaction class is like a span, with a few key differences:
- Transactions have a name, spans don't.
- Transactions must specify the source of their name to indicate how the transaction name was generated.
- Calling the finish method on a span records the span's end timestamp. For transactions, the finish method additionally sends an event to Sentry.
The Transaction class may inherit from Span, but that's an implementation detail. Semantically, transactions represent both the top-level span of a span tree as well as the unit of reporting to Sentry.
- Span Interface
  - When a Span is created, set the startTimestamp to the current time
  - SpanContext is the attribute collection for a Span (can be an implementation detail). When possible, SpanContext should be immutable.
  - Span should have a method startChild which creates a new span with the current span's id as the new span's parentSpanId and the current span's sampled value copied over to the new span's sampled property
  - The startChild method should respect the maxSpans limit, and once the limit is reached the SDK should not create new child spans for the given transaction.
  - Span should have a method called toSentryTrace which returns a string that could be sent as a header called sentry-trace.
  - Span should have a method called iterHeaders (adapt to platform's naming conventions) that returns an iterable or map of header names and values. This is a thin wrapper containing return {"sentry-trace": toSentryTrace()} right now. See continueFromHeaders as to why this exists and should be preferred when writing integrations.
- Transaction Interface
  - A Transaction internally holds a flat list of child Spans (not a tree structure)
  - Transaction additionally has a setName method that sets the name of the transaction
  - Transaction receives a TransactionContext on creation (new property vs. SpanContext is name)
  - Since a Transaction inherits from Span, it has all functions available and can be interacted with like it was a Span
  - A transaction is either sampled (sampled = true) or unsampled (sampled = false), a decision which is either inherited or set once during the transaction's lifetime, and in either case is propagated to all children. Unsampled transactions should not be sent to Sentry.
  - TransactionContext should have a static/ctor method called fromSentryTrace which prefills a TransactionContext with data received from a sentry-trace header value
  - TransactionContext should have a static/ctor method called continueFromHeaders(headerMap) which is really just a thin wrapper around fromSentryTrace(headerMap.get("sentry-trace")) right now. This should be preferred by integration/framework-sdk authors over fromSentryTrace as it hides the exact header names used deeper in the core sdk, and leaves opportunity for using additional headers (from the W3C) in the future without changing all integrations.
- Span.finish()
  - Accepts an optional endTimestamp to allow users to set a custom endTimestamp on the finished span
  - If an endTimestamp value is not provided, set endTimestamp to the current time (in payload timestamp)
- Transaction.finish()
  - super.finish() (call finish on Span)
  - Send it to Sentry only if sampled == true
  - Like spans, can be given an optional endTimestamp value that should be passed into the span.finish() call
  - A Transaction needs to be wrapped in an Envelope and sent to the Envelope Endpoint
  - The Transport should use the same internal queue for Transactions / Events
  - The Transport should implement category-based rate limiting
  - The Transport should deal with wrapping a Transaction in an Envelope internally
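The Span and Transaction behavior listed above can be sketched as follows. This is a simplified illustration, not a reference implementation: names follow Python conventions, the send callable stands in for the envelope and transport layer, the header value layout (trace id, span id, optional sampled flag) is an assumption to be verified against the Trace Propagation document, and returning an unrecorded child once maxSpans is reached is just one possible way to honor the limit.

```python
import time
import uuid
from typing import Callable, Iterator, List, Optional, Tuple

MAX_SPANS = 1000  # internal, non-configurable maxSpans constant


class Span:
    def __init__(
        self,
        trace_id: Optional[str] = None,
        parent_span_id: Optional[str] = None,
        sampled: Optional[bool] = None,
        transaction: Optional["Transaction"] = None,
    ) -> None:
        self.trace_id = trace_id or uuid.uuid4().hex
        self.span_id = uuid.uuid4().hex[:16]
        self.parent_span_id = parent_span_id
        self.sampled = sampled
        self.start_timestamp = time.time()  # set when the span is created
        self.end_timestamp: Optional[float] = None
        self.transaction = transaction

    def start_child(self) -> "Span":
        # The child points at this span and copies its sampled value.
        child = Span(self.trace_id, self.span_id, self.sampled, self.transaction)
        tx = self.transaction
        # Assumption: past the limit the child is returned but never recorded.
        if tx is not None and len(tx.spans) < MAX_SPANS:
            tx.spans.append(child)
        return child

    def to_sentry_trace(self) -> str:
        value = f"{self.trace_id}-{self.span_id}"
        if self.sampled is not None:
            value += "-1" if self.sampled else "-0"
        return value

    def iter_headers(self) -> Iterator[Tuple[str, str]]:
        yield "sentry-trace", self.to_sentry_trace()

    def finish(self, end_timestamp: Optional[float] = None) -> None:
        self.end_timestamp = time.time() if end_timestamp is None else end_timestamp


class Transaction(Span):
    def __init__(
        self,
        name: str,
        send: Callable[["Transaction"], None],
        sampled: Optional[bool] = None,
    ) -> None:
        super().__init__(sampled=sampled)
        self.name = name
        self.spans: List[Span] = []  # flat list of all descendants, not a tree
        self.transaction = self
        self._send = send            # stand-in for envelope and transport

    def set_name(self, name: str) -> None:
        self.name = name

    def finish(self, end_timestamp: Optional[float] = None) -> None:
        super().finish(end_timestamp)
        if self.sampled is True:     # unsampled transactions are never sent
            self._send(self)
```

Keeping the descendants in one flat list on the transaction makes the maxSpans check a single length comparison, regardless of how deep the span tree is.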
Each transaction has a sampling decision, that is, a boolean which declares whether or not it should be sent to Sentry. This should be set exactly once during a transaction's lifetime, and should be stored in an internal sampled boolean.
There are multiple ways a transaction can end up with a sampling decision:
- Random sampling according to a static sample rate set in tracesSampleRate
- Random sampling according to a dynamic sample rate returned by tracesSampler
- Absolute decision (100% chance or 0% chance) returned by tracesSampler
- If the transaction has a parent, inheriting its parent's sampling decision
- Absolute decision passed to startTransaction
If more than one option could apply, the following rules determine which takes precedence:
- If a sampling decision is passed to startTransaction (startTransaction({name: "my transaction", sampled: true})), that decision will be used, regardless of anything else
- If tracesSampler is defined, its decision will be used. It can choose to keep or ignore any parent sampling decision, or use the sampling context data to make its own decision or choose a sample rate for the transaction.
- If tracesSampler is not defined, but there's a parent sampling decision, the parent sampling decision will be used.
- If tracesSampler is not defined and there's no parent sampling decision, tracesSampleRate will be used.
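The precedence rules above can be sketched as a single function. The function name, parameter names, and the injectable rand source are hypothetical; the sketch assumes the rate is compared against a uniform random value in [0, 1) and that tracing stays off when neither option is set.

```python
import random
from typing import Callable, Optional


def decide_sampling(
    explicit_sampled: Optional[bool],
    parent_sampled: Optional[bool],
    traces_sampler: Optional[Callable[[dict], float]],
    traces_sample_rate: Optional[float],
    sampling_context: dict,
    rand: Callable[[], float] = random.random,
) -> bool:
    """Resolve the sampling decision for a newly started transaction."""
    # 1. A decision passed to startTransaction always wins.
    if explicit_sampled is not None:
        return explicit_sampled
    # 2. tracesSampler, if defined, is consulted next. It sees the parent
    #    decision through the sampling context and may keep or ignore it.
    if traces_sampler is not None:
        rate = traces_sampler(sampling_context)
    # 3. Without a sampler, an existing parent decision is inherited.
    elif parent_sampled is not None:
        return parent_sampled
    # 4. Otherwise fall back to the static tracesSampleRate.
    else:
        rate = traces_sample_rate
    if rate is None:
        return False  # neither option set: tracing is opt-in
    return rand() < rate
```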
Note
Transactions should be sampled only by tracesSampleRate or tracesSampler. The sampleRate configuration is used for error events and should not apply to transactions.
If defined, the tracesSampler callback should be passed a samplingContext object, which should include, at minimum:
- The transactionContext with which the transaction was created
- A float/double parentSampleRate which contains the sampling rate passed down from the parent
- A boolean parentSampled which contains the sampling decision passed down from the parent, if any
- Data from an optional customSamplingContext object passed to startTransaction when it is called manually
Depending on the platform, other default data may be included. (For example, for server frameworks, it makes sense to include the request object corresponding to the request the transaction is measuring.)
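For illustration, a user-supplied tracesSampler might look like the sketch below. The samplingContext is modeled as a plain dictionary; the transaction names, the is_checkout key (standing in for data from customSamplingContext, assumed here to be merged at the top level), and the 0.1 fallback rate are all hypothetical.

```python
def traces_sampler(sampling_context: dict) -> float:
    """Return a sample rate in [0, 1] for the transaction being started."""
    transaction_context = sampling_context.get("transactionContext", {})
    name = transaction_context.get("name", "")

    # Never trace health checks, even if the parent was sampled.
    if name == "GET /health":
        return 0.0

    # Otherwise keep the trace consistent by following the parent's decision.
    parent_sampled = sampling_context.get("parentSampled")
    if parent_sampled is not None:
        return 1.0 if parent_sampled else 0.0

    # Custom data passed to startTransaction can raise the rate.
    if sampling_context.get("is_checkout"):
        return 1.0

    # Fallback rate for everything else.
    return 0.1
```

Returning a float for the inherited decision, rather than a boolean, follows the note above that boolean return values are deprecated.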
See Trace Propagation: Sampling Decision Propagation and Propagated Random Value.
If the SDK supports backpressure handling, the overall sampling rate needs to be divided by the downsamplingFactor from the backpressure monitor. See the backpressure spec for more details.
See Trace Propagation: sentry-trace Header for the full header format and sampling value specification.
The Sentry.startTransaction function should take two arguments - the transactionContext passed to the Transaction constructor and an optional customSamplingContext object containing data to be passed to tracesSampler (if defined). It creates a Transaction bound to the current hub and returns the instance. Users interact with the instance for creating child spans and, thus, have to keep track of it themselves.
With Sentry.span users can attach spans to an already ongoing transaction. This property returns a SpanProtocol if a running transaction is bound to the scope; otherwise, it returns nil. Although we recommend users keep track of their own transactions, the SDKs should offer a way to expose auto-generated transactions. SDKs shall bind auto-generated transactions to the scope, making them accessible with Sentry.span. If the SDK has global mode enabled, which specifies whether to use global scope management mode and should be true for client applications and false for server applications, Sentry.span shall return the active transaction. If the user disables global mode, Sentry.span shall return the latest active (unfinished) span.
- Introduce a method called traceHeaders
  - This function returns a header (string) sentry-trace
  - The value should be the trace header string of the Span that is currently on the Scope
- Introduce a method called startTransaction
  - Takes the same two arguments as Sentry.startTransaction
  - Creates a new Transaction instance
  - Should implement sampling as described in more detail in the 'Sampling' section of this document
- Modify the method called captureEvent or captureTransaction
  - Don't set lastEventId for transactions
The Scope holds a reference to the current Span or Transaction.
- Scope: introduce setSpan
  - This can be used internally to pass a Span / Transaction around so that integrations can attach children to it
  - Setting the transaction property on the Scope (legacy) should overwrite the name of the Transaction stored in the Scope, if there is one. With that we give users the option to change the transaction name even if they don't have access to the instance of the Transaction directly.
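A small sketch of the Scope behavior described above. The Span and Transaction classes are minimal stand-ins, the method names follow Python conventions, and the sketch only renames a Transaction that is stored directly on the scope; how an SDK reaches the transaction from a child span is left open.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    op: str = ""


@dataclass
class Transaction(Span):
    name: str = ""


class Scope:
    def __init__(self) -> None:
        self._span: Optional[Span] = None
        self._transaction_name: Optional[str] = None

    def set_span(self, span: Optional[Span]) -> None:
        """Store the current Span or Transaction so integrations can add children."""
        self._span = span

    def get_span(self) -> Optional[Span]:
        return self._span

    def set_transaction(self, name: str) -> None:
        """Legacy setter: remember the name and rename a stored Transaction."""
        self._transaction_name = name
        if isinstance(self._span, Transaction):
            self._span.name = name
```

Usage: after scope.set_span(transaction), calling scope.set_transaction("GET /users/:id") renames the running transaction without the caller holding a reference to it.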
The beforeSend callback is a special Event Processor that we consider to be of most prominent use. Proper Event Processors are often considered internal.
Transactions should not go through beforeSend. However, they are still processed by Event Processors. This is a compromise between some flexibility in dealing with the current implementation of transactions as events, and leaving room for different lifetime hooks for transactions and spans.
Motivations:
- Future-proofing: if users rely on beforeSend for transactions, that would complicate eventually implementing individual span ingestion without breaking user code. As of writing, a transaction is sent as an event, but that is considered an implementation detail.
- API compatibility: users have their existing implementation of beforeSend that only ever had to deal with error events. We introduced transactions as a new type of event. As users upgrade to a new SDK version and start using tracing, their beforeSend would start seeing a new type that their code was not meant to handle. Before transactions, they didn't have to care about different event types at all. There are several possible consequences: breaking user apps; silently and unintentionally dropping transactions; transaction events modified in surprising ways.
- Usability: beforeSend is not a perfect fit for dropping transactions like it is for dropping errors. Errors are a point-in-time event. When errors happen, users have full context in beforeSend and can modify or drop the event before it goes to Sentry. With transactions the flow is different. Transactions are created and then stay open for some time while child spans are created and appended to them. Meanwhile, outgoing HTTP requests propagate the sampling decision of the current transaction to other services. After spans and the transaction are finished, dropping the transaction in a beforeSend-like hook would leave orphan transactions from other services in a trace. Similarly, modifying the sampling decision to "yes" at this late stage would also produce inconsistent traces.
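The rule that transactions bypass beforeSend but still pass through Event Processors can be sketched as follows. Events are modeled as plain dictionaries, and the function name and signature are hypothetical.

```python
from typing import Callable, List, Optional

Event = dict
Processor = Callable[[Event], Optional[Event]]


def prepare_event(
    event: Event,
    event_processors: List[Processor],
    before_send: Optional[Processor] = None,
) -> Optional[Event]:
    """Run the client-side pipeline; return None if the event was dropped."""
    current: Optional[Event] = event
    # Event Processors apply to every event type, including transactions.
    for processor in event_processors:
        current = processor(current)
        if current is None:
            return None
    # beforeSend only ever sees non-transaction events.
    if before_send is not None and current.get("type") != "transaction":
        current = before_send(current)
    return current
```

With this split, a beforeSend written before tracing existed keeps receiving only the event types it was designed for.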