fix: delay sampling of tracing spans to finish#712
Merged
Swatinem merged 1 commit intogetsentry:masterfrom Nov 29, 2024
Merged
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #712 +/- ##
==========================================
+ Coverage 72.79% 73.21% +0.42%
==========================================
Files 66 66
Lines 7804 7923 +119
==========================================
+ Hits 5681 5801 +120
+ Misses 2123 2122 -1 |
9d0f853 to
230d6d8
Compare
Swatinem
approved these changes
Nov 29, 2024
github-merge-queue bot
pushed a commit
to firezone/firezone
that referenced
this pull request
Dec 2, 2024
This is another attempt at fixing #7386. Previous PR was #7379. The difference is, this time it works! In the following screenshot, `handle_input` is a currently active span.  I had to make some patches to Sentry, most notably: - getsentry/sentry-rust#708 - getsentry/sentry-rust#712 The way we configure Sentry is quite tricky: First and foremost, we need to understand that the `tracing` adapter for Sentry has a `span_filter` configuration. When a span gets filtered out there, the rest of `sentry-tracing` never sees the data in that span. Thus, in order to capture variables from spans, we need to have a fairly generous span filter. In this PR, we change this span filter to include all spans except those on TRACE level. Secondly, by default, the Sentry SDK doesn't send any spans to the backend, i.e. the sampling rate is 0. Previously, we set the sampling rate to 1.0 because the `span_filter` was already filtering out all non-telemetry spans. A telemetry span is a concept that we invented. It is a span that gets sampled at _creation_ time with a probability of 1%. This is useful because creating a lot of spans is also expensive, so we don't want to do it e.g. on a per-packet basis. With just these configuration options, we now have a problem: We don't want to submit all spans to Sentry but we need the `span_filter` to allow all spans otherwise we can't capture the contextual fields from the span in breadcrumbs. Luckily, the Sentry SDK has another configuration option: `traces_sampler`. The `traces_sampler` gets to compute a sampling rate for each individual span. This allows us to discard all spans from being sent to Sentry unless they are `telemetry` spans. Resolves: #7386.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR is an attempt to completely resolve #617.
Currently, fields from spans are only recorded when the span gets sampled. This currently happens very early, right when the
Transactiongets created. As a result, any event that gets recorded afterwards (i.e. as a breadcrumb) does not have access to the current span fields.By delaying the discarding of
Transactiontofinish, we get to retain all the contextual data and still honor the configured sampling.I ran a performance test of our software with this local branch applied and could not measure any impact of this.