Conversation
For users who already use OpenTelemetry instrumentation, or who would prefer to use it instead of Sentry's own tracing API, this provides a means to disable Sentry's automatic trace instrumentation plus a span processor that exports OpenTelemetry spans to Sentry.

Usage:

```ruby
Sentry.init do |config|
  # ...
  # Enable trace collection
  config.traces_sample_rate = 1.0 # or use traces_sampler
  # Disable automatic span instrumentation
  config.auto_instrument_traces = false
end

OpenTelemetry::SDK.configure do |c|
  # ...
  c.add_span_processor(Sentry::OpenTelemetry::SpanProcessor.new)
end
```

Setting `auto_instrument_traces` to `false` in `Sentry.init` stops Sentry from automatically creating any transactions or spans. (Tracing must still be enabled with `traces_sample_rate` or `traces_sampler`.)

`Sentry::OpenTelemetry::SpanProcessor` is notified whenever an OTel span is started or stopped. In reaction, it starts or stops a Sentry transaction (if the span has no parent) or a Sentry span. When the transaction is finished, it is picked up and processed by Sentry's worker thread like any other transaction.

If the child service uses the `Sentry::Rack::CaptureExceptions` middleware (see the ordering caveat below), then its transactions will be linked inside Sentry with the parent service's transaction.

Caveats:

- sentry-rails is not yet supported (haven't looked into it yet).
- `Sentry.init` must be run before `OpenTelemetry::SDK.configure` (Sentry's `Net::HTTP` monkey patch that adds the `sentry-trace` header must run after OTel's monkey patch that adds an HTTP span).
- `Sentry::Rack::CaptureExceptions` must come before `OpenTelemetry::Instrumentation::Rack::Middlewares::TracerMiddleware` in the middleware stack (the hub and scope must be set up before OTel creates any spans).
- Since Sentry spans are started and stopped in reaction to OTel spans starting and stopping, the reported durations might differ slightly. (This should be fixable by overriding the timestamps on the Sentry spans we create to match OTel's.)
- If an OTel root span has a child span that starts after the root span has finished, this will be represented in Sentry as two transactions (one for the original root span, and one for the late child).
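The mapping the description sketches (root OTel span becomes a Sentry transaction, child spans become Sentry spans, related through a span-id map) can be illustrated with a simplified, gem-free sketch. All class names below are stand-ins for illustration only, not the real sentry-opentelemetry or opentelemetry-sdk API:

```ruby
# Stand-in for an OTel span: just the ids and a name.
OtelSpan = Struct.new(:span_id, :parent_span_id, :name)

# Stand-ins for the Sentry objects the processor would create.
class SentryTransaction
  attr_reader :name, :children
  def initialize(name)
    @name = name
    @children = []
  end
end

class SentrySpan
  attr_reader :name
  def initialize(name)
    @name = name
  end
end

class SketchSpanProcessor
  def initialize
    # Maps OTel span ids to the Sentry object created for them.
    @span_map = {}
  end

  # Called when an OTel span starts: a root span (no parent) becomes a
  # Sentry transaction; a child span becomes a Sentry span attached to
  # whatever we created for its parent.
  def on_start(otel_span)
    parent = @span_map[otel_span.parent_span_id]
    sentry_object =
      if parent
        child = SentrySpan.new(otel_span.name)
        parent.children << child if parent.respond_to?(:children)
        child
      else
        SentryTransaction.new(otel_span.name)
      end
    @span_map[otel_span.span_id] = sentry_object
  end

  def lookup(span_id)
    @span_map[span_id]
  end
end

processor = SketchSpanProcessor.new
root  = OtelSpan.new("a1", nil, "GET /users")
child = OtelSpan.new("b2", "a1", "SELECT users")
processor.on_start(root)
processor.on_start(child)
```

On `on_finish` (not shown), the real processor would finish the corresponding Sentry span or transaction and remove it from the map.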
|
Are span processors common in otel sdks? aka could we do this everywhere? |
|
It's a part of the core spec - https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/sdk.md#span-processor, and supported by every major SDK. |
|
ok I'm taking this up now so let's talk about this
Currently only net/http and redis span creation is disabled by this flag. What do we want from this flag? Do we want to turn off ALL instrumentation by Sentry if OTel is present? To prevent OTel and Sentry from stepping on each other's toes, we should just no-op most of the performance part of the SDK (except sampling), and that's a bit more work. If that's what we want, I can think about the best way to do it. |
|
Ideally we should turn off all automatic instrumentation that Sentry does with this flag, but still keep the user's manual instrumentation. The reason I say this is that if we turn off all instrumentation, then we'll have API issues because we need to access … Turning off all instrumentation is maybe cleaner though, because we can just instruct folks to switch to OpenTelemetry usage completely. Perhaps it's easier to start with turning off all automatic instrumentation? And then we can see how that looks/feels? |
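The "turn off automatic instrumentation, keep manual instrumentation" idea could look roughly like this: each automatic integration checks a config flag before installing its patch, while manual APIs are untouched. This is a gem-free sketch; the class and module names are illustrative, not sentry-ruby's actual implementation:

```ruby
# Hypothetical config object carrying the flag discussed above.
class SketchConfig
  attr_accessor :auto_instrument_traces

  def initialize
    @auto_instrument_traces = true
  end
end

# Stand-in for an automatic integration (e.g. the Net::HTTP patch):
# it installs itself only when automatic instrumentation is enabled.
module NetHttpPatch
  def self.install(config)
    return :skipped unless config.auto_instrument_traces
    :installed # in the real SDK this would prepend the monkey patch
  end
end

config = SketchConfig.new
config.auto_instrument_traces = false
```

Manual helpers like `Sentry.with_child_span` would not consult this flag, so user-written spans keep working.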
Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@            Coverage Diff             @@
##           master    #1876      +/-   ##
==========================================
+ Coverage   98.33%   98.40%   +0.07%
==========================================
  Files         151      156       +5
  Lines        9306     9679     +373
==========================================
+ Hits         9151     9525     +374
+ Misses        155      154       -1
```
|
|
ok this is backwards compat now with a new |
|
@st0012, @AbhiPrasad this is very big now, but I think it's ready to ship. We can fix small things and iterate on them after shipping. Please let me know if you have questions! I will comment on the files themselves to explain some decisions for @st0012. I still wanna write some integration tests, but you guys can start reviewing whenever you have time. |
```ruby
TraceData = Struct.new(:trace_id, :span_id, :parent_span_id, :parent_sampled, :baggage)
```

```ruby
class SpanProcessor < ::OpenTelemetry::SDK::Trace::SpanProcessor
  include Singleton
```
I made it a singleton because span_map needs to be accessed in the propagator and I didn't want to add a global variable. This also makes it explicit that there is only one SpanProcessor globally in the whole system.
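The shared-state argument can be shown with Ruby's stdlib `Singleton` module: both call sites receive the very same instance, and therefore the same `span_map`, without introducing a global variable. `SketchProcessor` below is a simplified stand-in, not the real class:

```ruby
require "singleton"

# Both the span processor and the propagator need the same
# span_id => Sentry-span mapping; Singleton gives them one shared
# instance without a global variable.
class SketchProcessor
  include Singleton

  attr_reader :span_map

  def initialize
    @span_map = {}
  end
end

# The "processor side" and the "propagator side" get the same object:
processor_side  = SketchProcessor.instance
propagator_side = SketchProcessor.instance
processor_side.span_map["abc"] = :sentry_span
```

`Singleton` also makes `SketchProcessor.new` private, which documents the "exactly one processor per process" intent in code.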
By globally I assume you mean within the threads of the same process?
When reading the doc, I see:

> This means that the hub used in onStart is the same hub used in onEnd and it means that hubs fork properly in async contexts.
But in Ruby SDK, every thread would have their own Hub. So I'm not sure if a global SpanProcessor can achieve that when it'll probably always use the main hub.
I think this part of the spec is outdated, since we no longer use scope to do anything, it shouldn't matter.
@AbhiPrasad
Yup the spec is outdated, will fix in develop!
|
|
```ruby
sentry_parent_span = @span_map[trace_data.parent_span_id] if trace_data.parent_span_id
```

```ruby
sentry_span = if sentry_parent_span
```
The root span from OTel becomes a transaction, and all other child spans become child spans of that transaction in Sentry.
```ruby
span_context = ::OpenTelemetry::Trace::SpanContext.new(
  trace_id: [trace_id].pack('H*'),
  span_id: [span_id].pack('H*'),
  # we simulate a sampled trace on the otel side and leave the sampling to sentry
```
Sampling the transaction is left to Sentry, so we enforce sampling on the OTel side; we might iterate on this later for better sampling compatibility with OTel.
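The `[trace_id].pack('H*')` calls in the snippet above convert hex-encoded ids into the raw binary form OTel's `SpanContext` expects. Plain Ruby shows the conversion and the round trip back via `unpack1` (the example ids below are arbitrary):

```ruby
# A W3C trace id is 16 bytes (32 hex chars); a span id is 8 bytes (16 hex chars).
trace_id_hex = "d4cda95b652f4a1592b449d5929fda1b"
span_id_hex  = "6e0c63257de34c92"

# pack('H*') reads the string as high-nibble-first hex and emits raw bytes.
trace_id_bin = [trace_id_hex].pack("H*")
span_id_bin  = [span_id_hex].pack("H*")
```

`unpack1("H*")` is the inverse, recovering the lowercase hex representation from the binary string.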
st0012 left a comment:
I'm not familiar with OpenTelemetry, so I'm learning as I review this, and I have some questions:

- Is the goal to let users use opentelemetry-ruby but send the events to Sentry? If that's true, I assume we want them to opt in to one side completely? E.g. not tracing some parts with OpenTelemetry and the rest with Sentry's current tracing APIs/integrations.
- What concurrency model does `opentelemetry-ruby` support, and does it fit with our Hub/thread design?
- The PR description says

  > ... disable Sentry's automatic trace instrumentation and a span processor that exports OpenTelemetry spans to Sentry.

  What'd happen if some of Sentry's automatic traces slip in?

Suggestions:

- We probably want to add something in `sentry-opentelemetry` to enforce/warn users about:

  > The SDK is initialized before the OpenTelemetry SDK is initialized.

- I feel a helper like `Sentry.with_native_tracing?` could be helpful. Like in sentry-rails's railties, we can just NOT register tracing subscribers if it's `false`. Then we won't need to touch the subscribers' logic.
- Most of the changes in `sentry-ruby` could be extracted to a separate PR.
```diff
  end

- def with_child_span(**attributes, &block)
+ def with_child_span(instrumenter: :sentry, **attributes, &block)
```
Could users use both instrumenters (sentry and otel) at the same time? Or it's simpler like: if users chose otel, they'll give up sentry tracing completely.
For now, it's one or the other. I don't know what direction we'll take on this in the future; it depends on a lot of parameters like adoption, the future of the SDKs, etc.
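The "one or the other" behaviour of the `instrumenter:` keyword can be sketched without the real SDK: the helper yields `nil` (so user blocks still run) whenever the caller's instrumenter doesn't match the configured one. `SketchTracer` is hypothetical, not sentry-ruby's actual class:

```ruby
# Simplified sketch of instrumenter gating for with_child_span.
class SketchTracer
  def initialize(instrumenter: :sentry)
    @instrumenter = instrumenter
  end

  # When the SDK is configured for :otel, Sentry's own helper yields nil:
  # user code keeps working, but no Sentry span is created.
  def with_child_span(instrumenter: :sentry)
    return yield(nil) unless instrumenter == @instrumenter

    span = { op: "custom" } # stand-in for a real Sentry span
    yield(span)
  end
end
```

Usage: with `SketchTracer.new(instrumenter: :otel)`, a plain `with_child_span { |span| ... }` call (which defaults to `instrumenter: :sentry`) receives `nil`; with the default `:sentry` configuration, it receives a span.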
Yes, exactly, we want to make it easier for users using otel to switch over to sentry for performance while continuing to use sentry for errors.
Yes, this is what the
The initial design was using scope but we removed that so our hub/scope management should not matter here anymore.
It's fine, not the worst. This is very much an MVP/experiment so we'll just fix it if someone reports a bug.
Apologies for the size, we were iterating on this a fair bit so I wasn't sure what shape it'd finally take. |
|
closing in favour of split PRs |
Closes #1907