[Dynamic Instrumentation] DEBUG-4142 Subscribe to settings change in the debugger manager#8048
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fdbdf39f8e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| var next = new ProbeStatus(_serviceNameProvider(), probeId, status, probeVersion, exception, errorMessage); | ||
| var timedMessage = new TimedMessage | ||
| { | ||
| LastEmit = Clock.UtcNow, |
There was a problem hiding this comment.
Refresh diagnostics service name on re-emit
When the service name changes at runtime (e.g., Tracer.Configure(...)), AddProbeStatus captures the service name once and stores the ProbeStatus in _diagnostics, but GetDiagnostics later re-emits the cached ProbeStatus without refreshing the service field. That means probes installed before a service-name change will keep sending diagnostic statuses under the old service indefinitely unless a new status is generated, which is inconsistent with the stated goal of tracking the current tracer service name. Consider updating cached entries on service-name change or rebuilding the ProbeStatus on each emission.
Useful? React with 👍 / 👎.
This comment has been minimized.
This comment has been minimized.
Execution-Time Benchmarks Report ⏱️Execution-time results for samples comparing This PR (8048) and master. ✅ No regressions detected - check the details below Full Metrics ComparisonFakeDbCommand
HttpMessageHandler
Comparison explanationExecution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are highlighted in **red**. The following thresholds were used for comparing the execution times:
Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard. Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph). Duration chartsFakeDbCommand (.NET Framework 4.8)gantt
title Execution time (ms) FakeDbCommand (.NET Framework 4.8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8048) - mean (69ms) : 67, 70
master - mean (68ms) : 67, 70
section Bailout
This PR (8048) - mean (72ms) : 71, 73
master - mean (72ms) : 71, 73
section CallTarget+Inlining+NGEN
This PR (8048) - mean (1,013ms) : 941, 1086
master - mean (1,008ms) : 937, 1078
FakeDbCommand (.NET Core 3.1)gantt
title Execution time (ms) FakeDbCommand (.NET Core 3.1)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8048) - mean (106ms) : 104, 109
master - mean (105ms) : 103, 108
section Bailout
This PR (8048) - mean (107ms) : 105, 108
master - mean (106ms) : 105, 108
section CallTarget+Inlining+NGEN
This PR (8048) - mean (747ms) : 697, 798
master - mean (744ms) : 704, 784
FakeDbCommand (.NET 6)gantt
title Execution time (ms) FakeDbCommand (.NET 6)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8048) - mean (93ms) : 91, 96
master - mean (93ms) : 91, 96
section Bailout
This PR (8048) - mean (94ms) : 93, 95
master - mean (94ms) : 93, 95
section CallTarget+Inlining+NGEN
This PR (8048) - mean (717ms) : 691, 744
master - mean (714ms) : 687, 742
FakeDbCommand (.NET 8)gantt
title Execution time (ms) FakeDbCommand (.NET 8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8048) - mean (92ms) : 90, 94
master - mean (92ms) : 89, 94
section Bailout
This PR (8048) - mean (93ms) : 92, 94
master - mean (93ms) : 92, 94
section CallTarget+Inlining+NGEN
This PR (8048) - mean (632ms) : 619, 646
master - mean (633ms) : 619, 647
HttpMessageHandler (.NET Framework 4.8)gantt
title Execution time (ms) HttpMessageHandler (.NET Framework 4.8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8048) - mean (193ms) : 189, 197
master - mean (194ms) : 189, 199
section Bailout
This PR (8048) - mean (197ms) : 194, 199
master - mean (198ms) : 194, 202
section CallTarget+Inlining+NGEN
This PR (8048) - mean (1,126ms) : 1061, 1191
master - mean (1,125ms) : 1067, 1183
HttpMessageHandler (.NET Core 3.1)gantt
title Execution time (ms) HttpMessageHandler (.NET Core 3.1)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8048) - mean (277ms) : 272, 283
master - mean (277ms) : 271, 283
section Bailout
This PR (8048) - mean (278ms) : 273, 282
master - mean (277ms) : 273, 280
section CallTarget+Inlining+NGEN
This PR (8048) - mean (936ms) : 897, 975
master - mean (936ms) : 894, 979
HttpMessageHandler (.NET 6)gantt
title Execution time (ms) HttpMessageHandler (.NET 6)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8048) - mean (270ms) : 265, 275
master - mean (271ms) : 262, 279
section Bailout
This PR (8048) - mean (269ms) : 266, 273
master - mean (270ms) : 266, 275
section CallTarget+Inlining+NGEN
This PR (8048) - mean (926ms) : 879, 972
master - mean (928ms) : 883, 973
HttpMessageHandler (.NET 8)gantt
title Execution time (ms) HttpMessageHandler (.NET 8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8048) - mean (270ms) : 262, 278
master - mean (269ms) : 263, 275
section Bailout
This PR (8048) - mean (269ms) : 265, 272
master - mean (270ms) : 265, 274
section CallTarget+Inlining+NGEN
This PR (8048) - mean (832ms) : 814, 849
master - mean (830ms) : 811, 849
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
andrewlock
left a comment
There was a problem hiding this comment.
LGTM overall, I expected a much bigger refactor to be required, but this sidesteps it quite well AFAICT 👍
| { | ||
| // Capture service name on first use, and keep it fixed for the lifetime of this API instance |
There was a problem hiding this comment.
How long does the API instance last? Is it recreated when the settings change? If not, won't it send the wrong settings? And if so, we probably don't need to use a Func<string>?
There was a problem hiding this comment.
we probably don't need to use a Func
That is to delay the service name evaluation as much as possible, so we have the highest chance of using the updated value.
|
|
||
| <ItemGroup> | ||
| <ProjectReference Include="..\Samples.Probes.External\Samples.Probes.External.csproj" /> | ||
| <ProjectReference Include="..\..\..\..\..\src\Datadog.Trace.Manual\Datadog.Trace.Manual.csproj" /> |
There was a problem hiding this comment.
I'm not sure, this might cause issues in CI, but hopefully not 🤔
| private volatile DynamicInstrumentation? _dynamicInstrumentation; | ||
| private int _diState; // 0 = disabled, 1 = initializing, 2 = initialized | ||
| private TracerSettings.SettingsManager? _subscribedSettingsManager; | ||
| private IDisposable? _tracerSettingsSubscription; |
There was a problem hiding this comment.
DebuggerManager lives for the lifetime of the app, right? If so this is fine, but if not, you should make sure to dispose this when DebuggerManager is disposed (it only really matters for tests so that we don't leak perf issues 🙂 )
| // Note: `SymbolsUploader` captures the service name on first use (symbol extraction/upload) and then keeps it fixed for its lifetime, | ||
| // to avoid mixing symbols across services. If the service name changes after that point, the correct behavior would be to stop and | ||
| // recreate the uploader, but that is expensive (symbol extraction/upload) and is intentionally deferred until we have an explicit | ||
| // requirement. |
There was a problem hiding this comment.
Let's raise that in the team
fdbdf39 to
27fd749
Compare
Reason for change
Dynamic Instrumentation (and other debugger products) previously used the tracer’s initial/fallback service name when the user didn’t set DD_SERVICE via environment/config, even if the user later set the service name in code. This caused debugger payloads (snapshots/diagnostics) to be attributed to the wrong service.
This change ensures debugger products track the current tracer service name and behave consistently with the tracer configuration.
Implementation details
DebuggerManager subscribes to TracerSettings.SettingsManager.SubscribeToChanges(...) and updates ServiceName whenever MutableSettings changes.
Debugger components that need the service name use a Func provider so they read the latest value when emitting data.
Test coverage
Added a debugger integration test validating service-name updates via code:
New sample test-run:
ChangeServiceNameInCodeTestemits one snapshot before and one snapshot after calling Tracer.Configure(...) with a new ServiceName.New integration test: ServiceNameChangedInCode_IsReflectedInSnapshots installs two probes, collects two snapshots, and asserts:
the “before” snapshot has a default service name (exe name)
the “after” snapshot has service from code
Other details
Not supported in this PR
DogStatsD / metric probes: service name updates are not currently propagated to the DogStatsD client used by debugger metric probes.
Symbol uploader: the symbol uploader intentionally captures the service name on first use and keeps it fixed for the uploader’s lifetime to avoid mixing symbols across services. Supporting service name changes there would require stopping/recreating the uploader (expensive) and is deferred until there’s an explicit requirement.