Implement app-extended-heartbeat telemetry event#8227
Conversation
BenchmarksBenchmark execution time: 2026-03-04 11:15:09 Comparing candidate commit a504db6 in PR branch Found 7 performance improvements and 13 performance regressions! Performance is the same for 159 metrics, 13 unstable metrics. scenario:Benchmarks.Trace.AgentWriterBenchmark.WriteAndFlushEnrichedTraces net6.0
scenario:Benchmarks.Trace.AgentWriterBenchmark.WriteAndFlushEnrichedTraces netcoreapp3.1
scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeArgs netcoreapp3.1
scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeLegacyArgs net6.0
scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeLegacyArgs netcoreapp3.1
scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces net472
scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces netcoreapp3.1
scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSliceWithPool net6.0
scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearch net472
scenario:Benchmarks.Trace.GraphQLBenchmark.ExecuteAsync net6.0
scenario:Benchmarks.Trace.ILoggerBenchmark.EnrichedLog net6.0
scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatBenchmark net6.0
scenario:Benchmarks.Trace.Log4netBenchmark.EnrichedLog net472
scenario:Benchmarks.Trace.RedisBenchmark.SendReceive net6.0
scenario:Benchmarks.Trace.SpanBenchmark.StartFinishSpan netcoreapp3.1
scenario:Benchmarks.Trace.SpanBenchmark.StartFinishTwoScopes netcoreapp3.1
|
Execution-Time Benchmarks Report ⏱️Execution-time results for samples comparing This PR (8227) and master. ✅ No regressions detected - check the details below Full Metrics ComparisonFakeDbCommand
HttpMessageHandler
Comparison explanationExecution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are highlighted in **red**. The following thresholds were used for comparing the execution times:
Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard. Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph). Duration chartsFakeDbCommand (.NET Framework 4.8)gantt
title Execution time (ms) FakeDbCommand (.NET Framework 4.8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8227) - mean (69ms) : 68, 71
master - mean (69ms) : 68, 71
section Bailout
This PR (8227) - mean (74ms) : 72, 75
master - mean (73ms) : 72, 74
section CallTarget+Inlining+NGEN
This PR (8227) - mean (1,046ms) : 996, 1096
master - mean (1,046ms) : 981, 1110
FakeDbCommand (.NET Core 3.1)gantt
title Execution time (ms) FakeDbCommand (.NET Core 3.1)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8227) - mean (108ms) : 105, 111
master - mean (108ms) : 105, 111
section Bailout
This PR (8227) - mean (109ms) : 107, 111
master - mean (108ms) : 107, 110
section CallTarget+Inlining+NGEN
This PR (8227) - mean (750ms) : 710, 790
master - mean (750ms) : 716, 784
FakeDbCommand (.NET 6)gantt
title Execution time (ms) FakeDbCommand (.NET 6)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8227) - mean (95ms) : 93, 97
master - mean (95ms) : 93, 98
section Bailout
This PR (8227) - mean (96ms) : 95, 98
master - mean (96ms) : 95, 98
section CallTarget+Inlining+NGEN
This PR (8227) - mean (736ms) : 704, 767
master - mean (730ms) : 696, 765
FakeDbCommand (.NET 8)gantt
title Execution time (ms) FakeDbCommand (.NET 8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8227) - mean (94ms) : 92, 96
master - mean (94ms) : 91, 96
section Bailout
This PR (8227) - mean (95ms) : 94, 97
master - mean (96ms) : 94, 97
section CallTarget+Inlining+NGEN
This PR (8227) - mean (647ms) : 623, 671
master - mean (638ms) : 625, 651
HttpMessageHandler (.NET Framework 4.8)gantt
title Execution time (ms) HttpMessageHandler (.NET Framework 4.8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8227) - mean (195ms) : 189, 200
master - mean (195ms) : 190, 200
section Bailout
This PR (8227) - mean (198ms) : 195, 202
master - mean (198ms) : 196, 201
section CallTarget+Inlining+NGEN
This PR (8227) - mean (1,155ms) : 1096, 1215
master - mean (1,153ms) : 1094, 1211
HttpMessageHandler (.NET Core 3.1)gantt
title Execution time (ms) HttpMessageHandler (.NET Core 3.1)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8227) - mean (278ms) : 274, 283
master - mean (279ms) : 273, 284
section Bailout
This PR (8227) - mean (281ms) : 275, 286
master - mean (280ms) : 275, 285
section CallTarget+Inlining+NGEN
This PR (8227) - mean (950ms) : 920, 981
master - mean (934ms) : 891, 977
HttpMessageHandler (.NET 6)gantt
title Execution time (ms) HttpMessageHandler (.NET 6)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8227) - mean (273ms) : 268, 278
master - mean (270ms) : 264, 277
section Bailout
This PR (8227) - mean (271ms) : 267, 275
master - mean (270ms) : 267, 274
section CallTarget+Inlining+NGEN
This PR (8227) - mean (938ms) : 906, 971
master - mean (935ms) : 904, 965
HttpMessageHandler (.NET 8)gantt
title Execution time (ms) HttpMessageHandler (.NET 8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8227) - mean (271ms) : 265, 276
master - mean (270ms) : 265, 275
section Bailout
This PR (8227) - mean (270ms) : 266, 274
master - mean (270ms) : 265, 276
section CallTarget+Inlining+NGEN
This PR (8227) - mean (833ms) : 814, 853
master - mean (831ms) : 814, 848
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
a3f275f to
75b6415
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 75b6415c27
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
In the PR description:
one of those was meant to be "heartbeat"? |
d668c5f to
f74e019
Compare
| } | ||
|
|
||
| GetData(config, data); | ||
| public ICollection<ConfigurationKeyValue>? GetFullData() |
There was a problem hiding this comment.
nit: seems like return type can never be null?
There was a problem hiding this comment.
Correct, but in the interface it implements it can be currently
| @@ -43,31 +47,48 @@ public void CopyTo(IConfigurationTelemetry destination) | |||
| /// <returns>Null if there are no changes, or the collector is not yet initialized</returns> | |||
| public ICollection<ConfigurationKeyValue>? GetData() | |||
There was a problem hiding this comment.
should this be renamed, as it took me🐢 some time to understand one is the diff one is a full snapshot? maybe GetDataDiff or GetIncrementalData?
There was a problem hiding this comment.
GetIncrementalData makes sense 👍 We use this same naming across all the collectors though, so I'll do a follow up PR to rename all those if that's ok?
There was a problem hiding this comment.
oh didn't realize that, sounds good for a follow up PR!
We could have more than 2000 dependencies (potentially) which would cause issues in the backend. Instead of tackling that, just omit them (as they're not mandatory). We could update this to include them in the future if required
a5f7e67 to
a504db6
Compare
## Summary of changes Renames `GetData` to `GetIncrementalData` to differentiate from `GetFullData()` ## Reason for change In #8227 it was flagged that `GetData()` and `GetFullData()` are easy to confuse. Renaming `GetData` to `GetIncrementalData` should solve it. ## Implementation details Deterministic rename (thank you IDEs, take that AI) ## Test coverage All covered by existing
Summary of changes
Implements the telemetry v2 extended heartbeat spec
Reason for change
We've been asked to implement the extended heartbeat as the backend needs it for data consistency reasons.
Implementation details
Today, after reporting config keys to the telemetry backend we drop them. After this PR, we keep each poll's
List<T>in aList<List<T>>, and use that to send all the data in the extended heartbeat. We already keep the product and dependency data by design, so it's only configs that needed to change.In addition, having the configs available means that we can now also include this in the telemetry flare data, which should be very useful for support cases.
Unfortunately, the fact that the tracer flare can happen concurrently with the extended heartbeat complicates things slightly, and means we have to add some synchronization around the
GetData()/GetFullData()calls. I don't expect any contention there in practice though, so it shouldn't be a big issue.Test coverage
Existing tests pass, and added some additional unit tests for both the tracer flare and the extended heartbeat. System tests will also be added, which will use the configuration key
DD_TELEMETRY_EXTENDED_HEARTBEAT_INTERVALto control the interval.Other details
Initially had 🤖 look at it, but I didn't like what it did at all, so went rogue and hand coded instead (ok it wrote one of the tests)