S3 Span Pointers#6655
Conversation
Datadog ReportBranch report: ✅ 0 Failed, 153143 Passed, 483 Skipped, 1h 29m 53.36s Total Time |
Execution-Time Benchmarks Report ⏱️Execution-time results for samples comparing the following branches/commits: Execution-time benchmarks measure the whole time it takes to execute a program. And are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are shown in red. The following thresholds were used for comparing the execution times:
Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard. Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph). gantt
title Execution time (ms) FakeDbCommand (.NET Framework 4.6.2)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (6655) - mean (70ms) : 66, 73
. : milestone, 70,
master - mean (70ms) : 65, 74
. : milestone, 70,
section CallTarget+Inlining+NGEN
This PR (6655) - mean (1,008ms) : 984, 1032
. : milestone, 1008,
master - mean (1,003ms) : 977, 1029
. : milestone, 1003,
gantt
title Execution time (ms) FakeDbCommand (.NET Core 3.1)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (6655) - mean (103ms) : 101, 105
. : milestone, 103,
master - mean (102ms) : 100, 105
. : milestone, 102,
section CallTarget+Inlining+NGEN
This PR (6655) - mean (687ms) : 663, 710
. : milestone, 687,
master - mean (688ms) : 664, 711
. : milestone, 688,
gantt
title Execution time (ms) FakeDbCommand (.NET 6)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (6655) - mean (89ms) : 87, 91
. : milestone, 89,
master - mean (89ms) : 87, 91
. : milestone, 89,
section CallTarget+Inlining+NGEN
This PR (6655) - mean (639ms) : 623, 656
. : milestone, 639,
master - mean (640ms) : 626, 654
. : milestone, 640,
gantt
title Execution time (ms) HttpMessageHandler (.NET Framework 4.6.2)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (6655) - mean (192ms) : 187, 196
. : milestone, 192,
master - mean (191ms) : 187, 196
. : milestone, 191,
section CallTarget+Inlining+NGEN
This PR (6655) - mean (1,112ms) : 1081, 1144
. : milestone, 1112,
master - mean (1,107ms) : 1081, 1133
. : milestone, 1107,
gantt
title Execution time (ms) HttpMessageHandler (.NET Core 3.1)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (6655) - mean (271ms) : 266, 276
. : milestone, 271,
master - mean (271ms) : 266, 275
. : milestone, 271,
section CallTarget+Inlining+NGEN
This PR (6655) - mean (876ms) : 846, 905
. : milestone, 876,
master - mean (877ms) : 842, 913
. : milestone, 877,
gantt
title Execution time (ms) HttpMessageHandler (.NET 6)
dateFormat X
axisFormat %s
todayMarker off
section Baseline
This PR (6655) - mean (262ms) : 259, 265
. : milestone, 262,
master - mean (262ms) : 259, 266
. : milestone, 262,
section CallTarget+Inlining+NGEN
This PR (6655) - mean (852ms) : 822, 882
. : milestone, 852,
master - mean (855ms) : 817, 892
. : milestone, 855,
|
Benchmarks Report for tracer 🐌Benchmarks for #6655 compared to master:
The following thresholds were used for comparing the benchmark speeds:
Allocation changes below 0.5% are ignored. Benchmark detailsBenchmarks.Trace.ActivityBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.AgentWriterBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.AspNetCoreBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.DbCommandBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.ElasticsearchBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.GraphQLBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.HttpClientBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.ILoggerBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.Log4netBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.NLogBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.RedisBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.SerilogBenchmark - Same speed ✔️ Same allocations ✔️Raw results
Benchmarks.Trace.SpanBenchmark - Slower
|
| Benchmark | diff/base | Base Median (ns) | Diff Median (ns) | Modality |
|---|---|---|---|---|
| Benchmarks.Trace.SpanBenchmark.StartFinishScope‑net472 | 1.122 | 843.86 | 946.93 |
Raw results
| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| master | StartFinishSpan |
net6.0 | 392ns | 0.434ns | 1.68ns | 0.00798 | 0 | 0 | 576 B |
| master | StartFinishSpan |
netcoreapp3.1 | 606ns | 1.38ns | 5.36ns | 0.0079 | 0 | 0 | 576 B |
| master | StartFinishSpan |
net472 | 589ns | 0.892ns | 3.46ns | 0.0916 | 0 | 0 | 578 B |
| master | StartFinishScope |
net6.0 | 481ns | 0.798ns | 3.09ns | 0.00986 | 0 | 0 | 696 B |
| master | StartFinishScope |
netcoreapp3.1 | 725ns | 1.74ns | 6.72ns | 0.00921 | 0 | 0 | 696 B |
| master | StartFinishScope |
net472 | 842ns | 1.98ns | 7.67ns | 0.104 | 0 | 0 | 658 B |
| #6655 | StartFinishSpan |
net6.0 | 396ns | 0.0955ns | 0.37ns | 0.00802 | 0 | 0 | 576 B |
| #6655 | StartFinishSpan |
netcoreapp3.1 | 612ns | 0.446ns | 1.73ns | 0.00795 | 0 | 0 | 576 B |
| #6655 | StartFinishSpan |
net472 | 633ns | 0.201ns | 0.778ns | 0.0918 | 0 | 0 | 578 B |
| #6655 | StartFinishScope |
net6.0 | 493ns | 0.105ns | 0.395ns | 0.00985 | 0 | 0 | 696 B |
| #6655 | StartFinishScope |
netcoreapp3.1 | 721ns | 0.265ns | 0.993ns | 0.00941 | 0 | 0 | 696 B |
| #6655 | StartFinishScope |
net472 | 947ns | 0.762ns | 2.95ns | 0.104 | 0 | 0 | 658 B |
Benchmarks.Trace.TraceAnnotationsBenchmark - Same speed ✔️ Same allocations ✔️
Raw results
| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| master | RunOnMethodBegin |
net6.0 | 615ns | 1.18ns | 4.58ns | 0.00981 | 0 | 0 | 696 B |
| master | RunOnMethodBegin |
netcoreapp3.1 | 922ns | 1.49ns | 5.78ns | 0.00937 | 0 | 0 | 696 B |
| master | RunOnMethodBegin |
net472 | 1.06μs | 2.1ns | 8.12ns | 0.104 | 0 | 0 | 658 B |
| #6655 | RunOnMethodBegin |
net6.0 | 659ns | 0.45ns | 1.74ns | 0.00972 | 0 | 0 | 696 B |
| #6655 | RunOnMethodBegin |
netcoreapp3.1 | 948ns | 0.516ns | 2ns | 0.00914 | 0 | 0 | 696 B |
| #6655 | RunOnMethodBegin |
net472 | 1.07μs | 0.217ns | 0.839ns | 0.104 | 0 | 0 | 658 B |
9786cbe to
3d515cd
Compare
Snapshots difference summaryThe following differences have been observed in committed snapshots. It is meant to help the reviewer. 4 occurrences of : + SpanLinks: [
+ {
+ Attributes: {
+ link.kind: span-pointer,
+ ptr.dir: d,
+ ptr.hash: Guid_4,
+ ptr.kind: aws.s3.object
+ }
+ }
+ ]
4 occurrences of : - aws.requestId: Guid_4,
+ aws.requestId: Guid_5,
4 occurrences of : - aws.requestId: Guid_5,
+ aws.requestId: Guid_6,
4 occurrences of : + SpanLinks: [
+ {
+ Attributes: {
+ link.kind: span-pointer,
+ ptr.dir: d,
+ ptr.hash: Guid_7,
+ ptr.kind: aws.s3.object
+ }
+ }
+ ]
4 occurrences of : - aws.requestId: Guid_6,
+ aws.requestId: Guid_8,
4 occurrences of : - aws.requestId: Guid_7,
+ aws.requestId: Guid_9,
4 occurrences of : - aws.requestId: Guid_8,
+ aws.requestId: Guid_10,
4 occurrences of : - aws.requestId: Guid_9,
+ aws.requestId: Guid_11,
4 occurrences of : - aws.requestId: Guid_10,
+ aws.requestId: Guid_12,
4 occurrences of : - aws.requestId: Guid_11,
+ aws.requestId: Guid_13,
4 occurrences of : - aws.requestId: Guid_12,
+ aws.requestId: Guid_14,
4 occurrences of : + SpanLinks: [
+ {
+ Attributes: {
+ link.kind: span-pointer,
+ ptr.dir: d,
+ ptr.hash: Guid_15,
+ ptr.kind: aws.s3.object
+ }
+ }
+ ]
4 occurrences of : - aws.requestId: Guid_13,
+ aws.requestId: Guid_16,
4 occurrences of : - aws.requestId: Guid_14,
+ aws.requestId: Guid_17,
4 occurrences of : - aws.requestId: Guid_15,
+ aws.requestId: Guid_18,
4 occurrences of : - aws.requestId: Guid_16,
+ aws.requestId: Guid_19,
|
7e81f2c to
ffc46b6
Compare
This comment was marked as resolved.
This comment was marked as resolved.
…| NET5_0_OR_GREATER || NETSTANDARD2_1_OR_GREATER`
Datadog ReportAll test runs ✅ 2 Total Test Services: 0 Failed, 2 Passed Test Services
|
lucaspimentel
left a comment
There was a problem hiding this comment.
Thanks! Someone else should look at SpanPointers and SpanPointersTests since I made a few changes to those that should get reviewed.
|
|
||
| namespace Datadog.Trace.ClrProfiler.Managed.Tests.AutoInstrumentation.AWS.Shared; | ||
|
|
||
| public class SpanPointersTests |
There was a problem hiding this comment.
Could we add tests for null and empty data?
There was a problem hiding this comment.
I've made eTag nullable and added a check+test for it. However, I don't think we need to add logic to handle empty or missing bucket/key, because these values should theoretically always be there. Even if they are missing, all that will happen is we generate a hash with no matching downstream hash (which is not a big deal)
| /// to specify that the new span should not inherit the currently active scope as its parent. | ||
| /// </summary> | ||
| public static readonly ISpanContext None = new ReadOnlySpanContext(traceId: Trace.TraceId.Zero, spanId: 0, serviceName: null); | ||
| public static readonly SpanContext None = new(); |
There was a problem hiding this comment.
Seems like a pretty large change to go from a ReadOnlySpanContext implementation that is an ISpanContext to a full fledged SpanContext
This will change behavior of anything else that uses None potentially.
There was a problem hiding this comment.
Additionally, I am unsure if this needs to be mapped the manual code as well
There was a problem hiding this comment.
I think this could potentially cause different context creation here
There was a problem hiding this comment.
cc @lucaspimentel thoughts on this as I see you recommended it and I am unsure if this was already discussed?
There was a problem hiding this comment.
For context, SpanContext.None and the ReadOnlySpanContext type were added in #2309 and were meant only for public consumption, not for internal use (see the PR description and the related issue for more details). These days, Datadog.Trace.Manual has its own copy of ReadOnlySpanContext here and it's own SpanContext.None to keep the existing API from 2.x to 3.x. I don't think the changes in this PR affect the public API.
As for the internal API, maybe we should keep SpanContext.None and ReadOnlySpanContext.None as two separate properties (of type SpanContext and ReadOnlySpanContext, respectively)? This way the existing code can switch to using ReadOnlySpanContext.None to maintain it's current behavior, and this new code can use SpanContext.None (because it requires a SpanContext, not a ISpanContext). Does that make sense? I can push a quick commit to this PR if you agree.
There was a problem hiding this comment.
Yes that sounds good to me thanks!
And thanks for the historical context around the context!
There was a problem hiding this comment.
Never mind. After trying this out, I'm going to recommend reverting this change for now to be safe and going back to something like SpanContext.Zero, like @nhulston had before.
There's definitely some cleanup/refactoring we can do around SpanContext and ReadOnlySpanContext now that it's all internal APIs, but we should do that in separate PRs.
There was a problem hiding this comment.
I pushed a commit to revert the change and another to rename ZeroContext to Zero. Apologies for all the back and forth 😅 Thank you @bouwkast for catching that.
| span.AddLink(spanLink); | ||
| } | ||
|
|
||
| internal static string ConcatenateComponents(string bucketName, string key, string eTag) |
There was a problem hiding this comment.
Could you get the RFC owners to validate whether this adheres to the expected logic?
There was a problem hiding this comment.
These rules are defined here: https://github.com/DataDog/dd-span-pointer-rules/tree/main/AWS/S3/Object
but we have identical unit tests (and later functional tests once this is out) across the tracers.
…artUploadResponse into a single IS3EtagResponse
lucaspimentel
left a comment
There was a problem hiding this comment.
Changing my review just so this doesn't get merged yet.
Summary of changes
Adds span pointers in S3 for
putObject,copyObject, andcompleteMultipartUploadrequests.Span pointers are similar to Span Links, but for cases when it is impossible to pass the Trace ID and Span ID between the spans that need to be linked.
When the calculated hashes for the upstream and downstream lambdas match, the Datadog backend will automatically link the two traces together.

When clicking on the linked span, a new tab opens linking to the downstream Lambda function that was triggered by this S3 object update.
This feature can be disabled by setting the environment variable
DD_TRACE_AWS_ADD_SPAN_POINTERSto false.Reason for change
Span pointers is a new feature being developed by the Serverless team. This feature already exists in Python & Node, and I'm working on adding it to other runtimes.
Implementation details
Test coverage
Unit tests, integration tests (snapshots), and manual testing on Lambda
Other details
https://docs.google.com/document/d/1I8DLCbUXITaqOK7HLdDCR41WnTSi1WP2OM2J3lGT-rc/edit?tab=t.0#heading=h.ixjggxchsgmp