Fix serialization allocation in .NET Framework and < .NET Core 3.1 #7884

Merged: andrewlock merged 3 commits into master from andrew/fix-serialization-allocation on Dec 1, 2025
@andrewlock
Member

Summary of changes

  • Fixes "incorrect" generated code from TagsList
    • Removes significant additional overhead during serialization
  • "Fix" vendored System.Buffers code to avoid the same issue

Reason for change

The current generated code for TagsList produces something like this:

private static ReadOnlySpan<byte> DbTypeBytes => new byte[] { 167, 100, 98, 46, 116, 121, 112, 101 };

This looks like it's allocating a new byte[] with every invocation, but the compiler actually optimizes this away to be completely zero-allocation, by embedding the array as part of the dll, and then simply returning a ReadOnlySpan wrapper pointing to this fixed data. You can see this if you look at the generated IL:

  .method private hidebysig static specialname valuetype [System.Runtime]System.ReadOnlySpan`1<unsigned int8>
    get_DbTypeBytes() cil managed
  {
    .maxstack 8

    // [20 58 - 20 109]
    IL_0000: ldsflda      int64 '<PrivateImplementationDetails>'::A06A154BE3B860D0B56FA96C93523B732045BA0BCE2FFD4769109575CF1953BF
    IL_0005: ldc.i4.8
    IL_0006: newobj       instance void valuetype [System.Runtime]System.ReadOnlySpan`1<unsigned int8>::.ctor(void*, int32)
    IL_000b: ret

  } // end of method SqlTags::get_DbTypeBytes

However, in .NET Framework, even though we have vendored ReadOnlySpan<T> so we can get some of the benefits (mostly cleaner code), we don't get this compiler optimization. That means the above code does allocate a new array with every invocation:

  .method private hidebysig static specialname valuetype Datadog.Trace.VendoredMicrosoftCode.System.ReadOnlySpan`1<unsigned int8>
    get_DbTypeBytes() cil managed
  {
    .maxstack 8

    // [20 58 - 20 109]
    IL_0000: ldc.i4.8
    IL_0001: newarr       [netstandard]System.Byte
    IL_0006: dup
    IL_0007: ldtoken      field int64 '<PrivateImplementationDetails>'::A06A154BE3B860D0B56FA96C93523B732045BA0BCE2FFD4769109575CF1953BF
    IL_000c: call         void [netstandard]System.Runtime.CompilerServices.RuntimeHelpers::InitializeArray(class [netstandard]System.Array, valuetype [netstandard]System.RuntimeFieldHandle)
    IL_0011: call         valuetype Datadog.Trace.VendoredMicrosoftCode.System.ReadOnlySpan`1<!0/*unsigned int8*/> valuetype Datadog.Trace.VendoredMicrosoftCode.System.ReadOnlySpan`1<unsigned int8>::op_Implicit(!0/*unsigned int8*/[])
    IL_0016: ret

  } // end of method SqlTags::get_DbTypeBytes

This is... Bad 😅 And it explains the significant serialization overhead identified in #7882 for .NET Framework. I also confirmed this applies to all <.NET Core 3.1 too (because we compile for .NET Standard)
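
The difference between the two IL listings can be reproduced in a small sketch. It is only an illustration under stated assumptions: `FakeSpan` is a hypothetical stand-in for the vendored `ReadOnlySpan<T>` (it is not the real vendored type), `AllocationSketch` and its members are made-up names, and the measurement relies on `GC.GetAllocatedBytesForCurrentThread`, so it needs a runtime that provides that API. The getters are marked `NoInlining` so the JIT cannot elide the array.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical stand-in for a vendored span type. The C# compiler only
// special-cases System.ReadOnlySpan<T>, so this conversion always receives
// a real heap-allocated array.
public readonly struct FakeSpan
{
    private readonly byte[] _array;

    private FakeSpan(byte[] array) => _array = array;

    public int Length => _array.Length;

    public static implicit operator FakeSpan(byte[] array) => new FakeSpan(array);
}

public static class AllocationSketch
{
    // Allocated once, when the type is initialized.
    private static readonly byte[] CachedBytes = { 167, 100, 98, 46, 116, 121, 112, 101 };

    // Keeps the loop results observable so the calls are not optimized away.
    private static int _sink;

    // Mirrors the problematic shape: a new array is created on every access.
    private static FakeSpan AllocatingBytes
    {
        [MethodImpl(MethodImplOptions.NoInlining)]
        get => new byte[] { 167, 100, 98, 46, 116, 121, 112, 101 };
    }

    // Mirrors the fixed shape: every access wraps the same cached array.
    private static FakeSpan CachedSpan
    {
        [MethodImpl(MethodImplOptions.NoInlining)]
        get => CachedBytes;
    }

    public static long MeasureAllocating(int iterations)
    {
        _sink = AllocatingBytes.Length; // warm up before measuring
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            _sink += AllocatingBytes.Length;
        }

        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static long MeasureCached(int iterations)
    {
        _sink = CachedSpan.Length; // warm up before measuring
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            _sink += CachedSpan.Length;
        }

        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static void Main()
    {
        Console.WriteLine($"new array per access: {MeasureAllocating(1000)} bytes");
        Console.WriteLine($"cached array:         {MeasureCached(1000)} bytes");
    }
}
```

The exact byte counts depend on the runtime and bitness, so none are quoted here; the point is only that the first figure should grow with the iteration count while the second should not.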

| Method | Runtime | Mean | Allocated | Alloc Ratio |
|---|---|---|---|---|
| WriteEnrichedTraces_Before | .NET 6.0 | 488.9 us | 110 B | 0.001 |
| WriteEnrichedTraces_Before | .NET Framework 4.7.2 | 703.3 us | 112537 B | 1.000 |
| WriteEnrichedTraces_After | .NET 6.0 | 469.1 us | 105 B | 0.50 |
| WriteEnrichedTraces_After | .NET Framework 4.7.2 | 703.4 us | 208 B | 1.00 |

Implementation details

The fix is to just do what we were doing before #5298 introduced this regression 😄 i.e. generate code like this:

#if NETCOREAPP
    private static ReadOnlySpan<byte> DbTypeBytes => new byte[] { 167, 100, 98, 46, 116, 121, 112, 101 };
#else
    private static readonly byte[] DbTypeBytes = new byte[] { 167, 100, 98, 46, 116, 121, 112, 101 };
#endif
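
As a rough sketch of why call sites can stay the same for both branches: a `byte[]` converts implicitly to `ReadOnlySpan<byte>`, so a consumer that works on spans compiles against either declaration. The names `SqlTagsSketch` and `CopyDbTypeKey` are hypothetical and not taken from the PR, and the sketch uses the real `System.ReadOnlySpan<T>` (on .NET Framework that would come from a package or a vendored equivalent, which is an assumption here). The eight bytes decode as 0xA7 followed by the ASCII text `db.type`, which looks like a MessagePack fixstr header for a 7-character string.

```csharp
using System;
using System.Text;

public static class SqlTagsSketch
{
#if NETCOREAPP
    // The compiler embeds these bytes in the assembly; no array is created per access.
    private static ReadOnlySpan<byte> DbTypeBytes => new byte[] { 167, 100, 98, 46, 116, 121, 112, 101 };
#else
    // Allocated once when the type is initialized, then reused on every access.
    private static readonly byte[] DbTypeBytes = new byte[] { 167, 100, 98, 46, 116, 121, 112, 101 };
#endif

    // Hypothetical consumer: identical source code for both branches, because
    // byte[] converts implicitly to ReadOnlySpan<byte>.
    public static int CopyDbTypeKey(Span<byte> destination)
    {
        ReadOnlySpan<byte> source = DbTypeBytes;
        source.CopyTo(destination);
        return source.Length;
    }

    public static void Main()
    {
        Span<byte> buffer = stackalloc byte[8];
        int written = CopyDbTypeKey(buffer);

        byte header = buffer[0];
        string key = Encoding.ASCII.GetString(buffer.Slice(1, written - 1).ToArray());

        Console.WriteLine($"0x{header:X2} {key}"); // prints "0xA7 db.type"
    }
}
```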

Test coverage

This is all covered by existing tests, and the new benchmark shows the improvement.

Other details

I found a couple of other places in the vendored code that have the same issue, and fixed them directly in the code. However, this is not ideal: if we re-vendor, we'll clobber these updates, so we'll need to update the vendoring code too.

@andrewlock andrewlock requested review from a team as code owners December 1, 2025 14:57
@andrewlock andrewlock added the type:performance Performance, speed, latency, resource usage (CPU, memory) label Dec 1, 2025
@lucaspimentel lucaspimentel requested a review from a team December 1, 2025 15:09
internal static class Utilities
{

// CUSTOMIZATION TO AVOID unsafe allocation with every access in .NET Framework
Contributor

what if we want to update this vendored code? not sure if there are "transformers" in the UpdateVendors tool

Contributor

oh I'm just reading your comment, sorry

I found a couple of other places in the vendored code that have the same issue, and fixed them directly in the code. However, this is not ideal: if we re-vendor, we'll clobber these updates, so we'll need to update the vendoring code too.

Contributor

maybe this could be in another PR as I see transforms here

really good catch overall 💯

@dd-trace-dotnet-ci-bot
dd-trace-dotnet-ci-bot bot commented Dec 1, 2025

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing This PR (7884) and master.

✅ No regressions detected - check the details below

Full Metrics Comparison

FakeDbCommand

| Metric | Master (Mean ± 95% CI) | Current (Mean ± 95% CI) | Change | Status |
|---|---|---|---|---|
| **.NET Framework 4.8 - Baseline** | | | | |
| duration | 76.67 ± (76.43 - 77.46) ms | 74.58 ± (74.57 - 75.40) ms | -2.7% | |
| **.NET Framework 4.8 - Bailout** | | | | |
| duration | 79.31 ± (79.23 - 79.86) ms | 78.58 ± (78.68 - 79.43) ms | -0.9% | |
| **.NET Framework 4.8 - CallTarget+Inlining+NGEN** | | | | |
| duration | 1063.58 ± (1063.74 - 1071.29) ms | 1068.03 ± (1068.08 - 1075.94) ms | +0.4% | ✅⬆️ |
| **.NET Core 3.1 - Baseline** | | | | |
| process.internal_duration_ms | 23.06 ± (22.99 - 23.12) ms | 23.05 ± (22.97 - 23.14) ms | -0.0% | |
| process.time_to_main_ms | 88.91 ± (88.56 - 89.26) ms | 88.31 ± (87.85 - 88.77) ms | -0.7% | |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 10.92 ± (10.91 - 10.92) MB | 10.91 ± (10.91 - 10.91) MB | -0.1% | |
| runtime.dotnet.threads.count | 12 ± (12 - 12) | 12 ± (12 - 12) | +0.0% | |
| **.NET Core 3.1 - Bailout** | | | | |
| process.internal_duration_ms | 22.94 ± (22.87 - 23.01) ms | 22.98 ± (22.92 - 23.04) ms | +0.2% | ✅⬆️ |
| process.time_to_main_ms | 90.14 ± (89.72 - 90.56) ms | 90.35 ± (89.86 - 90.84) ms | +0.2% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 10.96 ± (10.96 - 10.96) MB | 10.95 ± (10.94 - 10.95) MB | -0.1% | |
| runtime.dotnet.threads.count | 13 ± (13 - 13) | 13 ± (13 - 13) | +0.0% | |
| **.NET Core 3.1 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 225.23 ± (223.66 - 226.80) ms | 223.19 ± (221.76 - 224.62) ms | -0.9% | |
| process.time_to_main_ms | 506.63 ± (505.42 - 507.84) ms | 506.43 ± (504.85 - 508.00) ms | -0.0% | |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 47.85 ± (47.83 - 47.87) MB | 47.86 ± (47.84 - 47.88) MB | +0.0% | ✅⬆️ |
| runtime.dotnet.threads.count | 28 ± (28 - 28) | 28 ± (28 - 28) | -0.1% | |
| **.NET 6 - Baseline** | | | | |
| process.internal_duration_ms | 21.72 ± (21.66 - 21.78) ms | 22.02 ± (21.94 - 22.10) ms | +1.4% | ✅⬆️ |
| process.time_to_main_ms | 76.43 ± (76.14 - 76.72) ms | 77.64 ± (77.25 - 78.03) ms | +1.6% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 10.59 ± (10.59 - 10.59) MB | 10.63 ± (10.63 - 10.63) MB | +0.4% | ✅⬆️ |
| runtime.dotnet.threads.count | 10 ± (10 - 10) | 10 ± (10 - 10) | +0.0% | |
| **.NET 6 - Bailout** | | | | |
| process.internal_duration_ms | 21.65 ± (21.59 - 21.71) ms | 22.01 ± (21.94 - 22.09) ms | +1.7% | ✅⬆️ |
| process.time_to_main_ms | 76.88 ± (76.55 - 77.21) ms | 78.85 ± (78.46 - 79.24) ms | +2.6% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 10.64 ± (10.64 - 10.64) MB | 10.72 ± (10.72 - 10.73) MB | +0.8% | ✅⬆️ |
| runtime.dotnet.threads.count | 11 ± (11 - 11) | 11 ± (11 - 11) | +0.0% | |
| **.NET 6 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 213.09 ± (212.04 - 214.14) ms | 213.48 ± (212.28 - 214.67) ms | +0.2% | ✅⬆️ |
| process.time_to_main_ms | 470.36 ± (469.32 - 471.39) ms | 473.17 ± (472.02 - 474.32) ms | +0.6% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 48.08 ± (48.06 - 48.11) MB | 48.08 ± (48.06 - 48.10) MB | +0.0% | ✅⬆️ |
| runtime.dotnet.threads.count | 28 ± (28 - 28) | 28 ± (28 - 28) | +0.0% | ✅⬆️ |
| **.NET 8 - Baseline** | | | | |
| process.internal_duration_ms | 19.93 ± (19.87 - 20.00) ms | 20.14 ± (20.06 - 20.22) ms | +1.1% | ✅⬆️ |
| process.time_to_main_ms | 76.27 ± (75.91 - 76.63) ms | 76.42 ± (75.98 - 76.85) ms | +0.2% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 7.64 ± (7.63 - 7.65) MB | 7.64 ± (7.63 - 7.65) MB | +0.0% | ✅⬆️ |
| runtime.dotnet.threads.count | 10 ± (10 - 10) | 10 ± (10 - 10) | +0.0% | |
| **.NET 8 - Bailout** | | | | |
| process.internal_duration_ms | 19.90 ± (19.85 - 19.96) ms | 19.91 ± (19.84 - 19.98) ms | +0.0% | ✅⬆️ |
| process.time_to_main_ms | 76.91 ± (76.61 - 77.20) ms | 76.93 ± (76.55 - 77.30) ms | +0.0% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 7.69 ± (7.68 - 7.69) MB | 7.69 ± (7.68 - 7.69) MB | -0.0% | |
| runtime.dotnet.threads.count | 11 ± (11 - 11) | 11 ± (11 - 11) | +0.0% | |
| **.NET 8 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 194.14 ± (192.94 - 195.34) ms | 193.73 ± (192.40 - 195.06) ms | -0.2% | |
| process.time_to_main_ms | 457.75 ± (456.72 - 458.78) ms | 459.95 ± (458.49 - 461.40) ms | +0.5% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 36.41 ± (36.37 - 36.45) MB | 36.32 ± (36.28 - 36.37) MB | -0.2% | |
| runtime.dotnet.threads.count | 27 ± (27 - 27) | 27 ± (27 - 27) | -0.1% | |

HttpMessageHandler

| Metric | Master (Mean ± 95% CI) | Current (Mean ± 95% CI) | Change | Status |
|---|---|---|---|---|
| **.NET Framework 4.8 - Baseline** | | | | |
| duration | 192.40 ± (192.45 - 193.31) ms | 192.92 ± (192.89 - 193.70) ms | +0.3% | ✅⬆️ |
| **.NET Framework 4.8 - Bailout** | | | | |
| duration | 195.04 ± (194.82 - 195.27) ms | 196.66 ± (196.54 - 197.01) ms | +0.8% | ✅⬆️ |
| **.NET Framework 4.8 - CallTarget+Inlining+NGEN** | | | | |
| duration | 1104.03 ± (1110.09 - 1119.81) ms | 1111.80 ± (1115.47 - 1124.36) ms | +0.7% | ✅⬆️ |
| **.NET Core 3.1 - Baseline** | | | | |
| process.internal_duration_ms | 186.65 ± (186.33 - 186.96) ms | 188.08 ± (187.81 - 188.34) ms | +0.8% | ✅⬆️ |
| process.time_to_main_ms | 80.03 ± (79.85 - 80.21) ms | 80.77 ± (80.54 - 81.00) ms | +0.9% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 3 ± (3 - 3) | 3 ± (3 - 3) | +0.0% | |
| runtime.dotnet.mem.committed | 16.19 ± (16.17 - 16.21) MB | 16.08 ± (16.05 - 16.10) MB | -0.7% | |
| runtime.dotnet.threads.count | 20 ± (19 - 20) | 20 ± (19 - 20) | +0.0% | ✅⬆️ |
| **.NET Core 3.1 - Bailout** | | | | |
| process.internal_duration_ms | 186.67 ± (186.35 - 187.00) ms | 187.82 ± (187.52 - 188.11) ms | +0.6% | ✅⬆️ |
| process.time_to_main_ms | 81.53 ± (81.35 - 81.70) ms | 82.09 ± (81.93 - 82.24) ms | +0.7% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 3 ± (3 - 3) | 3 ± (3 - 3) | +0.0% | |
| runtime.dotnet.mem.committed | 16.18 ± (16.11 - 16.25) MB | 16.13 ± (16.10 - 16.17) MB | -0.3% | |
| runtime.dotnet.threads.count | 21 ± (20 - 21) | 21 ± (20 - 21) | -0.1% | |
| **.NET Core 3.1 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 396.58 ± (393.84 - 399.32) ms | 399.09 ± (396.49 - 401.68) ms | +0.6% | ✅⬆️ |
| process.time_to_main_ms | 470.63 ± (470.00 - 471.26) ms | 474.28 ± (473.65 - 474.91) ms | +0.8% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 3 ± (3 - 3) | 3 ± (3 - 3) | +0.0% | |
| runtime.dotnet.mem.committed | 58.58 ± (58.43 - 58.72) MB | 58.70 ± (58.58 - 58.83) MB | +0.2% | ✅⬆️ |
| runtime.dotnet.threads.count | 29 ± (29 - 29) | 29 ± (29 - 30) | +0.2% | ✅⬆️ |
| **.NET 6 - Baseline** | | | | |
| process.internal_duration_ms | 190.63 ± (190.38 - 190.88) ms | 192.59 ± (192.16 - 193.02) ms | +1.0% | ✅⬆️ |
| process.time_to_main_ms | 69.37 ± (69.26 - 69.49) ms | 70.35 ± (70.16 - 70.53) ms | +1.4% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 16.26 ± (16.14 - 16.37) MB | 16.15 ± (16.02 - 16.28) MB | -0.7% | |
| runtime.dotnet.threads.count | 18 ± (18 - 18) | 18 ± (18 - 19) | +0.5% | ✅⬆️ |
| **.NET 6 - Bailout** | | | | |
| process.internal_duration_ms | 190.63 ± (190.25 - 191.01) ms | 191.98 ± (191.65 - 192.31) ms | +0.7% | ✅⬆️ |
| process.time_to_main_ms | 70.51 ± (70.39 - 70.64) ms | 71.21 ± (71.12 - 71.30) ms | +1.0% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 16.04 ± (15.88 - 16.20) MB | 16.12 ± (15.97 - 16.28) MB | +0.5% | ✅⬆️ |
| runtime.dotnet.threads.count | 19 ± (19 - 19) | 19 ± (19 - 19) | +0.3% | ✅⬆️ |
| **.NET 6 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 405.02 ± (402.64 - 407.40) ms | 408.79 ± (406.69 - 410.89) ms | +0.9% | ✅⬆️ |
| process.time_to_main_ms | 438.30 ± (437.78 - 438.81) ms | 443.20 ± (442.59 - 443.80) ms | +1.1% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 58.38 ± (58.22 - 58.54) MB | 58.94 ± (58.80 - 59.09) MB | +1.0% | ✅⬆️ |
| runtime.dotnet.threads.count | 29 ± (29 - 30) | 30 ± (30 - 30) | +0.4% | ✅⬆️ |
| **.NET 8 - Baseline** | | | | |
| process.internal_duration_ms | 190.33 ± (189.96 - 190.69) ms | 189.70 ± (189.37 - 190.03) ms | -0.3% | |
| process.time_to_main_ms | 69.35 ± (69.12 - 69.58) ms | 69.48 ± (69.29 - 69.67) ms | +0.2% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 11.75 ± (11.72 - 11.77) MB | 11.73 ± (11.70 - 11.75) MB | -0.2% | |
| runtime.dotnet.threads.count | 18 ± (18 - 18) | 18 ± (18 - 18) | +0.4% | ✅⬆️ |
| **.NET 8 - Bailout** | | | | |
| process.internal_duration_ms | 188.05 ± (187.85 - 188.25) ms | 189.52 ± (189.27 - 189.78) ms | +0.8% | ✅⬆️ |
| process.time_to_main_ms | 70.00 ± (69.91 - 70.09) ms | 70.36 ± (70.28 - 70.45) ms | +0.5% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 11.83 ± (11.80 - 11.86) MB | 11.69 ± (11.61 - 11.78) MB | -1.1% | |
| runtime.dotnet.threads.count | 19 ± (19 - 19) | 19 ± (18 - 19) | -2.5% | |
| **.NET 8 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 364.31 ± (362.97 - 365.64) ms | 365.09 ± (363.78 - 366.39) ms | +0.2% | ✅⬆️ |
| process.time_to_main_ms | 426.43 ± (425.83 - 427.03) ms | 429.26 ± (428.62 - 429.90) ms | +0.7% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 47.83 ± (47.79 - 47.86) MB | 47.94 ± (47.91 - 47.97) MB | +0.2% | ✅⬆️ |
| runtime.dotnet.threads.count | 29 ± (29 - 29) | 29 ± (29 - 29) | +0.1% | ✅⬆️ |

Comparison explanation

Execution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are highlighted in **red**. The following thresholds were used for comparing the execution times:

  • Welch's t-test at a 5% significance level
  • Only results indicating a difference greater than 5% and 5 ms are considered.
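
The size thresholds in the list above can be sketched as a simple predicate. This is an assumption-laden illustration, not the bot's actual code: `ThresholdSketch` and `ExceedsThresholds` are hypothetical names, the sketch reads "5% and 5 ms" as requiring both conditions at once, and it leaves out the Welch significance test entirely.

```csharp
using System;

public static class ThresholdSketch
{
    // Assumed reading of the rule: a difference is only considered when it is
    // larger than 5% of the master mean AND larger than 5 ms in absolute terms.
    public static bool ExceedsThresholds(double masterMs, double currentMs)
    {
        double absoluteDiffMs = Math.Abs(currentMs - masterMs);
        double relativeDiff = absoluteDiffMs / masterMs;
        return relativeDiff > 0.05 && absoluteDiffMs > 5.0;
    }

    public static void Main()
    {
        // 438.30 ms -> 443.20 ms, taken from the HttpMessageHandler table:
        // about +4.9 ms and +1.1%, so below both thresholds.
        Console.WriteLine(ExceedsThresholds(438.30, 443.20)); // prints "False"

        // Made-up values: +10 ms and +10%, so above both thresholds.
        Console.WriteLine(ExceedsThresholds(100.0, 110.0)); // prints "True"
    }
}
```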

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

Duration charts
FakeDbCommand (.NET Framework 4.8)
gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7884) - mean (75ms)  : 69, 81
    master - mean (77ms)  : 70, 84

    section Bailout
    This PR (7884) - mean (79ms)  : 73, 85
    master - mean (80ms)  : 75, 84

    section CallTarget+Inlining+NGEN
    This PR (7884) - mean (1,072ms)  : 1016, 1128
    master - mean (1,068ms)  : 1013, 1122

FakeDbCommand (.NET Core 3.1)
gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7884) - mean (119ms)  : 109, 129
    master - mean (120ms)  : 112, 127

    section Bailout
    This PR (7884) - mean (121ms)  : 113, 129
    master - mean (121ms)  : 114, 127

    section CallTarget+Inlining+NGEN
    This PR (7884) - mean (767ms)  : 718, 815
    master - mean (773ms)  : 733, 813

FakeDbCommand (.NET 6)
gantt
    title Execution time (ms) FakeDbCommand (.NET 6)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7884) - mean (107ms)  : 98, 116
    master - mean (105ms)  : 98, 112

    section Bailout
    This PR (7884) - mean (108ms)  : 100, 117
    master - mean (106ms)  : 100, 111

    section CallTarget+Inlining+NGEN
    This PR (7884) - mean (716ms)  : 684, 748
    master - mean (711ms)  : 682, 740

FakeDbCommand (.NET 8)
gantt
    title Execution time (ms) FakeDbCommand (.NET 8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7884) - mean (105ms)  : 97, 113
    master - mean (105ms)  : 97, 113

    section Bailout
    This PR (7884) - mean (105ms)  : 98, 112
    master - mean (105ms)  : 100, 111

    section CallTarget+Inlining+NGEN
    This PR (7884) - mean (688ms)  : 650, 726
    master - mean (687ms)  : 650, 725

HttpMessageHandler (.NET Framework 4.8)
gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7884) - mean (193ms)  : 189, 197
    master - mean (193ms)  : 188, 197

    section Bailout
    This PR (7884) - mean (197ms)  : 195, 199
    master - mean (195ms)  : 193, 197

    section CallTarget+Inlining+NGEN
    This PR (7884) - mean (1,120ms)  : 1056, 1184
    master - mean (1,115ms)  : 1043, 1187

HttpMessageHandler (.NET Core 3.1)
gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7884) - mean (277ms)  : 273, 281
    master - mean (275ms)  : 270, 280

    section Bailout
    This PR (7884) - mean (278ms)  : 274, 283
    master - mean (276ms)  : 272, 280

    section CallTarget+Inlining+NGEN
    This PR (7884) - mean (908ms)  : 862, 954
    master - mean (903ms)  : 854, 951

HttpMessageHandler (.NET 6)
gantt
    title Execution time (ms) HttpMessageHandler (.NET 6)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7884) - mean (271ms)  : 265, 277
    master - mean (268ms)  : 265, 272

    section Bailout
    This PR (7884) - mean (271ms)  : 266, 276
    master - mean (269ms)  : 264, 274

    section CallTarget+Inlining+NGEN
    This PR (7884) - mean (885ms)  : 846, 923
    master - mean (875ms)  : 834, 917

HttpMessageHandler (.NET 8)
gantt
    title Execution time (ms) HttpMessageHandler (.NET 8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7884) - mean (269ms)  : 263, 274
    master - mean (269ms)  : 264, 274

    section Bailout
    This PR (7884) - mean (269ms)  : 266, 273
    master - mean (267ms)  : 265, 270

    section CallTarget+Inlining+NGEN
    This PR (7884) - mean (825ms)  : 802, 847
    master - mean (822ms)  : 802, 841


Base automatically changed from andrew/fix-agent-writer-benchmark to master December 1, 2025 17:51
@andrewlock andrewlock force-pushed the andrew/fix-serialization-allocation branch from 47480b3 to 0407896 Compare December 1, 2025 17:55
@pr-commenter
pr-commenter bot commented Dec 1, 2025

Benchmarks

Benchmarks Report for benchmark platform 🐌

Benchmarks for #7884 compared to master:

  • 1 benchmarks are faster, with geometric mean 1.211
  • 4 benchmarks are slower, with geometric mean 1.948
  • 5 benchmarks have fewer allocations
  • 7 benchmarks have more allocations
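
As a sanity check on the summary figures, the geometric mean for the slower benchmarks can be recomputed from the four diff/base ratios reported in the benchmark details (1.649, 1.130, 2.908 and 2.658). This is a minimal sketch that assumes the bot averages exactly those ratios; `GeometricMeanSketch` is a hypothetical name, not part of the benchmark tooling.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class GeometricMeanSketch
{
    // Geometric mean computed as exp(mean(ln r)), which avoids multiplying
    // many ratios together directly.
    public static double GeometricMean(double[] ratios)
    {
        double sumOfLogs = ratios.Sum(r => Math.Log(r));
        return Math.Exp(sumOfLogs / ratios.Length);
    }

    public static void Main()
    {
        // diff/base ratios of the four benchmarks flagged as slower.
        double[] slower = { 1.649, 1.130, 2.908, 2.658 };

        Console.WriteLine(GeometricMean(slower).ToString("F3", CultureInfo.InvariantCulture)); // prints "1.948"
    }
}
```

The result matches the 1.948 in the summary, and it shows how much the two StringConcatAspectBenchmark ratios dominate the average.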

The following thresholds were used for comparing the benchmark speeds:

  • Mann–Whitney U test at a 5% significance level
  • Only results indicating a difference greater than 10% and 0.3 ns are considered.

Allocation changes below 0.5% are ignored.

Benchmark details

Benchmarks.Trace.ActivityBenchmark - Same speed ✔️ More allocations ⚠️

More allocations ⚠️ in #7884

Benchmark Base Allocated Diff Allocated Change Change %
Benchmarks.Trace.ActivityBenchmark.StartStopWithChild‑net6.0 5.41 KB 5.52 KB 110 B 2.04%
Benchmarks.Trace.ActivityBenchmark.StartStopWithChild‑netcoreapp3.1 5.67 KB 5.7 KB 31 B 0.55%

Fewer allocations 🎉 in #7884

Benchmark Base Allocated Diff Allocated Change Change %
Benchmarks.Trace.ActivityBenchmark.StartStopWithChild‑net472 6.09 KB 5.97 KB -123 B -2.02%

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master StartStopWithChild net6.0 10.7μs 60.6ns 407ns 0 0 0 5.41 KB
master StartStopWithChild netcoreapp3.1 14.5μs 75.9ns 364ns 0 0 0 5.67 KB
master StartStopWithChild net472 21.7μs 82.5ns 330ns 0.983 0.328 0.109 6.09 KB
#7884 StartStopWithChild net6.0 10.9μs 59.9ns 369ns 0 0 0 5.52 KB
#7884 StartStopWithChild netcoreapp3.1 13.8μs 65.4ns 262ns 0 0 0 5.7 KB
#7884 StartStopWithChild net472 21.5μs 114ns 616ns 0.958 0.426 0.106 5.97 KB
Benchmarks.Trace.AgentWriterBenchmark - Same speed ✔️ Fewer allocations 🎉

Fewer allocations 🎉 in #7884

Benchmark Base Allocated Diff Allocated Change Change %
Benchmarks.Trace.AgentWriterBenchmark.WriteAndFlushEnrichedTraces‑net472 115.64 KB 3.33 KB -112.31 KB -97.12%

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master WriteAndFlushEnrichedTraces net6.0 1.28ms 300ns 1.16μs 0 0 0 2.71 KB
master WriteAndFlushEnrichedTraces netcoreapp3.1 1.37ms 2.49μs 9.65μs 0 0 0 2.7 KB
master WriteAndFlushEnrichedTraces net472 1.8ms 216ns 750ns 17.9 0 0 115.64 KB
#7884 WriteAndFlushEnrichedTraces net6.0 1.26ms 136ns 471ns 0 0 0 2.7 KB
#7884 WriteAndFlushEnrichedTraces netcoreapp3.1 1.38ms 536ns 2.08μs 0 0 0 2.7 KB
#7884 WriteAndFlushEnrichedTraces net472 1.7ms 182ns 631ns 0 0 0 3.33 KB
Benchmarks.Trace.Asm.AppSecBodyBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master AllCycleSimpleBody net6.0 1.07μs 4.65ns 17.4ns 0 0 0 1.22 KB
master AllCycleSimpleBody netcoreapp3.1 1.4μs 7.8ns 48.1ns 0 0 0 1.2 KB
master AllCycleSimpleBody net472 1.06μs 0.203ns 0.785ns 0.19 0 0 1.23 KB
master AllCycleMoreComplexBody net6.0 7.1μs 34ns 132ns 0 0 0 4.72 KB
master AllCycleMoreComplexBody netcoreapp3.1 8.97μs 44.8ns 190ns 0 0 0 4.62 KB
master AllCycleMoreComplexBody net472 7.59μs 4.18ns 15.6ns 0.721 0 0 4.74 KB
master ObjectExtractorSimpleBody net6.0 331ns 1.6ns 6.4ns 0 0 0 280 B
master ObjectExtractorSimpleBody netcoreapp3.1 392ns 2.22ns 14.6ns 0 0 0 272 B
master ObjectExtractorSimpleBody net472 304ns 0.454ns 1.76ns 0.0446 0 0 281 B
master ObjectExtractorMoreComplexBody net6.0 6.39μs 29.6ns 118ns 0 0 0 3.78 KB
master ObjectExtractorMoreComplexBody netcoreapp3.1 7.79μs 36.4ns 150ns 0 0 0 3.69 KB
master ObjectExtractorMoreComplexBody net472 6.64μs 3.25ns 12.6ns 0.597 0 0 3.8 KB
#7884 AllCycleSimpleBody net6.0 1.06μs 5.95ns 39ns 0 0 0 1.22 KB
#7884 AllCycleSimpleBody netcoreapp3.1 1.51μs 8.31ns 50.6ns 0 0 0 1.2 KB
#7884 AllCycleSimpleBody net472 1.06μs 0.294ns 1.1ns 0.191 0 0 1.23 KB
#7884 AllCycleMoreComplexBody net6.0 7.16μs 6.7ns 25.9ns 0 0 0 4.72 KB
#7884 AllCycleMoreComplexBody netcoreapp3.1 9μs 44.6ns 173ns 0 0 0 4.62 KB
#7884 AllCycleMoreComplexBody net472 7.66μs 6.53ns 25.3ns 0.726 0 0 4.74 KB
#7884 ObjectExtractorSimpleBody net6.0 325ns 1.61ns 7.03ns 0 0 0 280 B
#7884 ObjectExtractorSimpleBody netcoreapp3.1 389ns 2.21ns 15.3ns 0 0 0 272 B
#7884 ObjectExtractorSimpleBody net472 299ns 0.239ns 0.895ns 0.0435 0 0 281 B
#7884 ObjectExtractorMoreComplexBody net6.0 6.46μs 7.47ns 28.9ns 0 0 0 3.78 KB
#7884 ObjectExtractorMoreComplexBody netcoreapp3.1 7.81μs 38.5ns 177ns 0 0 0 3.69 KB
#7884 ObjectExtractorMoreComplexBody net472 6.72μs 6.02ns 23.3ns 0.602 0 0 3.8 KB
Benchmarks.Trace.Asm.AppSecEncoderBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EncodeArgs net6.0 76.6μs 249ns 964ns 0 0 0 32.4 KB
master EncodeArgs netcoreapp3.1 96.8μs 201ns 777ns 0 0 0 32.4 KB
master EncodeArgs net472 110μs 6.24ns 21.6ns 4.96 0 0 32.51 KB
master EncodeLegacyArgs net6.0 146μs 64.8ns 234ns 0 0 0 2.15 KB
master EncodeLegacyArgs netcoreapp3.1 199μs 240ns 931ns 0 0 0 2.14 KB
master EncodeLegacyArgs net472 264μs 198ns 713ns 0 0 0 2.16 KB
#7884 EncodeArgs net6.0 77.3μs 56.4ns 218ns 0 0 0 32.4 KB
#7884 EncodeArgs netcoreapp3.1 97.7μs 289ns 1.12μs 0 0 0 32.4 KB
#7884 EncodeArgs net472 109μs 9.6ns 35.9ns 4.91 0 0 32.51 KB
#7884 EncodeLegacyArgs net6.0 146μs 12.3ns 44.5ns 0 0 0 2.15 KB
#7884 EncodeLegacyArgs netcoreapp3.1 201μs 132ns 475ns 0 0 0 2.14 KB
#7884 EncodeLegacyArgs net472 263μs 343ns 1.33μs 0 0 0 2.16 KB
Benchmarks.Trace.Asm.AppSecWafBenchmark - Same speed ✔️ Fewer allocations 🎉

Fewer allocations 🎉 in #7884

Benchmark Base Allocated Diff Allocated Change Change %
Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmark‑net6.0 5.82 KB 5.48 KB -336 B -5.78%

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master RunWafRealisticBenchmark net6.0 425μs 648ns 2.34μs 0 0 0 5.82 KB
master RunWafRealisticBenchmark netcoreapp3.1 473μs 2.91μs 27.6μs 0 0 0 4.58 KB
master RunWafRealisticBenchmark net472 497μs 732ns 2.83μs 0 0 0 0 b
master RunWafRealisticBenchmarkWithAttack net6.0 316μs 1.19μs 4.28μs 0 0 0 3.17 KB
master RunWafRealisticBenchmarkWithAttack netcoreapp3.1 359μs 3.06μs 29μs 0 0 0 2.32 KB
master RunWafRealisticBenchmarkWithAttack net472 372μs 505ns 1.89μs 0 0 0 0 b
#7884 RunWafRealisticBenchmark net6.0 438μs 1.52μs 6.09μs 0 0 0 5.48 KB
#7884 RunWafRealisticBenchmark netcoreapp3.1 452μs 1.38μs 4.98μs 0 0 0 4.58 KB
#7884 RunWafRealisticBenchmark net472 499μs 482ns 1.8μs 0 0 0 0 b
#7884 RunWafRealisticBenchmarkWithAttack net6.0 312μs 667ns 2.41μs 0 0 0 3.17 KB
#7884 RunWafRealisticBenchmarkWithAttack netcoreapp3.1 347μs 3.28μs 30.8μs 0 0 0 2.32 KB
#7884 RunWafRealisticBenchmarkWithAttack net472 372μs 152ns 525ns 0 0 0 0 b
Benchmarks.Trace.AspNetCoreBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendRequest net6.0 60.1μs 58.7ns 227ns 0 0 0 14.52 KB
master SendRequest netcoreapp3.1 72.5μs 118ns 407ns 0 0 0 17.42 KB
master SendRequest net472 0.158ns 0.00432ns 0.0167ns 0 0 0 0 b
#7884 SendRequest net6.0 61.4μs 52.3ns 202ns 0 0 0 14.52 KB
#7884 SendRequest netcoreapp3.1 72.8μs 381ns 1.86μs 0 0 0 17.42 KB
#7884 SendRequest net472 0.141ns 0.00332ns 0.0128ns 0 0 0 0 b
Benchmarks.Trace.CharSliceBenchmark - Slower ⚠️ Fewer allocations 🎉

Slower ⚠️ in #7884

Benchmark diff/base Base Median (ns) Diff Median (ns) Modality
Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSlice‑netcoreapp3.1 1.649 1,705,800.00 2,812,450.00

Faster 🎉 in #7884

Benchmark base/diff Base Median (ns) Diff Median (ns) Modality
Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSliceWithPool‑net472 1.211 1,368,600.00 1,130,100.00

Fewer allocations 🎉 in #7884

Benchmark Base Allocated Diff Allocated Change Change %
Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSlice‑net6.0 640 B 304 B -336 B -52.50%
Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSliceWithPool‑net6.0 640 B 304 B -336 B -52.50%

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master OriginalCharSlice net6.0 1.94ms 1.22μs 4.57μs 0 0 0 640.64 KB
master OriginalCharSlice netcoreapp3.1 4.1ms 2.12μs 8.22μs 0 0 0 640.1 KB
master OriginalCharSlice net472 2.63ms 958ns 3.71μs 0 0 0 647.17 KB
master OptimizedCharSlice net6.0 1.52ms 376ns 1.41μs 0 0 0 640 B
master OptimizedCharSlice netcoreapp3.1 1.71ms 5.66μs 35.8μs 0 0 0 104 B
master OptimizedCharSlice net472 1.94ms 593ns 2.3μs 0 0 0 0 b
master OptimizedCharSliceWithPool net6.0 1.05ms 588ns 2.2μs 0 0 0 640 B
master OptimizedCharSliceWithPool netcoreapp3.1 1.86ms 1.95μs 7.56μs 0 0 0 104 B
master OptimizedCharSliceWithPool net472 1.37ms 1.35μs 5.23μs 0 0 0 0 b
#7884 OriginalCharSlice net6.0 1.97ms 1.37μs 4.93μs 0 0 0 640.3 KB
#7884 OriginalCharSlice netcoreapp3.1 3.92ms 3.09μs 11.6μs 0 0 0 640.1 KB
#7884 OriginalCharSlice net472 2.59ms 802ns 2.89μs 0 0 0 647.17 KB
#7884 OptimizedCharSlice net6.0 1.52ms 777ns 3.01μs 0 0 0 304 B
#7884 OptimizedCharSlice netcoreapp3.1 2.81ms 414ns 1.55μs 0 0 0 104 B
#7884 OptimizedCharSlice net472 2.13ms 558ns 2.01μs 0 0 0 0 b
#7884 OptimizedCharSliceWithPool net6.0 1.02ms 745ns 2.88μs 0 0 0 304 B
#7884 OptimizedCharSliceWithPool netcoreapp3.1 1.86ms 1.69μs 6.56μs 0 0 0 104 B
#7884 OptimizedCharSliceWithPool net472 1.13ms 605ns 2.34μs 0 0 0 0 b
Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark - Slower ⚠️ More allocations ⚠️

Slower ⚠️ in #7884

Benchmark diff/base Base Median (ns) Diff Median (ns) Modality
Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces‑net6.0 1.130 664,902.08 751,347.57

More allocations ⚠️ in #7884

Benchmark Base Allocated Diff Allocated Change Change %
Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces‑net6.0 41.67 KB 42.05 KB 385 B 0.92%

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master WriteAndFlushEnrichedTraces net6.0 665μs 451ns 1.75μs 0 0 0 41.67 KB
master WriteAndFlushEnrichedTraces netcoreapp3.1 722μs 979ns 3.79μs 0 0 0 41.9 KB
master WriteAndFlushEnrichedTraces net472 958μs 4.64μs 18.6μs 4.81 0 0 55.79 KB
#7884 WriteAndFlushEnrichedTraces net6.0 753μs 1.93μs 7.24μs 0 0 0 42.05 KB
#7884 WriteAndFlushEnrichedTraces netcoreapp3.1 704μs 2.1μs 8.38μs 0 0 0 41.8 KB
#7884 WriteAndFlushEnrichedTraces net472 920μs 2.74μs 10.6μs 8.33 0 0 56.05 KB
Benchmarks.Trace.DbCommandBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master ExecuteNonQuery net6.0 1.91μs 8.99ns 35.9ns 0 0 0 1.02 KB
master ExecuteNonQuery netcoreapp3.1 2.63μs 2.44ns 9.44ns 0 0 0 1.02 KB
master ExecuteNonQuery net472 2.72μs 1.02ns 3.69ns 0.152 0 0 987 B
#7884 ExecuteNonQuery net6.0 1.91μs 1.66ns 6.41ns 0 0 0 1.02 KB
#7884 ExecuteNonQuery netcoreapp3.1 2.61μs 6.88ns 24.8ns 0 0 0 1.02 KB
#7884 ExecuteNonQuery net472 2.82μs 2.67ns 10.3ns 0.155 0 0 987 B
Benchmarks.Trace.ElasticsearchBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master CallElasticsearch net6.0 1.72μs 8.72ns 40.9ns 0 0 0 1.03 KB
master CallElasticsearch netcoreapp3.1 2.37μs 8.9ns 32.1ns 0 0 0 1.03 KB
master CallElasticsearch net472 3.45μs 3.03ns 11.7ns 0.156 0 0 1.04 KB
master CallElasticsearchAsync net6.0 1.78μs 6.89ns 26.7ns 0 0 0 1.01 KB
master CallElasticsearchAsync netcoreapp3.1 2.48μs 10.5ns 40.8ns 0 0 0 1.08 KB
master CallElasticsearchAsync net472 3.67μs 4.49ns 17.4ns 0.167 0 0 1.1 KB
#7884 CallElasticsearch net6.0 1.73μs 7.99ns 30.9ns 0 0 0 1.03 KB
#7884 CallElasticsearch netcoreapp3.1 2.33μs 7.91ns 29.6ns 0 0 0 1.03 KB
#7884 CallElasticsearch net472 3.46μs 8.54ns 33.1ns 0.158 0 0 1.04 KB
#7884 CallElasticsearchAsync net6.0 1.85μs 9.05ns 38.4ns 0 0 0 1.01 KB
#7884 CallElasticsearchAsync netcoreapp3.1 2.49μs 11.9ns 46.2ns 0 0 0 1.08 KB
#7884 CallElasticsearchAsync net472 3.61μs 2.97ns 11.5ns 0.162 0 0 1.1 KB
Benchmarks.Trace.GraphQLBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master ExecuteAsync net6.0 1.92μs 9.55ns 39.4ns 0 0 0 952 B
master ExecuteAsync netcoreapp3.1 2.47μs 6.9ns 26.7ns 0 0 0 952 B
master ExecuteAsync net472 2.59μs 0.737ns 2.66ns 0.143 0 0 915 B
#7884 ExecuteAsync net6.0 1.82μs 7.18ns 27.8ns 0 0 0 952 B
#7884 ExecuteAsync netcoreapp3.1 2.39μs 7.1ns 27.5ns 0 0 0 952 B
#7884 ExecuteAsync net472 2.57μs 2.31ns 8.95ns 0.142 0 0 915 B
Benchmarks.Trace.HttpClientBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendAsync net6.0 6.9μs 7.53ns 27.1ns 0 0 0 2.36 KB
master SendAsync netcoreapp3.1 8.76μs 29.7ns 115ns 0 0 0 2.9 KB
master SendAsync net472 12.1μs 18.2ns 70.5ns 0.481 0 0 3.18 KB
#7884 SendAsync net6.0 7.07μs 11.4ns 44.2ns 0 0 0 2.36 KB
#7884 SendAsync netcoreapp3.1 8.45μs 11.2ns 43.3ns 0 0 0 2.9 KB
#7884 SendAsync net472 12.2μs 10.7ns 41.4ns 0.489 0 0 3.18 KB
Benchmarks.Trace.Iast.StringAspectsBenchmark - Slower ⚠️ More allocations ⚠️

Slower ⚠️ in #7884

| Benchmark | diff/base | Base Median (ns) | Diff Median (ns) | Modality |
| --- | ---: | ---: | ---: | --- |
| Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatAspectBenchmark‑net6.0 | 2.908 | 481,800.00 | 1,401,200.00 | |
| Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatAspectBenchmark‑netcoreapp3.1 | 2.658 | 570,300.00 | 1,515,700.00 | |

More allocations ⚠️ in #7884

| Benchmark | Base Allocated | Diff Allocated | Change | Change % |
| --- | ---: | ---: | ---: | ---: |
| Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatAspectBenchmark‑net6.0 | 277.02 KB | 340.57 KB | 63.54 KB | 22.94% |
| Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatAspectBenchmark‑netcoreapp3.1 | 278.32 KB | 337.42 KB | 59.1 KB | 21.24% |
| Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatBenchmark‑net472 | 57.34 KB | 65.54 KB | 8.19 KB | 14.29% |
| Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatAspectBenchmark‑net472 | 278.53 KB | 286.72 KB | 8.19 KB | 2.94% |

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| master | StringConcatBenchmark | net6.0 | 46.2μs | 332ns | 3.12μs | 0 | 0 | 0 | 44.13 KB |
| master | StringConcatBenchmark | netcoreapp3.1 | 51.5μs | 420ns | 4.01μs | 0 | 0 | 0 | 42.68 KB |
| master | StringConcatBenchmark | net472 | 57.8μs | 167ns | 601ns | 0 | 0 | 0 | 57.34 KB |
| master | StringConcatAspectBenchmark | net6.0 | 483μs | 2.07μs | 9.27μs | 0 | 0 | 0 | 277.02 KB |
| master | StringConcatAspectBenchmark | netcoreapp3.1 | 572μs | 2.12μs | 7.65μs | 0 | 0 | 0 | 278.32 KB |
| master | StringConcatAspectBenchmark | net472 | 409μs | 2.26μs | 16μs | 0 | 0 | 0 | 278.53 KB |
| #7884 | StringConcatBenchmark | net6.0 | 44.7μs | 202ns | 700ns | 0 | 0 | 0 | 44.26 KB |
| #7884 | StringConcatBenchmark | netcoreapp3.1 | 49.2μs | 284ns | 2.18μs | 0 | 0 | 0 | 42.64 KB |
| #7884 | StringConcatBenchmark | net472 | 56.8μs | 230ns | 829ns | 0 | 0 | 0 | 65.54 KB |
| #7884 | StringConcatAspectBenchmark | net6.0 | 1.41ms | 3.45μs | 12.4μs | 0 | 0 | 0 | 340.57 KB |
| #7884 | StringConcatAspectBenchmark | netcoreapp3.1 | 1.51ms | 2.83μs | 10.2μs | 0 | 0 | 0 | 337.42 KB |
| #7884 | StringConcatAspectBenchmark | net472 | 403μs | 2.06μs | 9.86μs | 0 | 0 | 0 | 286.72 KB |

Benchmarks.Trace.ILoggerBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| master | EnrichedLog | net6.0 | 2.65μs | 14.5ns | 84.8ns | 0 | 0 | 0 | 1.7 KB |
| master | EnrichedLog | netcoreapp3.1 | 3.57μs | 16.9ns | 69.7ns | 0 | 0 | 0 | 1.7 KB |
| master | EnrichedLog | net472 | 3.91μs | 2.7ns | 10.1ns | 0.254 | 0 | 0 | 1.64 KB |
| #7884 | EnrichedLog | net6.0 | 2.7μs | 7.67ns | 29.7ns | 0 | 0 | 0 | 1.7 KB |
| #7884 | EnrichedLog | netcoreapp3.1 | 3.52μs | 17.3ns | 69.1ns | 0 | 0 | 0 | 1.7 KB |
| #7884 | EnrichedLog | net472 | 3.93μs | 2.73ns | 10.6ns | 0.256 | 0 | 0 | 1.64 KB |

Benchmarks.Trace.Log4netBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

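| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| master | EnrichedLog | net6.0 | 125μs | 464ns | 1.8μs | 0 | 0 | 0 | 4.31 KB |
| master | EnrichedLog | netcoreapp3.1 | 128μs | 431ns | 1.67μs | 0 | 0 | 0 | 4.31 KB |
| master | EnrichedLog | net472 | 167μs | 109ns | 409ns | 0 | 0 | 0 | 4.52 KB |
| #7884 | EnrichedLog | net6.0 | 123μs | 50.9ns | 184ns | 0 | 0 | 0 | 4.31 KB |
| #7884 | EnrichedLog | netcoreapp3.1 | 129μs | 162ns | 606ns | 0 | 0 | 0 | 4.31 KB |
| #7884 | EnrichedLog | net472 | 167μs | 88.1ns | 330ns | 0 | 0 | 0 | 4.52 KB |
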
Benchmarks.Trace.NLogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| master | EnrichedLog | net6.0 | 5.07μs | 15.9ns | 57.2ns | 0 | 0 | 0 | 2.26 KB |
| master | EnrichedLog | netcoreapp3.1 | 6.95μs | 19.8ns | 76.5ns | 0 | 0 | 0 | 2.26 KB |
| master | EnrichedLog | net472 | 7.69μs | 10.7ns | 41.6ns | 0.307 | 0 | 0 | 2.08 KB |
| #7884 | EnrichedLog | net6.0 | 5.11μs | 3.89ns | 15.1ns | 0 | 0 | 0 | 2.26 KB |
| #7884 | EnrichedLog | netcoreapp3.1 | 7.02μs | 24.7ns | 95.6ns | 0 | 0 | 0 | 2.26 KB |
| #7884 | EnrichedLog | net472 | 7.67μs | 9.55ns | 37ns | 0.307 | 0 | 0 | 2.08 KB |

Benchmarks.Trace.RedisBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| master | SendReceive | net6.0 | 2.02μs | 10.9ns | 58.7ns | 0 | 0 | 0 | 1.2 KB |
| master | SendReceive | netcoreapp3.1 | 2.59μs | 11.8ns | 45.6ns | 0 | 0 | 0 | 1.2 KB |
| master | SendReceive | net472 | 3.1μs | 0.789ns | 3.05ns | 0.188 | 0 | 0 | 1.2 KB |
| #7884 | SendReceive | net6.0 | 1.96μs | 10.3ns | 48.3ns | 0 | 0 | 0 | 1.2 KB |
| #7884 | SendReceive | netcoreapp3.1 | 2.76μs | 12.8ns | 49.7ns | 0 | 0 | 0 | 1.2 KB |
| #7884 | SendReceive | net472 | 3.18μs | 5.07ns | 19ns | 0.191 | 0 | 0 | 1.2 KB |

Benchmarks.Trace.SerilogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| master | EnrichedLog | net6.0 | 4.32μs | 8.14ns | 31.5ns | 0 | 0 | 0 | 1.58 KB |
| master | EnrichedLog | netcoreapp3.1 | 5.75μs | 18.7ns | 72.6ns | 0 | 0 | 0 | 1.63 KB |
| master | EnrichedLog | net472 | 6.51μs | 6.34ns | 24.5ns | 0.292 | 0 | 0 | 2.03 KB |
| #7884 | EnrichedLog | net6.0 | 4.34μs | 4.39ns | 17ns | 0 | 0 | 0 | 1.58 KB |
| #7884 | EnrichedLog | netcoreapp3.1 | 5.73μs | 14ns | 54.4ns | 0 | 0 | 0 | 1.63 KB |
| #7884 | EnrichedLog | net472 | 6.37μs | 10.2ns | 39.4ns | 0.316 | 0 | 0 | 2.03 KB |

Benchmarks.Trace.SpanBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| master | StartFinishSpan | net6.0 | 805ns | 2.44ns | 9.44ns | 0 | 0 | 0 | 576 B |
| master | StartFinishSpan | netcoreapp3.1 | 966ns | 5.11ns | 27.5ns | 0 | 0 | 0 | 576 B |
| master | StartFinishSpan | net472 | 907ns | 0.135ns | 0.504ns | 0.0906 | 0 | 0 | 578 B |
| master | StartFinishScope | net6.0 | 985ns | 0.814ns | 3.04ns | 0 | 0 | 0 | 696 B |
| master | StartFinishScope | netcoreapp3.1 | 1.14μs | 5.96ns | 28.6ns | 0 | 0 | 0 | 696 B |
| master | StartFinishScope | net472 | 1.12μs | 0.0802ns | 0.311ns | 0.101 | 0 | 0 | 658 B |
| master | StartFinishTwoScopes | net6.0 | 1.78μs | 0.275ns | 1.06ns | 0 | 0 | 0 | 1.19 KB |
| master | StartFinishTwoScopes | netcoreapp3.1 | 2.19μs | 11.9ns | 64ns | 0 | 0 | 0 | 1.19 KB |
| master | StartFinishTwoScopes | net472 | 2.19μs | 3.39ns | 13.1ns | 0.163 | 0 | 0 | 1.08 KB |
| #7884 | StartFinishSpan | net6.0 | 793ns | 3.38ns | 13.1ns | 0 | 0 | 0 | 576 B |
| #7884 | StartFinishSpan | netcoreapp3.1 | 990ns | 4.59ns | 17.8ns | 0 | 0 | 0 | 576 B |
| #7884 | StartFinishSpan | net472 | 912ns | 0.102ns | 0.355ns | 0.0913 | 0 | 0 | 578 B |
| #7884 | StartFinishScope | net6.0 | 915ns | 4.97ns | 27.7ns | 0 | 0 | 0 | 696 B |
| #7884 | StartFinishScope | netcoreapp3.1 | 1.16μs | 6.2ns | 33.9ns | 0 | 0 | 0 | 696 B |
| #7884 | StartFinishScope | net472 | 1.11μs | 0.574ns | 2.22ns | 0.0996 | 0 | 0 | 658 B |
| #7884 | StartFinishTwoScopes | net6.0 | 1.75μs | 0.623ns | 2.41ns | 0 | 0 | 0 | 1.19 KB |
| #7884 | StartFinishTwoScopes | netcoreapp3.1 | 2.23μs | 9.51ns | 36.8ns | 0 | 0 | 0 | 1.19 KB |
| #7884 | StartFinishTwoScopes | net472 | 2.14μs | 2.05ns | 7.69ns | 0.163 | 0 | 0 | 1.08 KB |

Benchmarks.Trace.TraceAnnotationsBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

| Branch | Method | Toolchain | Mean | StdError | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| master | RunOnMethodBegin | net6.0 | 1.08μs | 0.608ns | 2.35ns | 0 | 0 | 0 | 696 B |
| master | RunOnMethodBegin | netcoreapp3.1 | 1.5μs | 4.56ns | 17.1ns | 0 | 0 | 0 | 696 B |
| master | RunOnMethodBegin | net472 | 1.43μs | 0.324ns | 1.21ns | 0.101 | 0 | 0 | 658 B |
| #7884 | RunOnMethodBegin | net6.0 | 1.1μs | 4.77ns | 17.8ns | 0 | 0 | 0 | 696 B |
| #7884 | RunOnMethodBegin | netcoreapp3.1 | 1.45μs | 6.18ns | 23.9ns | 0 | 0 | 0 | 696 B |
| #7884 | RunOnMethodBegin | net472 | 1.41μs | 0.661ns | 2.56ns | 0.0998 | 0 | 0 | 658 B |

@andrewlock andrewlock merged commit 5361654 into master Dec 1, 2025
151 checks passed
@andrewlock andrewlock deleted the andrew/fix-serialization-allocation branch December 1, 2025 20:08
@github-actions github-actions bot added this to the vNext-v3 milestone Dec 1, 2025
andrewlock added a commit that referenced this pull request Dec 5, 2025
andrewlock added a commit that referenced this pull request Dec 9, 2025
## Summary of changes

Encodes changes made manually to vendored code in #7884

## Reason for change

In #7884, we manually fixed some problematic code in vendored code. This
PR encodes those changes in the vendors tool, ostensibly so that
re-running the tool recreates the currently vendored code.

Unfortunately, that's not the case. The code that's been vendored for
all of the `System.` libraries _cannot_ be regenerated using the tool.

## Implementation details

Added the transforms, and confirmed they work as expected. However, it's a
bit moot as we currently can't regenerate the existing code.

## Test coverage

Manually tested that the new changes were reproduced, then reverted,
because it breaks a bunch of existing vendored code 😅
andrewlock added a commit that referenced this pull request Dec 9, 2025
…7884)

## Summary of changes

- Fixes "incorrect" generated code from `TagsList`
  - Removes significant additional overhead during serialization
- "Fix" vendored System.Buffers code to avoid the same issue

## Reason for change

The current generated code for `TagsList` produces something like this:

```csharp
private static ReadOnlySpan<byte> DbTypeBytes => new byte[] { 167, 100, 98, 46, 116, 121, 112, 101 };
```

This _looks_ like it's allocating a new `byte[]` with every invocation,
but the compiler actually optimizes this away to be completely
zero-allocation, by embedding the array as part of the dll, and then
simply returning a `ReadOnlySpan` wrapper pointing to this fixed data.
You can see this if you look at the generated IL:

```
  .method private hidebysig static specialname valuetype [System.Runtime]System.ReadOnlySpan`1<unsigned int8>
    get_DbTypeBytes() cil managed
  {
    .maxstack 8

    // [20 58 - 20 109]
    IL_0000: ldsflda      int64 '<PrivateImplementationDetails>'::A06A154BE3B860D0B56FA96C93523B732045BA0BCE2FFD4769109575CF1953BF
    IL_0005: ldc.i4.8
    IL_0006: newobj       instance void valuetype [System.Runtime]System.ReadOnlySpan`1<unsigned int8>::.ctor(void*, int32)
    IL_000b: ret

  } // end of method SqlTags::get_DbTypeBytes
```
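
As an aside on where these eight bytes come from: they look like a pre-encoded MessagePack string, with a one-byte `fixstr` header (`0xA0 | length`) followed by the UTF-8 bytes of `db.type`. The sketch below reproduces them under that assumption; `TagKeyBytes` and `EncodeFixStr` are hypothetical names for illustration, not part of the tracer or the generator.

```csharp
using System;
using System.Text;

public static class TagKeyBytes
{
    // Hypothetical helper (not from this PR): encodes a short key as a
    // MessagePack fixstr, i.e. one header byte followed by the UTF-8 payload.
    public static byte[] EncodeFixStr(string key)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(key);
        if (utf8.Length > 31)
        {
            throw new ArgumentException("fixstr only covers payloads up to 31 bytes", nameof(key));
        }

        byte[] result = new byte[utf8.Length + 1];
        result[0] = (byte)(0xA0 | utf8.Length); // header: 101xxxxx, low 5 bits hold the length
        Buffer.BlockCopy(utf8, 0, result, 1, utf8.Length);
        return result;
    }

    public static void Main()
    {
        // Prints: 167, 100, 98, 46, 116, 121, 112, 101
        Console.WriteLine(string.Join(", ", EncodeFixStr("db.type")));
    }
}
```

Because the key never changes, encoding it once at build time and embedding the result is what makes the allocation-free property possible in the first place.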

However, in .NET Framework, even though we have vendored
`ReadOnlySpan<T>` so we can get some of its benefits (mostly cleaner
code), the compiler _doesn't_ apply this optimization to the vendored
type. This means that the above code _does_ allocate a new array with
every invocation:

```
  .method private hidebysig static specialname valuetype Datadog.Trace.VendoredMicrosoftCode.System.ReadOnlySpan`1<unsigned int8>
    get_DbTypeBytes() cil managed
  {
    .maxstack 8

    // [20 58 - 20 109]
    IL_0000: ldc.i4.8
    IL_0001: newarr       [netstandard]System.Byte
    IL_0006: dup
    IL_0007: ldtoken      field int64 '<PrivateImplementationDetails>'::A06A154BE3B860D0B56FA96C93523B732045BA0BCE2FFD4769109575CF1953BF
    IL_000c: call         void [netstandard]System.Runtime.CompilerServices.RuntimeHelpers::InitializeArray(class [netstandard]System.Array, valuetype [netstandard]System.RuntimeFieldHandle)
    IL_0011: call         valuetype Datadog.Trace.VendoredMicrosoftCode.System.ReadOnlySpan`1<!0/*unsigned int8*/> valuetype Datadog.Trace.VendoredMicrosoftCode.System.ReadOnlySpan`1<unsigned int8>::op_Implicit(!0/*unsigned int8*/[])
    IL_0016: ret

  } // end of method SqlTags::get_DbTypeBytes
```

This is... Bad 😅 And it explains the _significant_ serialization
overhead identified in #7882 for .NET Framework. I also confirmed this
applies to all <.NET Core 3.1 too (because we compile for .NET Standard)
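
A rough way to see the difference outside of a full benchmark is to count the bytes allocated on the current thread. This is a minimal sketch, not the benchmark used in this PR: `AllocationCheck`, `FreshBytes`, and `CachedBytes` are hypothetical names, and `FreshBytes` returns a plain `byte[]` as a stand-in for the unoptimized vendored-span path. It assumes a runtime that has `GC.GetAllocatedBytesForCurrentThread` (.NET Core 3.0+ or .NET Framework 4.8).

```csharp
using System;

public static class AllocationCheck
{
    // Allocated once, when the type is initialized.
    private static readonly byte[] CachedBytes = { 167, 100, 98, 46, 116, 121, 112, 101 };

    // Results are stored in a static field so the JIT cannot elide the arrays.
    private static byte[] _sink = Array.Empty<byte>();

    // Stand-in for the unoptimized path: a new array on every read.
    private static byte[] FreshBytes => new byte[] { 167, 100, 98, 46, 116, 121, 112, 101 };

    public static long MeasureFresh(int iterations)
    {
        _sink = FreshBytes; // warm up before measuring
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            _sink = FreshBytes;
        }

        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static long MeasureCached(int iterations)
    {
        _sink = CachedBytes; // warm up before measuring
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            _sink = CachedBytes;
        }

        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static void Main()
    {
        Console.WriteLine($"fresh array per read: {MeasureFresh(10_000)} bytes");
        Console.WriteLine($"cached static field:  {MeasureCached(10_000)} bytes");
    }
}
```

The fresh-array loop should report allocations that grow with the iteration count, while the cached field should report none. The same pattern shows up in the benchmark results for this change.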


| Method | Runtime | Mean | Allocated | Alloc Ratio |
| -------------------------- | -------------------- | -------: | --------: | ----------: |
| WriteEnrichedTraces_Before | .NET 6.0 | 488.9 us | 110 B | 0.001 |
| WriteEnrichedTraces_Before | .NET Framework 4.7.2 | 703.3 us | 112537 B | 1.000 |
| | | | | |
| WriteEnrichedTraces_After | .NET 6.0 | 469.1 us | 105 B | 0.50 |
| WriteEnrichedTraces_After | .NET Framework 4.7.2 | 703.4 us | 208 B | 1.00 |

## Implementation details

The fix is to just do what we were doing before #5298 introduced this
regression 😄 i.e. generate code like this:

```csharp
#if NETCOREAPP
    private static ReadOnlySpan<byte> DbTypeBytes => new byte[] { 167, 100, 98, 46, 116, 121, 112, 101 };
#else
    private static readonly byte[] DbTypeBytes = new byte[] { 167, 100, 98, 46, 116, 121, 112, 101 };
#endif
```
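
One nice property of this shape is that call sites don't need their own `#if`: a `byte[]` converts implicitly to `ReadOnlySpan<byte>`, and that conversion only wraps the existing array rather than copying it. The following is a sketch under assumptions, not the generated code: `SqlTagsSketch` and `WriteKey` are hypothetical, and it uses the real `System.ReadOnlySpan<T>` (which needs the System.Memory package outside .NET Core) where the tracer uses its vendored type.

```csharp
using System;

public static class SqlTagsSketch
{
#if NETCOREAPP
    // The compiler can point the span directly at data embedded in the assembly.
    private static ReadOnlySpan<byte> DbTypeBytes => new byte[] { 167, 100, 98, 46, 116, 121, 112, 101 };
#else
    // Allocated once at type initialization and reused for every call.
    private static readonly byte[] DbTypeBytes = new byte[] { 167, 100, 98, 46, 116, 121, 112, 101 };
#endif

    // Hypothetical call site: copies the pre-encoded key into an output
    // buffer at the given offset and returns the number of bytes written.
    public static int WriteKey(byte[] destination, int offset)
    {
        // Identity on NETCOREAPP; elsewhere an implicit array-to-span
        // conversion that wraps the cached array without allocating.
        ReadOnlySpan<byte> key = DbTypeBytes;
        key.CopyTo(new Span<byte>(destination, offset, key.Length));
        return key.Length;
    }

    public static void Main()
    {
        byte[] buffer = new byte[16];
        int written = WriteKey(buffer, 0);
        Console.WriteLine($"wrote {written} bytes");
    }
}
```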

## Test coverage

This is all covered by existing tests, and the new benchmark shows the
improvement.

## Other details

I found a couple of other places in the vendored code that have the same
issue, and fixed them directly in the code. However, this is not ideal:
if we re-vendor, we'll clobber these updates, so we'll need to update
the vendoring code too.

Labels

type:performance Performance, speed, latency, resource usage (CPU, memory)

3 participants