Improve execution benchmark comment post #7570

Merged
andrewlock merged 7 commits into master from andrew/fix-execution-bench-reports
Nov 3, 2025

@andrewlock (Member) commented Sep 26, 2025

Summary of changes

  • Improve the PR comment report so it doesn't take as much real-estate
  • Fix the graphs that were broken recently.

Reason for change

The charts currently look a bit borked and take up a load of space without being useful.

Implementation details

Threw the issue to Claude Code and iterated a bit. Overall:

  • Fixed the charts (there were issues with the dateFormat and axisFormat; it looks like something changed on GitHub's side to cause this)
  • Hide all the charts by default (so they take up less space). Had to put each one in a separate <details> element, otherwise GitHub's rendering doesn't work properly
  • Produce some comparison tables for showing more data than before (basically more details)
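
As a rough illustration of the chart changes listed above, here is a minimal Python sketch that wraps each Mermaid gantt chart in its own `<details>` element, using the `dateFormat` and `axisFormat` values that appear in the posted report. The function names `render_chart` and `render_report` and the input shape are hypothetical; the actual report generator in this PR is not shown here and may be structured quite differently.

```python
FENCE = "`" * 3  # built indirectly so this example can itself sit in a fenced block


def render_chart(title, sections):
    """Return one collapsed Mermaid gantt chart as GitHub-flavoured Markdown.

    `sections` maps a section name to a list of (label, start_ms, end_ms)
    tuples. This input shape is an assumption made for the sketch.
    """
    lines = [
        "<details>",
        f"<summary>{title}</summary>",
        "",  # blank line so the fence below is parsed as Markdown
        FENCE + "mermaid",
        "gantt",
        f"    title Execution time (ms) {title}",
        "    dateFormat  x",   # bar values are plain millisecond numbers
        "    axisFormat %Q",   # label the axis in milliseconds
        "    todayMarker off",
    ]
    for name, bars in sections.items():
        lines.append(f"    section {name}")
        for label, start, end in bars:
            lines.append(f"    {label}  : {start}, {end}")
        lines.append("")
    lines += [FENCE, "</details>"]
    return "\n".join(lines)


def render_report(charts):
    """Join charts, each in a separate <details> element, with blank lines."""
    return "\n\n".join(render_chart(title, sections) for title, sections in charts)
```

Keeping one `<details>` per chart follows the note above that GitHub's rendering does not work properly when several charts share a single collapsed element.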

I looked at the code somewhat, but ultimately it looks good enough to me I think. It's a bit verbose, but this is very non-critical stuff 😅

Test coverage

This is the test really 🤷‍♂️ Instead of something that looks like this, with 8 graphs taking up space:

image

The default view is more like this:

image

With more details if you expand the table:

image

And the charts are working again (and somewhat simplified)

image

Other details

https://datadoghq.atlassian.net/browse/LANGPLAT-817
https://datadoghq.atlassian.net/browse/LANGPLAT-893

@andrewlock added the area:builds label (project files, build scripts, pipelines, versioning, releases, packages) Sep 26, 2025
@dd-trace-dotnet-ci-bot (bot) commented Sep 26, 2025

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing This PR (7570) and master.

✅ No regressions detected - check the details below

Full Metrics Comparison

FakeDbCommand

| Metric | Master (Mean ± 95% CI) | Current (Mean ± 95% CI) | Change | Status |
|---|---|---|---|---|
| **.NET Framework 4.8 - Baseline** | | | | |
| duration | 68.07 ± (68.07 - 68.35) ms | 67.93 ± (67.91 - 68.11) ms | -0.2% | |
| **.NET Framework 4.8 - Bailout** | | | | |
| duration | 71.67 ± (71.60 - 71.81) ms | 71.67 ± (71.65 - 71.91) ms | -0.0% | |
| **.NET Framework 4.8 - CallTarget+Inlining+NGEN** | | | | |
| duration | 1043.41 ± (1042.89 - 1048.73) ms | 1046.83 ± (1048.57 - 1056.14) ms | +0.3% | ✅⬆️ |
| **.NET Core 3.1 - Baseline** | | | | |
| process.internal_duration_ms | 21.95 ± (21.91 - 21.98) ms | 21.98 ± (21.95 - 22.01) ms | +0.2% | ✅⬆️ |
| process.time_to_main_ms | 78.81 ± (78.63 - 78.99) ms | 78.76 ± (78.60 - 78.92) ms | -0.1% | |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 10.90 ± (10.89 - 10.90) MB | 10.88 ± (10.87 - 10.88) MB | -0.2% | |
| runtime.dotnet.threads.count | 12 ± (12 - 12) | 12 ± (12 - 12) | +0.0% | |
| **.NET Core 3.1 - Bailout** | | | | |
| process.internal_duration_ms | 21.90 ± (21.86 - 21.93) ms | 21.90 ± (21.87 - 21.94) ms | +0.0% | ✅⬆️ |
| process.time_to_main_ms | 79.93 ± (79.81 - 80.04) ms | 79.88 ± (79.78 - 79.99) ms | -0.1% | |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 10.93 ± (10.93 - 10.93) MB | 10.92 ± (10.91 - 10.92) MB | -0.1% | |
| runtime.dotnet.threads.count | 13 ± (13 - 13) | 13 ± (13 - 13) | +0.0% | |
| **.NET Core 3.1 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 208.45 ± (207.17 - 209.74) ms | 210.09 ± (208.90 - 211.27) ms | +0.8% | ✅⬆️ |
| process.time_to_main_ms | 513.57 ± (512.94 - 514.20) ms | 515.38 ± (514.81 - 515.95) ms | +0.4% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 52.75 ± (52.73 - 52.77) MB | 52.66 ± (52.64 - 52.68) MB | -0.2% | |
| runtime.dotnet.threads.count | 28 ± (28 - 28) | 28 ± (28 - 28) | +0.0% | |
| **.NET 6 - Baseline** | | | | |
| process.internal_duration_ms | 20.95 ± (20.93 - 20.97) ms | 20.62 ± (20.60 - 20.64) ms | -1.6% | |
| process.time_to_main_ms | 68.36 ± (68.22 - 68.51) ms | 68.22 ± (68.08 - 68.36) ms | -0.2% | |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 10.59 ± (10.58 - 10.59) MB | 10.62 ± (10.61 - 10.62) MB | +0.3% | ✅⬆️ |
| runtime.dotnet.threads.count | 10 ± (10 - 10) | 10 ± (10 - 10) | +0.0% | |
| **.NET 6 - Bailout** | | | | |
| process.internal_duration_ms | 20.85 ± (20.82 - 20.88) ms | 20.65 ± (20.62 - 20.67) ms | -1.0% | |
| process.time_to_main_ms | 68.99 ± (68.94 - 69.04) ms | 68.91 ± (68.85 - 68.97) ms | -0.1% | |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 10.64 ± (10.63 - 10.64) MB | 10.65 ± (10.65 - 10.66) MB | +0.2% | ✅⬆️ |
| runtime.dotnet.threads.count | 11 ± (11 - 11) | 11 ± (11 - 11) | +0.0% | |
| **.NET 6 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 196.41 ± (195.39 - 197.43) ms | 198.82 ± (196.52 - 201.13) ms | +1.2% | ✅⬆️ |
| process.time_to_main_ms | 482.63 ± (482.01 - 483.24) ms | 484.03 ± (483.36 - 484.69) ms | +0.3% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 51.46 ± (51.39 - 51.53) MB | 51.43 ± (51.35 - 51.50) MB | -0.1% | |
| runtime.dotnet.threads.count | 28 ± (28 - 28) | 28 ± (28 - 28) | +0.0% | ✅⬆️ |
| **.NET 8 - Baseline** | | | | |
| process.internal_duration_ms | 18.95 ± (18.92 - 18.98) ms | 18.94 ± (18.91 - 18.96) ms | -0.1% | |
| process.time_to_main_ms | 67.24 ± (67.12 - 67.36) ms | 67.30 ± (67.20 - 67.40) ms | +0.1% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 7.65 ± (7.64 - 7.66) MB | 7.65 ± (7.64 - 7.66) MB | +0.0% | ✅⬆️ |
| runtime.dotnet.threads.count | 10 ± (10 - 10) | 10 ± (10 - 10) | +0.0% | |
| **.NET 8 - Bailout** | | | | |
| process.internal_duration_ms | 18.95 ± (18.93 - 18.98) ms | 18.86 ± (18.82 - 18.89) ms | -0.5% | |
| process.time_to_main_ms | 68.28 ± (68.21 - 68.35) ms | 68.42 ± (68.30 - 68.54) ms | +0.2% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 7.66 ± (7.66 - 7.67) MB | 7.69 ± (7.68 - 7.69) MB | +0.3% | ✅⬆️ |
| runtime.dotnet.threads.count | 11 ± (11 - 11) | 11 ± (11 - 11) | +0.0% | |
| **.NET 8 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 177.28 ± (176.40 - 178.16) ms | 177.23 ± (176.24 - 178.22) ms | -0.0% | |
| process.time_to_main_ms | 460.64 ± (460.07 - 461.21) ms | 457.56 ± (456.84 - 458.29) ms | -0.7% | |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 38.90 ± (38.87 - 38.94) MB | 38.70 ± (38.67 - 38.74) MB | -0.5% | |
| runtime.dotnet.threads.count | 27 ± (27 - 27) | 27 ± (27 - 27) | +0.1% | ✅⬆️ |

HttpMessageHandler

| Metric | Master (Mean ± 95% CI) | Current (Mean ± 95% CI) | Change | Status |
|---|---|---|---|---|
| **.NET Framework 4.8 - Baseline** | | | | |
| duration | 191.17 ± (191.09 - 191.82) ms | 190.12 ± (190.23 - 190.85) ms | -0.6% | |
| **.NET Framework 4.8 - Bailout** | | | | |
| duration | 194.81 ± (194.58 - 195.20) ms | 194.95 ± (195.12 - 195.98) ms | +0.1% | ✅⬆️ |
| **.NET Framework 4.8 - CallTarget+Inlining+NGEN** | | | | |
| duration | 1148.28 ± (1153.76 - 1162.26) ms | 1151.92 ± (1155.50 - 1164.52) ms | +0.3% | ✅⬆️ |
| **.NET Core 3.1 - Baseline** | | | | |
| process.internal_duration_ms | 186.88 ± (186.46 - 187.30) ms | 185.74 ± (185.41 - 186.07) ms | -0.6% | |
| process.time_to_main_ms | 80.32 ± (80.12 - 80.52) ms | 79.91 ± (79.70 - 80.12) ms | -0.5% | |
| runtime.dotnet.exceptions.count | 3 ± (3 - 3) | 3 ± (3 - 3) | +0.0% | |
| runtime.dotnet.mem.committed | 16.11 ± (16.08 - 16.14) MB | 16.08 ± (16.05 - 16.11) MB | -0.2% | |
| runtime.dotnet.threads.count | 20 ± (20 - 20) | 20 ± (20 - 20) | -0.2% | |
| **.NET Core 3.1 - Bailout** | | | | |
| process.internal_duration_ms | 186.90 ± (186.51 - 187.30) ms | 185.37 ± (185.03 - 185.72) ms | -0.8% | |
| process.time_to_main_ms | 81.76 ± (81.57 - 81.94) ms | 81.09 ± (80.93 - 81.24) ms | -0.8% | |
| runtime.dotnet.exceptions.count | 3 ± (3 - 3) | 3 ± (3 - 3) | +0.0% | |
| runtime.dotnet.mem.committed | 16.15 ± (16.13 - 16.18) MB | 16.11 ± (16.08 - 16.14) MB | -0.3% | |
| runtime.dotnet.threads.count | 21 ± (21 - 21) | 21 ± (21 - 21) | -0.8% | |
| **.NET Core 3.1 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 395.55 ± (392.87 - 398.23) ms | 399.06 ± (396.21 - 401.91) ms | +0.9% | ✅⬆️ |
| process.time_to_main_ms | 515.66 ± (515.08 - 516.23) ms | 515.55 ± (514.89 - 516.20) ms | -0.0% | |
| runtime.dotnet.exceptions.count | 3 ± (3 - 3) | 3 ± (3 - 3) | +0.0% | |
| runtime.dotnet.mem.committed | 63.08 ± (62.93 - 63.23) MB | 63.28 ± (63.14 - 63.42) MB | +0.3% | ✅⬆️ |
| runtime.dotnet.threads.count | 29 ± (29 - 29) | 29 ± (29 - 29) | -0.1% | |
| **.NET 6 - Baseline** | | | | |
| process.internal_duration_ms | 190.11 ± (189.76 - 190.46) ms | 189.06 ± (188.79 - 189.33) ms | -0.6% | |
| process.time_to_main_ms | 69.38 ± (69.26 - 69.51) ms | 68.82 ± (68.66 - 68.98) ms | -0.8% | |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 16.07 ± (15.93 - 16.21) MB | 15.55 ± (15.37 - 15.73) MB | -3.2% | |
| runtime.dotnet.threads.count | 18 ± (18 - 18) | 17 ± (17 - 18) | -3.6% | |
| **.NET 6 - Bailout** | | | | |
| process.internal_duration_ms | 189.51 ± (189.29 - 189.73) ms | 189.10 ± (188.82 - 189.38) ms | -0.2% | |
| process.time_to_main_ms | 70.26 ± (70.17 - 70.35) ms | 69.74 ± (69.63 - 69.85) ms | -0.7% | |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 16.00 ± (15.85 - 16.15) MB | 15.71 ± (15.54 - 15.89) MB | -1.8% | |
| runtime.dotnet.threads.count | 19 ± (19 - 19) | 19 ± (18 - 19) | -1.4% | |
| **.NET 6 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 411.85 ± (408.77 - 414.93) ms | 409.21 ± (406.19 - 412.24) ms | -0.6% | |
| process.time_to_main_ms | 486.52 ± (485.86 - 487.17) ms | 484.66 ± (484.08 - 485.24) ms | -0.4% | |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 62.15 ± (62.00 - 62.29) MB | 61.80 ± (61.65 - 61.94) MB | -0.6% | |
| runtime.dotnet.threads.count | 29 ± (29 - 29) | 30 ± (29 - 30) | +0.2% | ✅⬆️ |
| **.NET 8 - Baseline** | | | | |
| process.internal_duration_ms | 188.18 ± (187.92 - 188.44) ms | 187.70 ± (187.45 - 187.96) ms | -0.3% | |
| process.time_to_main_ms | 69.06 ± (68.87 - 69.24) ms | 68.27 ± (68.14 - 68.41) ms | -1.1% | |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 11.75 ± (11.68 - 11.82) MB | 11.67 ± (11.59 - 11.76) MB | -0.7% | |
| runtime.dotnet.threads.count | 18 ± (18 - 18) | 18 ± (17 - 18) | -2.1% | |
| **.NET 8 - Bailout** | | | | |
| process.internal_duration_ms | 187.39 ± (187.17 - 187.60) ms | 187.47 ± (187.24 - 187.70) ms | +0.0% | ✅⬆️ |
| process.time_to_main_ms | 69.85 ± (69.77 - 69.94) ms | 69.50 ± (69.42 - 69.58) ms | -0.5% | |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 11.84 ± (11.81 - 11.87) MB | 11.69 ± (11.60 - 11.78) MB | -1.3% | |
| runtime.dotnet.threads.count | 19 ± (19 - 19) | 19 ± (18 - 19) | -2.9% | |
| **.NET 8 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 356.28 ± (354.69 - 357.86) ms | 354.76 ± (353.30 - 356.21) ms | -0.4% | |
| process.time_to_main_ms | 462.25 ± (461.67 - 462.83) ms | 457.44 ± (456.75 - 458.13) ms | -1.0% | |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 50.49 ± (50.44 - 50.54) MB | 50.40 ± (50.37 - 50.44) MB | -0.2% | |
| runtime.dotnet.threads.count | 29 ± (29 - 29) | 29 ± (29 - 29) | +0.1% | ✅⬆️ |

Comparison explanation

Execution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are highlighted in **red**. The following thresholds were used for comparing the execution times:

  • Welch's t-test with a significance level of 5%
  • Only results indicating a difference greater than 5% and 5 ms are considered.
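
To make the two thresholds above concrete, here is a minimal Python sketch of how such a check could be combined. It is not the code used by this report: the function names and the raw-sample inputs are hypothetical, and the p-value uses a normal approximation to the t distribution, which is an assumption that only holds reasonably well for larger sample counts.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_p_value(a, b):
    """Approximate two-sided p-value for Welch's unequal-variance test.

    Needs at least two samples per side. The t statistic is compared
    against a normal distribution, an approximation for large samples.
    """
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return 1.0 if mean(a) == mean(b) else 0.0
    t = (mean(b) - mean(a)) / se
    return 2 * (1 - NormalDist().cdf(abs(t)))


def is_regression(master, current, alpha=0.05, min_pct=5.0, min_abs_ms=5.0):
    """Flag a regression only when the PR is slower AND all thresholds agree."""
    base = mean(master)
    diff = mean(current) - base
    if base <= 0 or diff <= 0:
        return False  # faster or unchanged results are never regressions
    if diff / base * 100 <= min_pct or diff <= min_abs_ms:
        return False  # too small to matter: must exceed both 5% and 5 ms
    return welch_p_value(master, current) < alpha
```

With these rules, a 1 ms slowdown on a 100 ms run is ignored however statistically significant it is, because it fails both the 5% and the 5 ms threshold.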

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).
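
The report does not spell out how the p99 interval is derived. One plausible reading, assuming the timings are roughly normally distributed, is a symmetric band around the mean scaled by the sample standard deviation. The sketch below follows that assumption; the function name `p99_interval` is hypothetical.

```python
from statistics import NormalDist, mean, stdev


def p99_interval(samples):
    """Return (low, mean, high) for a two-sided 99% band around the mean.

    Assumes normally distributed samples; z is roughly 2.576.
    """
    z = NormalDist().inv_cdf(0.995)
    m = mean(samples)
    s = stdev(samples)
    return m - z * s, m, m + z * s
```

The low and high values would then be the two numbers that end each gantt bar, with the mean shown in the bar's label.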

Duration charts
FakeDbCommand (.NET Framework 4.8)
gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7570) - mean (68ms)  : 67, 69
    master - mean (68ms)  : 67, 70

    section Bailout
    This PR (7570) - mean (72ms)  : 70, 73
    master - mean (72ms)  : 71, 73

    section CallTarget+Inlining+NGEN
    This PR (7570) - mean (1,052ms)  : 998, 1106
    master - mean (1,046ms)  : 1004, 1087

FakeDbCommand (.NET Core 3.1)
gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7570) - mean (106ms)  : 104, 109
    master - mean (106ms)  : 104, 108

    section Bailout
    This PR (7570) - mean (107ms)  : 105, 108
    master - mean (107ms)  : 105, 109

    section CallTarget+Inlining+NGEN
    This PR (7570) - mean (751ms)  : 723, 780
    master - mean (748ms)  : 725, 771

FakeDbCommand (.NET 6)
gantt
    title Execution time (ms) FakeDbCommand (.NET 6)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7570) - mean (94ms)  : 92, 96
    master - mean (94ms)  : 92, 96

    section Bailout
    This PR (7570) - mean (94ms)  : 93, 95
    master - mean (95ms)  : 93, 96

    section CallTarget+Inlining+NGEN
    This PR (7570) - mean (710ms)  : 671, 750
    master - mean (708ms)  : 679, 738

FakeDbCommand (.NET 8)
gantt
    title Execution time (ms) FakeDbCommand (.NET 8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7570) - mean (92ms)  : 90, 95
    master - mean (92ms)  : 90, 95

    section Bailout
    This PR (7570) - mean (94ms)  : 90, 97
    master - mean (93ms)  : 92, 94

    section CallTarget+Inlining+NGEN
    This PR (7570) - mean (664ms)  : 647, 680
    master - mean (666ms)  : 652, 680

HttpMessageHandler (.NET Framework 4.8)
gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7570) - mean (191ms)  : 188, 193
    master - mean (191ms)  : 188, 195

    section Bailout
    This PR (7570) - mean (196ms)  : 191, 200
    master - mean (195ms)  : 192, 198

    section CallTarget+Inlining+NGEN
    This PR (7570) - mean (1,160ms)  : 1094, 1226
    master - mean (1,158ms)  : 1098, 1218

HttpMessageHandler (.NET Core 3.1)
gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7570) - mean (274ms)  : 268, 280
    master - mean (276ms)  : 268, 283

    section Bailout
    This PR (7570) - mean (275ms)  : 271, 278
    master - mean (277ms)  : 272, 282

    section CallTarget+Inlining+NGEN
    This PR (7570) - mean (946ms)  : 904, 988
    master - mean (954ms)  : 906, 1002

HttpMessageHandler (.NET 6)
gantt
    title Execution time (ms) HttpMessageHandler (.NET 6)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7570) - mean (266ms)  : 262, 270
    master - mean (268ms)  : 262, 273

    section Bailout
    This PR (7570) - mean (267ms)  : 264, 270
    master - mean (268ms)  : 265, 271

    section CallTarget+Inlining+NGEN
    This PR (7570) - mean (923ms)  : 879, 967
    master - mean (936ms)  : 881, 991

HttpMessageHandler (.NET 8)
gantt
    title Execution time (ms) HttpMessageHandler (.NET 8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (7570) - mean (265ms)  : 260, 270
    master - mean (267ms)  : 263, 271

    section Bailout
    This PR (7570) - mean (266ms)  : 263, 269
    master - mean (267ms)  : 264, 270

    section CallTarget+Inlining+NGEN
    This PR (7570) - mean (844ms)  : 827, 861
    master - mean (850ms)  : 835, 866


@andrewlock marked this pull request as ready for review October 30, 2025 13:29
@andrewlock requested a review from a team as a code owner October 30, 2025 13:29
@bouwkast (Collaborator) left a comment

Didn't really look at the code, but the new comment looks good to me. Unfortunate that it still has so many regressions, as it is noisy, but not as noisy as before :)

@andrewlock (Member, Author)

> Didn't really look at the code, but the new comment looks good to me. Unfortunate that it still has so many regressions, as it is noisy, but not as noisy as before :)

Yeah, it's really flagging a lot of regressions there @bouwkast, and a lot of those "regressions" are in the baseline scenarios which, you know, aren't regressions; that's just variability 😅 I'm going to double-check that the numbers in the tables are actually the values we think should be there by checking the source JSON, and then fix the regression calculation. Based on the graphs, there's only actually a regression in the "bailout" scenarios here. That's obviously still noisy, but it's closer to correct...

As for why there are apparent regressions here... 🤷‍♂️ 😬

@andrewlock force-pushed the andrew/fix-execution-bench-reports branch from 62518be to 55b37c1 November 3, 2025 13:02
@datadog-official (bot) commented Nov 3, 2025

⚠️ Tests

⚠️ Warnings

❄️ 2 New flaky tests detected

SubmitsOtlpMetrics from Datadog.Trace.ClrProfiler.IntegrationTests.OpenTelemetrySdkTests (Datadog)
Expected metricsData not to be empty.
MustHaveValidTagsForEveryMetric from Datadog.Trace.Tests.Telemetry.Metrics.TelemetryMetricExtensionsTests (Datadog)
Object reference not set to an instance of an object.

ℹ️ Info

🧪 All tests passed

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 55b37c1

@andrewlock (Member, Author)

OK, fixed the incorrect regression calculations @bouwkast, and compressed the format even further 🙂

@andrewlock enabled auto-merge (squash) November 3, 2025 15:17
@andrewlock merged commit 60c7bff into master Nov 3, 2025
63 of 64 checks passed
@andrewlock deleted the andrew/fix-execution-bench-reports branch November 3, 2025 15:59
@github-actions (bot) added this to the vNext-v3 milestone Nov 3, 2025