[Feature Flags] Add flag evaluation metrics via OpenFeature hook#8367

Merged
sameerank merged 36 commits into master from sameerank/FFL-1946/add-flag-eval-metrics
May 8, 2026
sameerank (Contributor) commented Mar 25, 2026

Summary of changes

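Adds flag evaluation metrics support to the Datadog OpenFeature provider. Each time a feature flag is evaluated, the feature_flag.evaluations counter is incremented, tagged with the flag key, the resulting variant, and the evaluation reason (plus error type and allocation key when available).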

Reason for change

Part of the Feature Flags Evaluation (FFE) initiative to provide visibility into flag evaluations across all Datadog server SDKs. This brings parity with implementations in dd-trace-py and dd-trace-go.

Implementation details

  • FlagEvalMetrics.cs: Core metrics recording class using System.Diagnostics.Metrics

    • Meter name: Datadog.FeatureFlags.OpenFeature
    • Metric: feature_flag.evaluations (Counter)
    • Unit: {evaluation}
    • Tags: feature_flag.key, feature_flag.result.variant, feature_flag.result.reason, error.type (optional), feature_flag.result.allocation_key (optional)
  • FlagEvalMetricsHook.cs: OpenFeature hook that implements FinallyAsync to record metrics after each evaluation completes

    • Converts reason to lowercase for consistency
    • Maps ErrorType enum to snake_case strings
    • Extracts allocation key from flag metadata when present
  • DatadogProvider.cs: Registers the metrics hook with the OpenFeature API

  • Updated OpenFeature SDK dependency from 2.0.0 to 2.3.0

  • Package version bumped to 2.3.0 to match OpenFeature SDK dependency version. This signals the breaking changes from OpenFeature 2.3.0:

    • .NET 6 support dropped (now requires .NET 8+)
    • Hook finally signature changed to include evaluation details

Conditional compilation: Only enabled for .NET 6+ (#if NET6_0_OR_GREATER) since System.Diagnostics.Metrics requires .NET 6+.
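The normalization steps the hook performs (lowercasing the reason, mapping the error type to snake_case, and attaching optional tags only when present) can be sketched roughly as follows. This is an illustration in Python rather than the C# from this PR; `to_snake_case`, `build_tags`, and the sample enum-style names are hypothetical, and only the tag names come from the description above.

```python
import re
from typing import Dict, Optional


def to_snake_case(name: str) -> str:
    """Convert a PascalCase enum-style name (e.g. 'FlagNotFound') to snake_case."""
    # Insert an underscore before every uppercase letter except the first, then lowercase.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def build_tags(
    flag_key: str,
    variant: str,
    reason: str,
    error_type: Optional[str] = None,      # hypothetical: enum member name as a string
    allocation_key: Optional[str] = None,  # hypothetical: value read from flag metadata
) -> Dict[str, str]:
    """Assemble the tag set for one feature_flag.evaluations data point."""
    tags = {
        "feature_flag.key": flag_key,
        "feature_flag.result.variant": variant,
        # Reasons are lowercased for consistency across SDKs.
        "feature_flag.result.reason": reason.lower(),
    }
    # Optional tags are attached only when a value is present.
    if error_type is not None:
        tags["error.type"] = to_snake_case(error_type)
    if allocation_key is not None:
        tags["feature_flag.result.allocation_key"] = allocation_key
    return tags
```

Leaving the optional tags out entirely, instead of emitting empty strings, keeps the number of distinct tag combinations lower for successful evaluations.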

Test coverage

  • Feature is validated through system-tests (tests/ffe/test_flag_eval_metrics.py)
  • Also checked in https://github.com/DataDog/ffe-dogfooding/pull/56
  • Unit tests were explored but deferred due to type conflicts between shared source files in Datadog.FeatureFlags.OpenFeature and Datadog.Trace assemblies

Other details

Jira: https://datadoghq.atlassian.net/browse/FFL-1946

Related PRs:

sameerank added the docker_image_artifacts label ("Use to label PRs for which you would need a Docker Image created for.") on Mar 25, 2026
sameerank force-pushed the sameerank/FFL-1946/add-flag-eval-metrics branch from 78be069 to f240989 on March 25, 2026 05:55
dd-trace-dotnet-ci-bot Bot commented Mar 25, 2026

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing This PR (8367) and master.

✅ No regressions detected - check the details below

Full Metrics Comparison

FakeDbCommand

| Metric | Master (Mean ± 95% CI) | Current (Mean ± 95% CI) | Change | Status |
|---|---|---|---|---|
| **.NET Framework 4.8 - Baseline** | | | | |
| duration | 72.93 ± (72.86 - 73.22) ms | 72.75 ± (72.79 - 73.18) ms | -0.2% | |
| **.NET Framework 4.8 - Bailout** | | | | |
| duration | 77.49 ± (77.30 - 77.66) ms | 78.21 ± (78.17 - 78.66) ms | +0.9% | ✅⬆️ |
| **.NET Framework 4.8 - CallTarget+Inlining+NGEN** | | | | |
| duration | 1133.33 ± (1131.91 - 1138.31) ms | 1124.60 ± (1122.99 - 1130.28) ms | -0.8% | |
| **.NET Core 3.1 - Baseline** | | | | |
| process.internal_duration_ms | 22.42 ± (22.38 - 22.45) ms | 22.74 ± (22.68 - 22.80) ms | +1.4% | ✅⬆️ |
| process.time_to_main_ms | 84.68 ± (84.48 - 84.89) ms | 86.44 ± (86.14 - 86.73) ms | +2.1% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 10.92 ± (10.92 - 10.92) MB | 10.92 ± (10.91 - 10.92) MB | -0.0% | |
| runtime.dotnet.threads.count | 12 ± (12 - 12) | 12 ± (12 - 12) | +0.0% | |
| **.NET Core 3.1 - Bailout** | | | | |
| process.internal_duration_ms | 22.75 ± (22.69 - 22.80) ms | 22.49 ± (22.45 - 22.53) ms | -1.1% | |
| process.time_to_main_ms | 88.83 ± (88.52 - 89.14) ms | 87.13 ± (86.85 - 87.41) ms | -1.9% | |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 10.95 ± (10.95 - 10.96) MB | 10.95 ± (10.95 - 10.96) MB | +0.0% | ✅⬆️ |
| runtime.dotnet.threads.count | 13 ± (13 - 13) | 13 ± (13 - 13) | +0.0% | |
| **.NET Core 3.1 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 205.21 ± (204.64 - 205.78) ms | 205.85 ± (205.35 - 206.34) ms | +0.3% | ✅⬆️ |
| process.time_to_main_ms | 567.05 ± (565.72 - 568.38) ms | 570.77 ± (569.32 - 572.21) ms | +0.7% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 49.70 ± (49.66 - 49.74) MB | 49.80 ± (49.77 - 49.83) MB | +0.2% | ✅⬆️ |
| runtime.dotnet.threads.count | 28 ± (28 - 28) | 28 ± (28 - 28) | +0.1% | ✅⬆️ |
| **.NET 6 - Baseline** | | | | |
| process.internal_duration_ms | 21.51 ± (21.45 - 21.56) ms | 21.30 ± (21.25 - 21.36) ms | -1.0% | |
| process.time_to_main_ms | 75.84 ± (75.55 - 76.14) ms | 74.64 ± (74.36 - 74.91) ms | -1.6% | |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 10.60 ± (10.60 - 10.60) MB | 10.63 ± (10.62 - 10.63) MB | +0.2% | ✅⬆️ |
| runtime.dotnet.threads.count | 10 ± (10 - 10) | 10 ± (10 - 10) | +0.0% | |
| **.NET 6 - Bailout** | | | | |
| process.internal_duration_ms | 21.06 ± (21.02 - 21.09) ms | 21.26 ± (21.21 - 21.32) ms | +1.0% | ✅⬆️ |
| process.time_to_main_ms | 74.52 ± (74.33 - 74.71) ms | 76.05 ± (75.80 - 76.29) ms | +2.0% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 10.73 ± (10.73 - 10.73) MB | 10.75 ± (10.74 - 10.75) MB | +0.2% | ✅⬆️ |
| runtime.dotnet.threads.count | 11 ± (11 - 11) | 11 ± (11 - 11) | +0.0% | |
| **.NET 6 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 353.55 ± (351.23 - 355.87) ms | 352.79 ± (350.62 - 354.96) ms | -0.2% | |
| process.time_to_main_ms | 564.14 ± (562.72 - 565.57) ms | 562.28 ± (560.82 - 563.75) ms | -0.3% | |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 51.03 ± (51.00 - 51.05) MB | 51.18 ± (51.15 - 51.20) MB | +0.3% | ✅⬆️ |
| runtime.dotnet.threads.count | 28 ± (28 - 28) | 28 ± (28 - 28) | -0.1% | |
| **.NET 8 - Baseline** | | | | |
| process.internal_duration_ms | 19.82 ± (19.77 - 19.87) ms | 19.60 ± (19.56 - 19.64) ms | -1.1% | |
| process.time_to_main_ms | 75.09 ± (74.80 - 75.38) ms | 73.45 ± (73.26 - 73.63) ms | -2.2% | |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 7.67 ± (7.66 - 7.67) MB | 7.65 ± (7.65 - 7.66) MB | -0.2% | |
| runtime.dotnet.threads.count | 10 ± (10 - 10) | 10 ± (10 - 10) | +0.0% | |
| **.NET 8 - Bailout** | | | | |
| process.internal_duration_ms | 19.91 ± (19.84 - 19.97) ms | 20.08 ± (20.02 - 20.14) ms | +0.9% | ✅⬆️ |
| process.time_to_main_ms | 76.21 ± (75.97 - 76.44) ms | 77.23 ± (76.93 - 77.54) ms | +1.3% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 7.73 ± (7.72 - 7.73) MB | 7.70 ± (7.69 - 7.70) MB | -0.4% | |
| runtime.dotnet.threads.count | 11 ± (11 - 11) | 11 ± (11 - 11) | +0.0% | |
| **.NET 8 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 280.94 ± (278.22 - 283.66) ms | 288.66 ± (284.27 - 293.04) ms | +2.7% | ✅⬆️ |
| process.time_to_main_ms | 522.14 ± (520.87 - 523.42) ms | 522.29 ± (521.01 - 523.57) ms | +0.0% | ✅⬆️ |
| runtime.dotnet.exceptions.count | 0 ± (0 - 0) | 0 ± (0 - 0) | +0.0% | |
| runtime.dotnet.mem.committed | 38.00 ± (37.97 - 38.03) MB | 37.91 ± (37.88 - 37.94) MB | -0.2% | |
| runtime.dotnet.threads.count | 27 ± (27 - 27) | 27 ± (27 - 28) | +0.2% | ✅⬆️ |

HttpMessageHandler

| Metric | Master (Mean ± 95% CI) | Current (Mean ± 95% CI) | Change | Status |
|---|---|---|---|---|
| **.NET Framework 4.8 - Baseline** | | | | |
| duration | 210.62 ± (212.96 - 216.34) ms | 203.21 ± (203.60 - 204.96) ms | -3.5% | |
| **.NET Framework 4.8 - Bailout** | | | | |
| duration | 213.46 ± (214.73 - 217.22) ms | 208.71 ± (209.12 - 210.42) ms | -2.2% | |
| **.NET Framework 4.8 - CallTarget+Inlining+NGEN** | | | | |
| duration | 1318.96 ± (1317.47 - 1327.34) ms | 1271.81 ± (1269.11 - 1275.33) ms | -3.6% | |
| **.NET Core 3.1 - Baseline** | | | | |
| process.internal_duration_ms | 204.83 ± (203.72 - 205.93) ms | 198.24 ± (197.47 - 199.01) ms | -3.2% | |
| process.time_to_main_ms | 89.69 ± (89.12 - 90.27) ms | 85.84 ± (85.51 - 86.17) ms | -4.3% | |
| runtime.dotnet.exceptions.count | 3 ± (3 - 3) | 3 ± (3 - 3) | +0.0% | |
| runtime.dotnet.mem.committed | 15.94 ± (15.92 - 15.95) MB | 15.95 ± (15.93 - 15.97) MB | +0.1% | ✅⬆️ |
| runtime.dotnet.threads.count | 20 ± (20 - 20) | 20 ± (20 - 20) | -1.3% | |
| **.NET Core 3.1 - Bailout** | | | | |
| process.internal_duration_ms | 206.13 ± (204.95 - 207.31) ms | 200.87 ± (200.13 - 201.60) ms | -2.6% | |
| process.time_to_main_ms | 91.03 ± (90.49 - 91.56) ms | 88.96 ± (88.61 - 89.30) ms | -2.3% | |
| runtime.dotnet.exceptions.count | 3 ± (3 - 3) | 3 ± (3 - 3) | +0.0% | |
| runtime.dotnet.mem.committed | 15.92 ± (15.90 - 15.94) MB | 16.00 ± (15.98 - 16.02) MB | +0.5% | ✅⬆️ |
| runtime.dotnet.threads.count | 21 ± (21 - 21) | 21 ± (21 - 21) | +0.1% | ✅⬆️ |
| **.NET Core 3.1 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 394.88 ± (393.22 - 396.55) ms | 390.14 ± (388.67 - 391.60) ms | -1.2% | |
| process.time_to_main_ms | 600.86 ± (598.44 - 603.27) ms | 586.29 ± (584.61 - 587.97) ms | -2.4% | |
| runtime.dotnet.exceptions.count | 3 ± (3 - 3) | 3 ± (3 - 3) | +0.0% | |
| runtime.dotnet.mem.committed | 59.57 ± (59.42 - 59.71) MB | 59.04 ± (58.87 - 59.21) MB | -0.9% | |
| runtime.dotnet.threads.count | 30 ± (30 - 30) | 30 ± (30 - 30) | -0.2% | |
| **.NET 6 - Baseline** | | | | |
| process.internal_duration_ms | 213.91 ± (212.73 - 215.09) ms | 206.90 ± (206.14 - 207.66) ms | -3.3% | |
| process.time_to_main_ms | 79.55 ± (79.05 - 80.05) ms | 76.73 ± (76.42 - 77.05) ms | -3.5% | |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 16.16 ± (16.13 - 16.18) MB | 16.23 ± (16.21 - 16.25) MB | +0.4% | ✅⬆️ |
| runtime.dotnet.threads.count | 19 ± (19 - 20) | 19 ± (19 - 20) | -0.1% | |
| **.NET 6 - Bailout** | | | | |
| process.internal_duration_ms | 211.96 ± (210.90 - 213.03) ms | 205.70 ± (205.04 - 206.36) ms | -3.0% | |
| process.time_to_main_ms | 80.18 ± (79.73 - 80.63) ms | 77.32 ± (77.04 - 77.60) ms | -3.6% | |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 16.15 ± (16.13 - 16.17) MB | 16.32 ± (16.30 - 16.34) MB | +1.1% | ✅⬆️ |
| runtime.dotnet.threads.count | 21 ± (20 - 21) | 20 ± (20 - 21) | -0.0% | |
| **.NET 6 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 573.90 ± (569.70 - 578.10) ms | 572.82 ± (570.41 - 575.24) ms | -0.2% | |
| process.time_to_main_ms | 595.19 ± (592.96 - 597.43) ms | 583.31 ± (581.88 - 584.74) ms | -2.0% | |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 62.03 ± (61.97 - 62.10) MB | 61.94 ± (61.88 - 62.00) MB | -0.1% | |
| runtime.dotnet.threads.count | 31 ± (31 - 31) | 31 ± (31 - 31) | -0.4% | |
| **.NET 8 - Baseline** | | | | |
| process.internal_duration_ms | 212.35 ± (211.05 - 213.65) ms | 204.82 ± (204.13 - 205.51) ms | -3.5% | |
| process.time_to_main_ms | 78.27 ± (77.75 - 78.79) ms | 75.02 ± (74.72 - 75.31) ms | -4.2% | |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 11.45 ± (11.43 - 11.47) MB | 11.57 ± (11.55 - 11.59) MB | +1.1% | ✅⬆️ |
| runtime.dotnet.threads.count | 19 ± (19 - 19) | 19 ± (19 - 19) | -1.5% | |
| **.NET 8 - Bailout** | | | | |
| process.internal_duration_ms | 211.80 ± (210.48 - 213.12) ms | 204.33 ± (203.57 - 205.10) ms | -3.5% | |
| process.time_to_main_ms | 79.25 ± (78.78 - 79.72) ms | 76.51 ± (76.22 - 76.81) ms | -3.5% | |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 11.50 ± (11.48 - 11.51) MB | 11.65 ± (11.63 - 11.66) MB | +1.3% | ✅⬆️ |
| runtime.dotnet.threads.count | 20 ± (20 - 20) | 20 ± (20 - 20) | -1.1% | |
| **.NET 8 - CallTarget+Inlining+NGEN** | | | | |
| process.internal_duration_ms | 535.43 ± (527.41 - 543.45) ms | 504.02 ± (498.09 - 509.95) ms | -5.9% | |
| process.time_to_main_ms | 555.36 ± (552.74 - 557.97) ms | 537.83 ± (536.52 - 539.15) ms | -3.2% | |
| runtime.dotnet.exceptions.count | 4 ± (4 - 4) | 4 ± (4 - 4) | +0.0% | |
| runtime.dotnet.mem.committed | 51.70 ± (51.64 - 51.77) MB | 51.56 ± (51.51 - 51.60) MB | -0.3% | |
| runtime.dotnet.threads.count | 30 ± (30 - 30) | 30 ± (30 - 30) | -0.2% | |

Comparison explanation

Execution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are highlighted in **red**. The following thresholds were used for comparing the execution times:

  • Welch's t-test at a 5% significance level
  • Only results indicating a difference greater than 5% and 5 ms are considered.
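The two thresholds above combine roughly as in the sketch below. It is not the CI bot's implementation: `welch_t` and `is_regression` are hypothetical names, and the p-value uses a normal approximation to the t distribution (reasonable for larger samples) so the example stays within the standard library.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_t(master, current):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    va = variance(master) / len(master)    # squared standard error, master
    vb = variance(current) / len(current)  # squared standard error, current
    t = (mean(current) - mean(master)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(master) - 1) + vb ** 2 / (len(current) - 1))
    return t, df


def is_regression(master, current, alpha=0.05, min_rel=0.05, min_abs_ms=5.0):
    """Flag a regression only if the slowdown is significant AND exceeds both 5% and 5 ms."""
    t, _df = welch_t(master, current)
    # Assumption: two-sided p-value from the normal approximation, not the exact t distribution.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    diff_ms = mean(current) - mean(master)
    rel = diff_ms / mean(master)
    return p_value < alpha and diff_ms > min_abs_ms and rel > min_rel
```

Requiring both a relative and an absolute minimum keeps very fast samples (where 5% is a fraction of a millisecond) and very slow samples (where 5 ms is noise) from being flagged.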

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

Duration charts
FakeDbCommand (.NET Framework 4.8)
gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8367) - mean (73ms)  : 70, 76
    master - mean (73ms)  : 70, 76

    section Bailout
    This PR (8367) - mean (78ms)  : 75, 82
    master - mean (77ms)  : 75, 80

    section CallTarget+Inlining+NGEN
    This PR (8367) - mean (1,127ms)  : 1072, 1181
    master - mean (1,135ms)  : 1090, 1180

FakeDbCommand (.NET Core 3.1)
gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8367) - mean (116ms)  : 111, 122
    master - mean (114ms)  : 109, 119

    section Bailout
    This PR (8367) - mean (117ms)  : 111, 123
    master - mean (119ms)  : 113, 125

    section CallTarget+Inlining+NGEN
    This PR (8367) - mean (815ms)  : 782, 847
    master - mean (812ms)  : 791, 834

FakeDbCommand (.NET 6)
gantt
    title Execution time (ms) FakeDbCommand (.NET 6)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8367) - mean (103ms)  : 98, 107
    master - mean (104ms)  : 98, 110

    section Bailout
    This PR (8367) - mean (104ms)  : 99, 109
    master - mean (102ms)  : 99, 105

    section CallTarget+Inlining+NGEN
    This PR (8367) - mean (948ms)  : 905, 991
    master - mean (950ms)  : 902, 998

FakeDbCommand (.NET 8)
gantt
    title Execution time (ms) FakeDbCommand (.NET 8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8367) - mean (101ms)  : 97, 104
    master - mean (103ms)  : 97, 109

    section Bailout
    This PR (8367) - mean (106ms)  : 99, 112
    master - mean (104ms)  : 99, 110

    section CallTarget+Inlining+NGEN
    This PR (8367) - mean (838ms)  : 774, 901
    master - mean (834ms)  : 786, 881

HttpMessageHandler (.NET Framework 4.8)
gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8367) - mean (204ms)  : 195, 214
    master - mean (215ms)  : 189, 240

    section Bailout
    This PR (8367) - mean (210ms)  : 200, 219
    master - mean (216ms)  : 198, 234

    section CallTarget+Inlining+NGEN
    This PR (8367) - mean (1,272ms)  : 1228, 1316
    master - mean (1,322ms)  : 1267, 1378

HttpMessageHandler (.NET Core 3.1)
gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8367) - mean (297ms)  : 275, 318
    master - mean (308ms)  : 277, 339

    section Bailout
    This PR (8367) - mean (301ms)  : 285, 317
    master - mean (309ms)  : 283, 335

    section CallTarget+Inlining+NGEN
    This PR (8367) - mean (1,018ms)  : 988, 1049
    master - mean (1,043ms)  : 1000, 1086

HttpMessageHandler (.NET 6)
gantt
    title Execution time (ms) HttpMessageHandler (.NET 6)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8367) - mean (294ms)  : 278, 310
    master - mean (304ms)  : 279, 330

    section Bailout
    This PR (8367) - mean (295ms)  : 275, 315
    master - mean (304ms)  : 279, 330

    section CallTarget+Inlining+NGEN
    This PR (8367) - mean (1,188ms)  : 1137, 1239
    master - mean (1,207ms)  : 1129, 1285

HttpMessageHandler (.NET 8)
gantt
    title Execution time (ms) HttpMessageHandler (.NET 8)
    dateFormat  x
    axisFormat %Q
    todayMarker off
    section Baseline
    This PR (8367) - mean (292ms)  : 278, 306
    master - mean (305ms)  : 274, 335

    section Bailout
    This PR (8367) - mean (295ms)  : 272, 317
    master - mean (304ms)  : 276, 332

    section CallTarget+Inlining+NGEN
    This PR (8367) - mean (1,082ms)  : 985, 1179
    master - mean (1,130ms)  : 1024, 1236


pr-commenter Bot commented Mar 25, 2026

Benchmarks

Benchmark execution time: 2026-05-08 01:42:34

Comparing candidate commit 2a37cf3 in PR branch sameerank/FFL-1946/add-flag-eval-metrics with baseline commit e2f79f8 in branch master.

Some scenarios are present only in baseline or only in candidate runs. If you didn't create or remove some scenarios in your branch, this may be a sign of crashed benchmarks 💥💥💥
Check the GitLab CI job log to find out whether any benchmark has crashed.

Scenarios present only in baseline:

  • Benchmarks.OpenTelemetry.Api.Trace.ActivityBenchmark.StartSpan_UpdateName_Sampled net472
  • Benchmarks.OpenTelemetry.Api.Trace.TelemetrySpanBenchmark.StartSpan netcoreapp3.1
  • Benchmarks.OpenTelemetry.Api.Trace.ActivityBenchmark.StartSpan net6.0
  • Benchmarks.OpenTelemetry.Api.Trace.TelemetrySpanBenchmark.StartSpan net472
  • Benchmarks.OpenTelemetry.Api.Trace.ActivityBenchmark.StartSpan_AddEvent_Sampled net472
  • Benchmarks.OpenTelemetry.Api.Trace.TelemetrySpanBenchmark.StartSpan_AddEvent_Sampled net6.0
  • Benchmarks.OpenTelemetry.Api.Trace.TelemetrySpanBenchmark.StartSpan_RecordException_Sampled netcoreapp3.1
  • Benchmarks.OpenTelemetry.Api.Trace.ActivityBenchmark.StartSpan_AddEvent_Sampled net6.0
  • Benchmarks.OpenTelemetry.Api.Trace.TelemetrySpanBenchmark.StartSpan_UpdateName_Sampled net472
  • Benchmarks.OpenTelemetry.Api.Trace.ActivityBenchmark.StartSpan_GetContext_Sampled net6.0
  • Benchmarks.OpenTelemetry.Api.Trace.TelemetrySpanBenchmark.StartSpan_SetStatus_Sampled net472
  • Benchmarks.OpenTelemetry.Api.Trace.ActivityBenchmark.StartSpan_SetStatus_Sampled net6.0
  • Benchmarks.OpenTelemetry.Api.Trace.TelemetrySpanBenchmark.StartSpan_SetAttributes_Sampled net472
  • Benchmarks.OpenTelemetry.Api.Trace.ActivityBenchmark.StartSpan_GetContext_Sampled netcoreapp3.1
  • Benchmarks.OpenTelemetry.Api.Trace.ActivityBenchmark.StartSpan netcoreapp3.1
  • Benchmarks.OpenTelemetry.Api.Trace.TelemetrySpanBenchmark.StartSpan_SetStatus_Sampled netcoreapp3.1
  • Benchmarks.OpenTelemetry.Api.Trace.TelemetrySpanBenchmark.StartSpan_SetAttributes_Sampled net6.0
  • Benchmarks.OpenTelemetry.Api.Trace.TelemetrySpanBenchmark.StartSpan_RecordException_Sampled net6.0
  • Benchmarks.OpenTelemetry.Api.Trace.ActivityBenchmark.StartSpan_SetAttributes_Sampled net472
  • Benchmarks.OpenTelemetry.Api.Trace.ActivityBenchmark.StartSpan_SetStatus_Sampled netcoreapp3.1
  • Benchmarks.OpenTelemetry.Api.Trace.TelemetrySpanBenchmark.StartSpan_AddEvent_Sampled netcoreapp3.1
  • Benchmarks.OpenTelemetry.Api.Trace.TelemetrySpanBenchmark.StartSpan_SetAttributes_Sampled netcoreapp3.1
  • Benchmarks.OpenTelemetry.Api.Trace.TelemetrySpanBenchmark.StartSpan_SetStatus_Sampled net6.0
  • Benchmarks.OpenTelemetry.Api.Trace.TelemetrySpanBenchmark.StartSpan_GetContext_Sampled net6.0
  • Benchmarks.OpenTelemetry.Api.Trace.ActivityBenchmark.StartSpan_SetAttributes_Sampled netcoreapp3.1
  • Benchmarks.OpenTelemetry.Api.Trace.TelemetrySpanBenchmark.StartSpan_UpdateName_Sampled netcoreapp3.1
  • Benchmarks.OpenTelemetry.Api.Trace.ActivityBenchmark.StartSpan_SetStatus_Sampled net472
  • Benchmarks.OpenTelemetry.Api.Trace.ActivityBenchmark.StartSpan_GetContext_Sampled net472
  • Benchmarks.OpenTelemetry.Api.Trace.TelemetrySpanBenchmark.StartSpan_GetContext_Sampled net472
  • Benchmarks.OpenTelemetry.Api.Trace.ActivityBenchmark.StartSpan_AddEvent_Sampled netcoreapp3.1
  • Benchmarks.OpenTelemetry.Api.Trace.ActivityBenchmark.StartSpan_UpdateName_Sampled netcoreapp3.1
  • Benchmarks.OpenTelemetry.Api.Trace.TelemetrySpanBenchmark.StartSpan_AddEvent_Sampled net472
  • Benchmarks.OpenTelemetry.Api.Trace.TelemetrySpanBenchmark.StartSpan_UpdateName_Sampled net6.0
  • Benchmarks.OpenTelemetry.Api.Trace.ActivityBenchmark.StartSpan_SetAttributes_Sampled net6.0
  • Benchmarks.OpenTelemetry.Api.Trace.ActivityBenchmark.StartSpan net472
  • Benchmarks.OpenTelemetry.Api.Trace.TelemetrySpanBenchmark.StartSpan_GetContext_Sampled netcoreapp3.1
  • Benchmarks.OpenTelemetry.Api.Trace.TelemetrySpanBenchmark.StartSpan net6.0
  • Benchmarks.OpenTelemetry.Api.Trace.ActivityBenchmark.StartSpan_UpdateName_Sampled net6.0
  • Benchmarks.OpenTelemetry.Api.Trace.TelemetrySpanBenchmark.StartSpan_RecordException_Sampled net472

Found 3 performance improvements and 3 performance regressions! Performance is the same for 48 metrics, 18 unstable metrics, 87 known flaky benchmarks, 39 flaky benchmarks without significant changes.

Explanation

This is an A/B test comparing a candidate commit's performance against that of a baseline commit. Performance changes are noted in the tables below as:

  • 🟩 = significantly better candidate vs. baseline
  • 🟥 = significantly worse candidate vs. baseline

We compute a confidence interval (CI) over the relative difference of means between metrics from the candidate and baseline commits, considering the baseline as the reference.

If the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD), the change is considered significant.

Feel free to reach out to #apm-benchmarking-platform on Slack if you have any questions.

More details about the CI and significant changes

You can imagine this CI as a range of values that is likely to contain the true difference of means between the candidate and baseline commits.

CIs of the difference of means are often centered around 0%, because most changes are small:

---------------------------------(------|---^--------)-------------------------------->
                              -0.6%    0%  0.3%     +1.2%
                                 |          |        |
         lower bound of the CI --'          |        |
sample mean (center of the CI) -------------'        |
         upper bound of the CI ----------------------'

As described above, a change is considered significant if the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD).

For instance, for an execution time metric, this confidence interval indicates a significantly worse performance:

----------------------------------------|---------|---(---------^---------)---------->
                                       0%        1%  1.3%      2.2%      3.1%
                                                  |   |         |         |
       significant impact threshold --------------'   |         |         |
                      lower bound of CI --------------'         |         |
       sample mean (center of the CI) --------------------------'         |
                      upper bound of CI ----------------------------------'
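The rule illustrated by the two diagrams can be written as a short check. This is a sketch under the assumption that the threshold is applied symmetrically around 0%; `is_significant` is a hypothetical name, and the values are the percentages shown in the diagrams.

```python
def is_significant(ci_low: float, ci_high: float, threshold: float) -> bool:
    """Return True when the whole CI lies outside the band [-threshold, +threshold].

    All arguments are relative differences in percent, with the baseline as reference.
    """
    entirely_above = ci_low > threshold    # e.g. slower execution time
    entirely_below = ci_high < -threshold  # e.g. faster execution time
    return entirely_above or entirely_below


# Second diagram: CI [1.3%, 3.1%] against a 1% threshold.
print(is_significant(1.3, 3.1, 1.0))
# First diagram: CI [-0.6%, 1.2%] straddles 0%.
print(is_significant(-0.6, 1.2, 1.0))
```

Whether a significant change counts as better or worse depends on the metric: a CI entirely above the threshold is a regression for execution_time but an improvement for throughput.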

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TracerBenchmark.StartActiveSpan net6.0

  • 🟥 allocated_mem [+143 bytes; +144 bytes] or [+10.108%; +10.117%]

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TracerBenchmark.StartRootSpan net6.0

  • 🟩 allocated_mem [-144 bytes; -143 bytes] or [-9.188%; -9.180%]

scenario:Benchmarks.Trace.DbCommandBenchmark.ExecuteNonQuery net6.0

  • 🟥 throughput [-70905.271op/s; -41014.848op/s] or [-13.453%; -7.782%]
  • 🟩 execution_time [-114.901ms; -108.817ms] or [-57.590%; -54.540%]

scenario:Benchmarks.Trace.HttpClientBenchmark.SendAsync net6.0

  • 🟥 throughput [-62717.339op/s; -56835.730op/s] or [-42.605%; -38.609%]
  • 🟩 execution_time [-51.921ms; -39.180ms] or [-25.766%; -19.443%]

Known flaky benchmarks

These benchmarks are marked as flaky and will not trigger a failure. Modify FLAKY_BENCHMARKS_REGEX to control which benchmarks are marked as flaky.

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan net6.0

  • unstable execution_time [-13.758ms; +9.441ms] or [-9.819%; +6.738%]
  • unstable throughput [-3049.234op/s; +19842.477op/s] or [-1.766%; +11.495%]

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan netcoreapp3.1

  • unstable execution_time [-33.021ms; -8.623ms] or [-29.232%; -7.634%]

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_AddEvent_Sampled net6.0

  • unstable execution_time [-50.576ms; -15.090ms] or [-29.362%; -8.761%]
  • unstable throughput [-7196.069op/s; +14789.682op/s] or [-5.100%; +10.482%]

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_GetContext_Sampled netcoreapp3.1

  • unstable execution_time [-37.139ms; -13.374ms] or [-33.315%; -11.997%]

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_SetAttributes_Sampled net6.0

  • unstable execution_time [-11.005ms; +7.880ms] or [-7.216%; +5.167%]
  • unstable throughput [-7091.236op/s; +7199.573op/s] or [-5.462%; +5.546%]

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_SetAttributes_Sampled netcoreapp3.1

  • unstable execution_time [-40.073ms; -1.097ms] or [-29.590%; -0.810%]

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_SetStatus_Sampled net6.0

  • unstable execution_time [-16.288ms; +7.978ms] or [-11.672%; +5.717%]
  • unstable throughput [-8285.144op/s; +14996.172op/s] or [-4.708%; +8.521%]

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_SetStatus_Sampled netcoreapp3.1

  • unstable execution_time [-32.994ms; -7.803ms] or [-29.639%; -7.009%]

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_UpdateName_Sampled net6.0

  • unstable execution_time [-10.605ms; +27.039ms] or [-7.469%; +19.043%]
  • unstable throughput [-27251.970op/s; -2002.874op/s] or [-14.324%; -1.053%]

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_UpdateName_Sampled netcoreapp3.1

  • unstable execution_time [+8.011ms; +32.945ms] or [+8.925%; +36.704%]

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan net6.0

  • 🟩 allocated_mem [-96 bytes; -95 bytes] or [-6.126%; -6.119%]
  • unstable throughput [-38826.600op/s; -15881.883op/s] or [-18.835%; -7.704%]

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan netcoreapp3.1

  • unstable execution_time [-18027.319µs; +17709.347µs] or [-16.213%; +15.927%]

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_AddEvent_Sampled net6.0

  • 🟥 allocated_mem [+143 bytes; +144 bytes] or [+8.332%; +8.340%]
  • unstable execution_time [-60.212ms; -23.499ms] or [-32.897%; -12.839%]
  • unstable throughput [+8476.042op/s; +31561.865op/s] or [+6.904%; +25.707%]

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_AddEvent_Sampled netcoreapp3.1

  • unstable execution_time [+2.300ms; +42.441ms] or [+2.107%; +38.890%]

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_GetContext_Sampled net6.0

  • unstable execution_time [+24.336ms; +50.245ms] or [+17.697%; +36.538%]
  • unstable throughput [-18004.688op/s; +1810.495op/s] or [-11.487%; +1.155%]

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_GetContext_Sampled netcoreapp3.1

  • unstable execution_time [-31.494ms; -6.557ms] or [-29.100%; -6.058%]

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_RecordException_Sampled net6.0

  • 🟩 allocated_mem [-144 bytes; -143 bytes] or [-5.210%; -5.202%]
  • unstable execution_time [+11.192ms; +52.342ms] or [+8.259%; +38.626%]
  • unstable throughput [-29127.137op/s; -2772.182op/s] or [-26.001%; -2.475%]

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_RecordException_Sampled netcoreapp3.1

  • unstable execution_time [-15.722ms; +19.703ms] or [-14.335%; +17.964%]

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_SetAttributes_Sampled netcoreapp3.1

  • unstable execution_time [+1.675ms; +39.594ms] or [+1.459%; +34.469%]

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_SetStatus_Sampled net6.0

  • 🟩 allocated_mem [-144 bytes; -143 bytes] or [-8.741%; -8.734%]
  • unstable execution_time [-4.109ms; +32.806ms] or [-2.867%; +22.890%]
  • unstable throughput [-13103.926op/s; +10465.669op/s] or [-9.379%; +7.491%]

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_UpdateName_Sampled net6.0

  • unstable execution_time [-15.827ms; +19.560ms] or [-10.366%; +12.810%]
  • unstable throughput [-14370.237op/s; +12257.478op/s] or [-9.200%; +7.847%]

scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_UpdateName_Sampled netcoreapp3.1

  • unstable execution_time [-42.010ms; -2.295ms] or [-31.420%; -1.717%]

scenario:Benchmarks.Trace.ActivityBenchmark.StartStopWithChild net472

  • 🟥 throughput [-9804.460op/s; -8834.499op/s] or [-11.625%; -10.475%]

scenario:Benchmarks.Trace.ActivityBenchmark.StartStopWithChild net6.0

  • unstable execution_time [-66.811ms; -33.305ms] or [-33.332%; -16.616%]
  • unstable throughput [-46474.778op/s; -31226.426op/s] or [-39.064%; -26.247%]

scenario:Benchmarks.Trace.ActivityBenchmark.StartStopWithChild netcoreapp3.1

  • unstable execution_time [-73.544ms; -45.732ms] or [-36.990%; -23.002%]

scenario:Benchmarks.Trace.AgentWriterBenchmark.WriteAndFlushEnrichedTraces net472

  • 🟥 execution_time [+310.881ms; +326.314ms] or [+154.270%; +161.928%]
  • 🟥 throughput [-56.745op/s; -43.946op/s] or [-10.209%; -7.907%]

scenario:Benchmarks.Trace.AgentWriterBenchmark.WriteAndFlushEnrichedTraces net6.0

  • 🟥 execution_time [+97.099ms; +101.171ms] or [+76.714%; +79.931%]
  • 🟩 throughput [+86.407op/s; +99.496op/s] or [+11.392%; +13.118%]

scenario:Benchmarks.Trace.AgentWriterBenchmark.WriteAndFlushEnrichedTraces netcoreapp3.1

  • 🟥 execution_time [+83.126ms; +84.456ms] or [+73.563%; +74.740%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleMoreComplexBody net472

  • 🟥 allocated_mem [+1.308KB; +1.308KB] or [+27.528%; +27.540%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleMoreComplexBody net6.0

  • 🟥 allocated_mem [+439 bytes; +440 bytes] or [+9.299%; +9.310%]
  • 🟩 execution_time [-64.175ms; -45.050ms] or [-29.972%; -21.040%]
  • 🟥 throughput [-21045.464op/s; -7789.589op/s] or [-15.362%; -5.686%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleMoreComplexBody netcoreapp3.1

  • 🟥 allocated_mem [+1.272KB; +1.272KB] or [+27.500%; +27.510%]
  • 🟩 execution_time [-31.155ms; -14.433ms] or [-14.836%; -6.873%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleSimpleBody net472

  • 🟥 allocated_mem [+1.307KB; +1.307KB] or [+105.743%; +105.758%]
  • 🟥 throughput [-265489.186op/s; -260242.193op/s] or [-27.108%; -26.572%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleSimpleBody net6.0

  • 🟥 allocated_mem [+439 bytes; +440 bytes] or [+35.945%; +35.954%]
  • 🟩 execution_time [-115.506ms; -110.465ms] or [-51.511%; -49.263%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleSimpleBody netcoreapp3.1

  • 🟥 allocated_mem [+1.272KB; +1.272KB] or [+105.288%; +105.304%]
  • 🟩 execution_time [-89.143ms; -84.871ms] or [-44.496%; -42.364%]
  • 🟥 throughput [-130115.668op/s; -112805.030op/s] or [-18.695%; -16.208%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorMoreComplexBody net6.0

  • unstable execution_time [-43.734ms; -23.657ms] or [-22.066%; -11.936%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorMoreComplexBody netcoreapp3.1

  • unstable execution_time [-50.290ms; -30.439ms] or [-25.640%; -15.519%]
  • 🟩 throughput [+10683.306op/s; +13354.415op/s] or [+8.511%; +10.639%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorSimpleBody net6.0

  • 🟩 execution_time [-76.769ms; -75.311ms] or [-37.957%; -37.236%]
  • 🟩 throughput [+441942.729op/s; +460370.867op/s] or [+14.736%; +15.351%]

scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorSimpleBody netcoreapp3.1

  • unstable execution_time [-63.887ms; -39.151ms] or [-29.450%; -18.047%]
  • 🟩 throughput [+143329.286op/s; +204069.297op/s] or [+5.689%; +8.100%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeArgs net472

  • 🟥 execution_time [+301.612ms; +316.329ms] or [+150.705%; +158.059%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeArgs net6.0

  • unstable execution_time [+148.057ms; +186.357ms] or [+74.665%; +93.980%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeArgs netcoreapp3.1

  • unstable execution_time [+250.712ms; +291.622ms] or [+126.289%; +146.897%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeLegacyArgs net472

  • 🟥 execution_time [+297.902ms; +312.110ms] or [+146.318%; +153.296%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeLegacyArgs net6.0

  • 🟥 execution_time [+228.052ms; +235.679ms] or [+111.486%; +115.215%]

scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeLegacyArgs netcoreapp3.1

  • 🟥 execution_time [+276.424ms; +292.322ms] or [+138.156%; +146.102%]

scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmarkWithAttack net6.0

  • 🟥 execution_time [+18.857µs; +42.452µs] or [+6.020%; +13.553%]
  • 🟥 throughput [-399.002op/s; -200.409op/s] or [-12.438%; -6.247%]

scenario:Benchmarks.Trace.AspNetCoreBenchmark.SendRequest net472

  • 🟥 execution_time [+299.849ms; +300.572ms] or [+149.655%; +150.016%]

scenario:Benchmarks.Trace.AspNetCoreBenchmark.SendRequest net6.0

  • unstable execution_time [+360.012ms; +372.259ms] or [+391.168%; +404.474%]
  • 🟥 throughput [-6896.192op/s; -6730.512op/s] or [-56.667%; -55.306%]

scenario:Benchmarks.Trace.AspNetCoreBenchmark.SendRequest netcoreapp3.1

  • unstable execution_time [+283.874ms; +342.268ms] or [+215.543%; +259.881%]
  • 🟥 throughput [-1212.086op/s; -996.289op/s] or [-11.734%; -9.645%]

scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces net472

  • 🟥 execution_time [+305.801ms; +320.302ms] or [+140.604%; +147.272%]
  • 🟥 throughput [-688.856op/s; -672.181op/s] or [-62.417%; -60.906%]

scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces net6.0

  • unstable execution_time [-47.530ms; +86.631ms] or [-20.255%; +36.919%]
  • 🟥 throughput [-719.345op/s; -624.653op/s] or [-47.980%; -41.665%]

scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces netcoreapp3.1

  • 🟥 allocated_mem [+2.305KB; +2.308KB] or [+5.442%; +5.450%]
  • 🟥 execution_time [+340.902ms; +349.198ms] or [+203.899%; +208.861%]
  • 🟥 throughput [-734.520op/s; -700.141op/s] or [-51.144%; -48.750%]

scenario:Benchmarks.Trace.CharSliceBenchmark.OriginalCharSlice net6.0

  • 🟩 execution_time [-220.967µs; -213.042µs] or [-11.193%; -10.792%]
  • 🟩 throughput [+61.335op/s; +63.854op/s] or [+12.108%; +12.605%]

scenario:Benchmarks.Trace.CharSliceBenchmark.OriginalCharSlice netcoreapp3.1

  • 🟩 execution_time [-230.842µs; -221.370µs] or [-5.854%; -5.614%]
  • 🟩 throughput [+15.093op/s; +15.765op/s] or [+5.952%; +6.217%]

scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearch net472

  • 🟥 execution_time [+307.100ms; +321.457ms] or [+154.650%; +161.880%]

scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearch net6.0

  • unstable execution_time [+183.933ms; +237.107ms] or [+92.169%; +118.815%]

scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearch netcoreapp3.1

  • 🟥 execution_time [+300.276ms; +307.520ms] or [+150.846%; +154.485%]

scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearchAsync net472

  • 🟥 execution_time [+305.386ms; +319.248ms] or [+153.354%; +160.316%]

scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearchAsync net6.0

  • unstable execution_time [+212.942ms; +236.578ms] or [+105.290%; +116.977%]

scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearchAsync netcoreapp3.1

  • 🟥 execution_time [+307.395ms; +314.640ms] or [+155.801%; +159.473%]

scenario:Benchmarks.Trace.GraphQLBenchmark.ExecuteAsync net472

  • 🟥 execution_time [+304.172ms; +318.438ms] or [+152.667%; +159.827%]

scenario:Benchmarks.Trace.GraphQLBenchmark.ExecuteAsync net6.0

  • unstable execution_time [+127.751ms; +220.138ms] or [+63.672%; +109.718%]
  • 🟩 throughput [+25585.356op/s; +35715.236op/s] or [+5.080%; +7.092%]

scenario:Benchmarks.Trace.GraphQLBenchmark.ExecuteAsync netcoreapp3.1

  • 🟥 execution_time [+302.587ms; +309.332ms] or [+150.534%; +153.890%]

scenario:Benchmarks.Trace.ILoggerBenchmark.EnrichedLog net6.0

  • unstable execution_time [-48.615ms; -25.309ms] or [-22.606%; -11.769%]
  • unstable throughput [-170326.860op/s; -132788.113op/s] or [-46.725%; -36.427%]

scenario:Benchmarks.Trace.ILoggerBenchmark.EnrichedLog netcoreapp3.1

  • unstable execution_time [-52.256ms; -25.059ms] or [-26.212%; -12.570%]

scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatAspectBenchmark net6.0

  • 🟩 allocated_mem [-25.182KB; -25.159KB] or [-9.186%; -9.177%]
  • unstable execution_time [-67.343µs; -11.186µs] or [-13.310%; -2.211%]
  • unstable throughput [+62.400op/s; +265.681op/s] or [+3.114%; +13.258%]

scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatAspectBenchmark netcoreapp3.1

  • 🟩 execution_time [-88.676µs; -32.505µs] or [-15.367%; -5.633%]
  • 🟩 throughput [+118.709op/s; +272.072op/s] or [+6.782%; +15.544%]

scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatBenchmark net6.0

  • unstable execution_time [+3.721µs; +8.534µs] or [+8.795%; +20.171%]
  • 🟥 throughput [-3843.754op/s; -1870.085op/s] or [-16.181%; -7.872%]

scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatBenchmark netcoreapp3.1

  • unstable execution_time [-13.952µs; -4.454µs] or [-21.645%; -6.911%]
  • unstable throughput [+1405.410op/s; +3392.230op/s] or [+8.623%; +20.812%]

scenario:Benchmarks.Trace.Log4netBenchmark.EnrichedLog net472

  • 🟥 execution_time [+304.269ms; +318.258ms] or [+153.795%; +160.865%]

scenario:Benchmarks.Trace.Log4netBenchmark.EnrichedLog net6.0

  • 🟥 execution_time [+302.780ms; +305.610ms] or [+154.114%; +155.555%]

scenario:Benchmarks.Trace.Log4netBenchmark.EnrichedLog netcoreapp3.1

  • 🟥 execution_time [+302.144ms; +307.708ms] or [+151.260%; +154.046%]

scenario:Benchmarks.Trace.RedisBenchmark.SendReceive net6.0

  • unstable execution_time [-35.585ms; -10.017ms] or [-17.787%; -5.007%]
  • unstable throughput [-261391.986op/s; -204360.034op/s] or [-49.476%; -38.681%]

scenario:Benchmarks.Trace.RedisBenchmark.SendReceive netcoreapp3.1

  • unstable execution_time [-74.777ms; -45.374ms] or [-37.904%; -23.000%]

scenario:Benchmarks.Trace.SerilogBenchmark.EnrichedLog net472

  • 🟥 execution_time [+298.838ms; +313.046ms] or [+148.944%; +156.025%]

scenario:Benchmarks.Trace.SerilogBenchmark.EnrichedLog net6.0

  • 🟥 execution_time [+304.340ms; +310.650ms] or [+152.825%; +155.994%]

scenario:Benchmarks.Trace.SerilogBenchmark.EnrichedLog netcoreapp3.1

  • 🟥 execution_time [+304.283ms; +311.506ms] or [+154.313%; +157.976%]

scenario:Benchmarks.Trace.SingleSpanAspNetCoreBenchmark.SingleSpanAspNetCore net472

  • 🟥 execution_time [+300.372ms; +300.998ms] or [+149.827%; +150.139%]
  • 🟩 throughput [+60026700.043op/s; +60298403.525op/s] or [+43.715%; +43.913%]

scenario:Benchmarks.Trace.SingleSpanAspNetCoreBenchmark.SingleSpanAspNetCore net6.0

  • unstable execution_time [+379.897ms; +391.214ms] or [+472.470%; +486.545%]
  • 🟥 throughput [-7430.899op/s; -7232.563op/s] or [-57.445%; -55.911%]

scenario:Benchmarks.Trace.SingleSpanAspNetCoreBenchmark.SingleSpanAspNetCore netcoreapp3.1

  • 🟥 execution_time [+303.442ms; +306.261ms] or [+151.350%; +152.756%]
  • 🟥 throughput [-30125389.542op/s; -28790958.329op/s] or [-13.344%; -12.753%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishScope net6.0

  • 🟩 execution_time [-100.243ms; -98.712ms] or [-49.097%; -48.347%]
  • 🟩 throughput [+67095.720op/s; +77952.639op/s] or [+6.265%; +7.278%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishScope netcoreapp3.1

  • unstable execution_time [-87.967ms; -66.757ms] or [-44.512%; -33.780%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishSpan net6.0

  • 🟩 execution_time [-89.749ms; -85.598ms] or [-46.760%; -44.597%]
  • 🟩 throughput [+81462.085op/s; +112197.637op/s] or [+6.305%; +8.684%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishSpan netcoreapp3.1

  • unstable execution_time [-54.725ms; -27.854ms] or [-26.887%; -13.685%]
  • 🟩 throughput [+76803.967op/s; +86756.225op/s] or [+7.628%; +8.616%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishTwoScopes net6.0

  • unstable execution_time [-72.907ms; -46.555ms] or [-36.409%; -23.249%]
  • unstable throughput [-103726.510op/s; -29590.064op/s] or [-18.835%; -5.373%]

scenario:Benchmarks.Trace.SpanBenchmark.StartFinishTwoScopes netcoreapp3.1

  • 🟩 execution_time [-97.803ms; -93.658ms] or [-49.140%; -47.057%]

scenario:Benchmarks.Trace.TraceAnnotationsBenchmark.RunOnMethodBegin net6.0

  • unstable execution_time [-82.415ms; -53.968ms] or [-41.222%; -26.993%]
  • unstable throughput [-207775.721op/s; -63017.816op/s] or [-23.214%; -7.041%]

scenario:Benchmarks.Trace.TraceAnnotationsBenchmark.RunOnMethodBegin netcoreapp3.1

  • unstable execution_time [-29.084ms; -5.924ms] or [-14.771%; -3.008%]

Known flaky benchmarks without significant changes:

  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_AddEvent_Sampled net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_AddEvent_Sampled netcoreapp3.1
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_GetContext_Sampled net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_GetContext_Sampled net6.0
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_SetAttributes_Sampled net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_SetStatus_Sampled net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.ActivityBenchmark.StartSpan_UpdateName_Sampled net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_AddEvent_Sampled net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_GetContext_Sampled net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_RecordException_Sampled net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_SetAttributes_Sampled net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_SetAttributes_Sampled net6.0
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_SetStatus_Sampled net472
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_SetStatus_Sampled netcoreapp3.1
  • scenario:Benchmarks.OpenTelemetry.InstrumentedApi.Trace.TelemetrySpanBenchmark.StartSpan_UpdateName_Sampled net472
  • scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorMoreComplexBody net472
  • scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorSimpleBody net472
  • scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmark net472
  • scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmark net6.0
  • scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmark netcoreapp3.1
  • scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmarkWithAttack net472
  • scenario:Benchmarks.Trace.Asm.AppSecWafBenchmark.RunWafRealisticBenchmarkWithAttack netcoreapp3.1
  • scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSlice net472
  • scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSlice net6.0
  • scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSlice netcoreapp3.1
  • scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSliceWithPool net472
  • scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSliceWithPool net6.0
  • scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSliceWithPool netcoreapp3.1
  • scenario:Benchmarks.Trace.CharSliceBenchmark.OriginalCharSlice net472
  • scenario:Benchmarks.Trace.ILoggerBenchmark.EnrichedLog net472
  • scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatAspectBenchmark net472
  • scenario:Benchmarks.Trace.Iast.StringAspectsBenchmark.StringConcatBenchmark net472
  • scenario:Benchmarks.Trace.RedisBenchmark.SendReceive net472
  • scenario:Benchmarks.Trace.SpanBenchmark.StartFinishScope net472
  • scenario:Benchmarks.Trace.SpanBenchmark.StartFinishSpan net472
  • scenario:Benchmarks.Trace.SpanBenchmark.StartFinishTwoScopes net472
  • scenario:Benchmarks.Trace.TraceAnnotationsBenchmark.RunOnMethodBegin net472

@sameerank sameerank force-pushed the sameerank/FFL-1946/add-flag-eval-metrics branch 2 times, most recently from a73acc0 to ea7bc67 Compare March 25, 2026 15:27
@sameerank sameerank force-pushed the sameerank/FFL-1946/add-flag-eval-metrics branch from ea7bc67 to 8865c3f Compare April 3, 2026 19:33
Comment on lines +16 to +17
<!-- OpenFeature 2.3.0 has transitive dependencies that warn about net6.0 support -->
<SuppressTfmSupportBuildWarnings>true</SuppressTfmSupportBuildWarnings>

Contributor Author

We needed to upgrade OpenFeature from 2.0.0 to 2.3.0 because 2.3.0 changed the FinallyAsync hook signature to include FlagEvaluationDetails<T>, which is needed to record flag evaluation metrics. This also matches how it was implemented in the Go SDK.

| OpenFeature version | FinallyAsync signature |
| --- | --- |
| 2.0.0 | ValueTask FinallyAsync<T>(HookContext<T> context, ...) |
| 2.3.0 | ValueTask FinallyAsync<T>(HookContext<T> context, FlagEvaluationDetails<T> details, ...) |

I needed to suppress the warning to get the bump_package_versions CI to pass, and given the presence of SuppressTfmSupportBuildWarnings in other places in the repo I'm guessing this is acceptable. The warning comes from transitive dependencies (Microsoft.Extensions.*, System.Collections.Immutable, etc.) that ship .NET 9.0 versions and emit a build-time warning saying they haven't been "tested" with net6.0.

Member

As mentioned in the comment above, the TargetFrameworks used here should match the version used by the referenced OpenFeature version, so adding SuppressTfmSupportBuildWarnings here likely isn't the right fix. Instead, you should make (breaking) changes to the TargetFrameworks.

However, I would really try to work out if you can avoid bumping the OpenFeature version. This is a breaking change (as it drops support for TFMs in exactly the way you're seeing here). Is there another approach we could use? Should we instead use progressive enhancement to add these hooks?

What is the documented support policy here? Because breaking changes like this are generally not how we update things in the core tracer.

Contributor Author

Since I'm currently leaning towards accepting the breaking change in OpenFeature https://github.com/DataDog/dd-trace-dotnet/pull/8367/changes#r3075858040, I updated the TFMs to match: net462;netstandard2.0;net8.0;net9.0. This drops net6.0 (EOL since Nov 2024) and adds net9.0. Removed SuppressTfmSupportBuildWarnings. Customers on .NET 6 can't use OpenFeature 2.3.0 anyway, so this aligns with their supported platforms.

f2faacd

@sameerank sameerank force-pushed the sameerank/FFL-1946/add-flag-eval-metrics branch 4 times, most recently from b2040ea to c448add Compare April 7, 2026 15:27
sameerank added a commit to DataDog/system-tests that referenced this pull request Apr 7, 2026
Remove irrelevant skips for:
- Test_FFE_Eval_Metric_Parse_Error_Invalid_Regex
- Test_FFE_Eval_No_Config_Loaded

The .NET SDK uses a managed evaluator (not libdatadog), and now
correctly returns PARSE_ERROR for invalid regex patterns and
PROVIDER_NOT_READY when no config is loaded.

Related: DataDog/dd-trace-dotnet#8367
@sameerank sameerank force-pushed the sameerank/FFL-1946/add-flag-eval-metrics branch from c448add to 9c81be3 Compare April 7, 2026 16:33
Update test assertions to match semantic evaluation reason logic:
- CreateSimpleFlag has shards → expect Split (not TargetingMatch)
- CreateExposureFlag has no rules/shards → expect Static
- CreateTimeBasedFlagWithDates has no rules/shards → expect Static

The evaluation reason logic is:
- hasShards → Split
- hasRules → TargetingMatch
- neither → Static

This aligns with system-tests fixtures and OpenFeature conventions.
- Wrap ArgumentException from invalid regex patterns in FormatException
  to produce PARSE_ERROR instead of GENERAL error
- Return "PROVIDER_NOT_READY" error string when evaluator is null for
  proper OpenFeature error mapping
- Handle RC reset (empty config list) by clearing evaluator to trigger
  PROVIDER_NOT_READY on subsequent evaluations
- Add unit test for invalid regex PARSE_ERROR behavior
- Error property now contains human-readable messages for debugging
- FlagMetadata["errorCode"] contains OpenFeature codes for programmatic handling
- ToErrorType() reads from metadata instead of error string
- Updated unit tests to verify both error and errorCode

This pattern is consistent with how Go and Python SDKs handle errors.
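
The invalid-regex handling described in the commit notes above can be sketched roughly as follows. This is a minimal illustration and not the evaluator's actual code: `PatternCompiler` and `Compile` are hypothetical names. The only standard .NET behavior relied on is that the `Regex` constructor throws an `ArgumentException` for a pattern it cannot parse.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative and not taken from the PR.
public static class PatternCompiler
{
    public static Regex Compile(string pattern)
    {
        try
        {
            return new Regex(pattern, RegexOptions.CultureInvariant);
        }
        catch (ArgumentException ex)
        {
            // Re-throw as FormatException so a caller can map the failure
            // to PARSE_ERROR instead of the catch-all GENERAL error.
            throw new FormatException($"Invalid regex pattern '{pattern}'", ex);
        }
    }
}
```

Keeping the original exception as `InnerException` preserves the parser's message for debugging while the outer type drives the error-code mapping.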
@sameerank sameerank force-pushed the sameerank/FFL-1946/add-flag-eval-metrics branch 2 times, most recently from ef1e68e to 56e2d16 Compare April 7, 2026 20:44
- Add ToOpenFeatureReason() to map our enum to OpenFeature Reason constants
- Simplify ReasonToString() to just ToLowerInvariant() since OpenFeature
  already uses UPPER_SNAKE_CASE format
- Fix test assertions to check human-readable Error messages and
  errorCode in FlagMetadata separately
@sameerank sameerank force-pushed the sameerank/FFL-1946/add-flag-eval-metrics branch from 56e2d16 to 136880a Compare April 7, 2026 20:45
The Datadog.FeatureFlags.OpenFeature package requires OpenFeature >= 2.3.0
for the Reason constants and FinallyAsync with details parameter used in
the flag evaluation metrics hook.

Update PackageVersionsGeneratorDefinitions.json:
- MinVersion: 2.0.0 -> 2.3.0
- SpecificVersions: Replace 2.0.0 with 2.3.0
@sameerank sameerank force-pushed the sameerank/FFL-1946/add-flag-eval-metrics branch from 439210c to 49262b1 Compare April 7, 2026 22:32
Run ./tracer/build.sh GeneratePackageVersions to update the generated
.props and .cs files after changing MinVersion in
PackageVersionsGeneratorDefinitions.json.

See docs/development/AutomaticInstrumentation.md for the process.
  defaultValue,
  EvaluationReason.Error,
- error: "PROVIDER_NOT_READY",
+ error: "No config loaded",

Contributor Author

Using human readable messages here is consistent with the other SDKs:

  1. Error property → Human-readable message for logging/debugging
  2. FlagMetadata["errorCode"] → OpenFeature error code for programmatic handling

The Go SDK uses the same pattern - human-readable Go errors that get mapped to OpenFeature ResolutionError

And the Python SDK also has a human-readable message alongside an error code.

There remains some small work to do, but I think this is overall a step towards being more consistent.

| OpenFeature Code | .NET Human-Readable | Go Human-Readable | Python Human-Readable |
| --- | --- | --- | --- |
| PROVIDER_NOT_READY | "No config loaded" | N/A (different flow) | "No FFE configuration loaded" |
| FLAG_NOT_FOUND | "Flag not found" | "flag not found" | "Flag not found" |
| TYPE_MISMATCH | "Type mismatch" | "type mismatch" | "Type mismatch" |
| PARSE_ERROR | (exception message) | (exception message) | (exception message) |
| TARGETING_KEY_MISSING | "Targeting key missing" | N/A | N/A |
| GENERAL | (exception message) | N/A | N/A |
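
As an illustration of the two-channel pattern in this comment (human-readable text in the error string, OpenFeature code under the `errorCode` metadata key), here is a minimal sketch. It assumes a plain dictionary for metadata; `EvalResult`, `NotReady`, and `ToErrorTypeTag` are hypothetical names, and the `"general"` fallback is an assumption of this sketch rather than confirmed SDK behavior.

```csharp
using System.Collections.Generic;

// Hypothetical result type; not the SDK's actual evaluation class.
public sealed class EvalResult
{
    // Human-readable message intended for logs and debugging.
    public string Error { get; }

    // Carries the OpenFeature error code under the "errorCode" key.
    public IReadOnlyDictionary<string, string> FlagMetadata { get; }

    public EvalResult(string error, string errorCode)
    {
        Error = error;
        FlagMetadata = new Dictionary<string, string> { ["errorCode"] = errorCode };
    }

    public static EvalResult NotReady() =>
        new EvalResult("No config loaded", "PROVIDER_NOT_READY");

    // Reads the code from metadata (never from the message) and lowercases it
    // for use as a metric tag. The "general" fallback is an assumption.
    public string ToErrorTypeTag() =>
        FlagMetadata.TryGetValue("errorCode", out var code)
            ? code.ToLowerInvariant()
            : "general";
}
```

Because the code travels separately from the message, the message wording can change per SDK without affecting programmatic handling.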


/// <inheritdoc/>
public override ValueTask FinallyAsync<T>(
HookContext<T> context,

Unhandled exception in hook

FinallyAsync calls _metrics.Record unconditionally; if Record throws (e.g. TagList overflow or counter disposed), the exception propagates out of the OpenFeature hook pipeline with no logging or recovery, silently breaking flag evaluation for the caller.

Consider wrapping the body of FinallyAsync in a try/catch that logs the exception at Debug level and returns default, so a metrics failure never surfaces to the application.

Contributor Author

Wrapped the FinallyAsync body: ae04267
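
The shape of that fix can be sketched without the OpenFeature types. This is a minimal sketch under stated assumptions: `SafeHook` and `RecordSafely` are hypothetical names, and the `Action` parameter stands in for the call to `_metrics.Record`. The real hook overrides `FinallyAsync` and is not reproduced here.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical stand-in for the hook body; not the code from ae04267.
public static class SafeHook
{
    public static ValueTask RecordSafely(Action record)
    {
        try
        {
            record();
        }
        catch (Exception ex)
        {
            // Metrics recording should never break flag evaluation.
            // Debug.WriteLine is compiled out of release builds.
            Debug.WriteLine($"[sketch] metrics recording failed: {ex}");
        }

        // A default ValueTask is already completed successfully.
        return default;
    }
}
```

A caller that passes a throwing delegate still receives a completed `ValueTask`, which is the property the review comment asks for.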

<Version>2.0.1</Version>
<!-- These target frameworks should match the values exposed in the OpenFeature package referenced below-->
- <TargetFrameworks>net462;netstandard2.0;net6.0;net8.0</TargetFrameworks>
+ <TargetFrameworks>net462;netstandard2.0;net8.0;net9.0</TargetFrameworks>

Missing test coverage for metrics hook

Consider adding tests for the un-tested logic in the hook — specifically the branching in FinallyAsync (error-type extraction, allocation-key extraction from metadata, unknown-reason fallback). A MeterListener or IMeterFactory test double could assert Counter<long> increments with expected TagList values.
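
For readers unfamiliar with the suggestion, a MeterListener-based check might look roughly like this. It is a sketch only: the meter name `Sketch.FeatureFlags`, the class name, and the tag values are placeholders, and the counter is created locally instead of going through the PR's `FlagEvalMetrics` class. The metric name, unit, and tag keys are the ones listed in the PR description.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Diagnostics.Metrics;

public static class MeterListenerSketch
{
    // Records one evaluation on a throwaway meter and returns what the listener observed.
    public static Dictionary<string, object?> CaptureOneEvaluation()
    {
        var captured = new Dictionary<string, object?>();
        using var meter = new Meter("Sketch.FeatureFlags"); // placeholder meter name
        var counter = meter.CreateCounter<long>("feature_flag.evaluations", unit: "{evaluation}");

        using var listener = new MeterListener();
        listener.InstrumentPublished = (instrument, l) =>
        {
            // Subscribe only to instruments from the meter created above.
            if (ReferenceEquals(instrument.Meter, meter))
            {
                l.EnableMeasurementEvents(instrument);
            }
        };
        listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) =>
        {
            captured["value"] = value;
            foreach (var tag in tags)
            {
                captured[tag.Key] = tag.Value;
            }
        });
        listener.Start();

        var tagList = new TagList();
        tagList.Add("feature_flag.key", "my-flag");
        tagList.Add("feature_flag.result.variant", "on");
        tagList.Add("feature_flag.result.reason", "split");

        // The listener callback runs synchronously inside Add.
        counter.Add(1, tagList);
        return captured;
    }
}
```

Filtering on the meter instance keeps the listener from picking up measurements emitted by unrelated meters in the same test process.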

Contributor Author

I ran into issues initially adding tests with MeterListener. Both Datadog.Trace and Datadog.FeatureFlags.OpenFeature define ValueType and EvaluationReason (via shared source files), causing CS0433 ambiguity errors when both are referenced in the test project Datadog.Trace.Tests.

error CS0433: The type 'ValueType' exists in both 'Datadog.FeatureFlags.OpenFeature' and 'Datadog.Trace'
error CS0433: The type 'EvaluationReason' exists in both 'Datadog.FeatureFlags.OpenFeature' and 'Datadog.Trace'

One workaround is to create a separate Datadog.FeatureFlags.OpenFeature.Tests project (without referencing Datadog.Trace.Tests), so I went ahead and did that: fafbc0d

Member

I'd rather we didn't create a new test project if possible, especially at this stage, as it generally slows down CI which is an issue we're having more and more 🙁

An alternative approach (which we use for other tests) is to specify an alias when referencing the project, like this:

https://github.com/DataDog/dd-trace-dotnet/blob/master/tracer/test/Datadog.Trace.Tests/Datadog.Trace.Tests.csproj#L44

and then you can reference it like this:

using ManualTracer = DatadogTraceManual::Datadog.Trace.Tracer;

@sameerank sameerank May 7, 2026

Contributor Author

I've decided to remove the tests

  • They're minimally valuable: The 3 unit tests cover simple transformations (null checks, a switch statement).
  • They're challenging to set up correctly: The proposed alias approach caused dependency conflicts: OpenFeature 2.3.0 transitively pulls in Microsoft.Extensions.* 9.0.0, which conflicts with the existing Microsoft.AspNetCore 2.2.0 packages in Datadog.Trace.Tests (CS0433 type ambiguity errors).
  • And we already have equivalent test coverage with test_flag_eval_metrics in system tests
    • tests/ffe/test_flag_eval_metrics.py::Test_FFE_Eval_Reason_* - tests reason values
    • tests/ffe/test_flag_eval_metrics.py::Test_FFE_Eval_Error_* - tests error type extraction
    • Allocation key extraction is tested via the full metrics pipeline

Comment thread tracer/src/Datadog.Trace/FeatureFlags/FeatureFlagsModule.cs
Comment thread tracer/src/Datadog.Trace/FeatureFlags/FeatureFlagsEvaluator.cs

@typotter typotter left a comment

generally, LGTM. Couple of missing test suggestions and an issue with the SPLIT reason not taking precedence.

Comment thread tracer/src/Datadog.Trace/FeatureFlags/FeatureFlagsEvaluator.cs
sameerank added 5 commits May 5, 2026 23:05
- Create dedicated test project to avoid CS0433 type conflicts
- Test error-type extraction, allocation-key extraction, unknown-reason fallback
- Uses MeterListener to verify counter increments with expected tags
Addresses typotter's review feedback (Comment 3) on PR #8367:
- Tests that RegisterOnNewConfigEventHandler callback is invoked when config is removed
- Tests that Evaluate returns PROVIDER_NOT_READY after RC reset

Uses RcmSubscriptionManagerMock to simulate remote config changes.

@andrewlock andrewlock left a comment

Member

Thanks for the updates - I have a few minor suggestions, but the main one is to move the FlagEvalMetricsHookTests back into Datadog.Trace.Tests, and use aliases to avoid the ambiguity if required 🙂

Comment on lines -971 to 975
- "MinVersion": "2.0.0",
+ "MinVersion": "2.3.0",
  "MaxVersionExclusive": "3.0.0",
  "SpecificVersions": [
- "2.0.0",
+ "2.3.0",
  "2.10.0"

Member

I'm fine with doing 3 as long as we're explicit about it, which we're being here, so I think that's ok 🙂 And also a good point about being experimental 👍

Comment thread tracer/src/Datadog.FeatureFlags.OpenFeature/FlagEvalMetricsHook.cs Outdated
Comment thread tracer/test/Datadog.FeatureFlags.OpenFeature.Tests/FlagEvalMetricsHookTests.cs Outdated
return new TracerSettings(new NameValueConfigurationSource(collection));
}

private class RcmSubscriptionManagerMock : IRcmSubscriptionManager

Member

This code already exists as a nested type in more than one class. I think it's time to extract it to its own standalone type and remove the duplication 🙂

@sameerank sameerank May 7, 2026

Contributor Author

I see 3 implementations with slightly different behaviors:

  • DynamicInstrumentationTests - manages ProductKeys on Replace/Unsubscribe
  • SymbolUploaderTest - supports multiple subscriptions + has Update() method
  • FeatureFlagsModuleTests - simplest, just stores LastSubscription

I went with the simplest (4726a7d), assuming we can refactor again later when we need to unify more.

// - TargetingMatch: Allocation had targeting rules that matched
// - Split: No rules, but resolved via percentage split (shards)
// - Static: No rules, no shards - simple static value
var reason = hadRules ? EvaluationReason.TargetingMatch

My comment here seemed to disappear. Per the RFC/Spec, SPLIT overrides TARGETING_MATCH, not the other way around.

Contributor Author

Updated in fd561bb
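
The precedence settled in this thread (SPLIT over TARGETING_MATCH over STATIC) can be sketched as below. The enum and method names are hypothetical stand-ins for the evaluator's own types; only the ordering of the checks reflects the discussion.

```csharp
// Hypothetical stand-in for the evaluator's EvaluationReason enum.
public enum EvaluationReasonSketch
{
    Static,
    TargetingMatch,
    Split,
}

public static class ReasonSketch
{
    public static EvaluationReasonSketch Determine(bool hadRules, bool hadShards)
    {
        // Shards are checked first so that SPLIT wins when both rules and shards are present.
        if (hadShards)
        {
            return EvaluationReasonSketch.Split;
        }

        if (hadRules)
        {
            return EvaluationReasonSketch.TargetingMatch;
        }

        // Neither rules nor shards: a simple static value.
        return EvaluationReasonSketch.Static;
    }
}
```

Under this ordering, an allocation with both rules and shards reports Split, which is the case the original ternary got wrong.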

sameerank and others added 7 commits May 6, 2026 15:31
…gs.OpenFeature.csproj

Co-authored-by: Andrew Lock <andrew.lock@datadoghq.com>
…NG_MATCH

Per the FFE spec, SPLIT should override TARGETING_MATCH when both rules
and shards are present. Updated the reason determination logic to check
hadShards first.
Per review feedback - preserves stack trace for debugging.
Clarify that the NET8+ constraint is due to MeterListener (NET6+),
not OpenFeature which supports all frameworks.
Reverts fafbc0d and 1d78ab2.

The separate test project added CI overhead for minimal value:
- Only 3 unit tests covering simple transformations (null checks, switch statement)
- System-tests already provide equivalent coverage:
  - tests/ffe/test_flag_eval_metrics.py::Test_FFE_Eval_Reason_* tests reason values
  - tests/ffe/test_flag_eval_metrics.py::Test_FFE_Eval_Error_* tests error type extraction
  - Allocation key extraction is tested via the full metrics pipeline

Additionally, moving these tests to Datadog.Trace.Tests (per review feedback) would
cause dependency conflicts - OpenFeature 2.3.0's transitive deps conflict with the
existing Microsoft.AspNetCore 2.2.0 packages.
The "Log at debug level and swallow the exception" comment was
redundant given the code is self-explanatory.
@sameerank sameerank force-pushed the sameerank/FFL-1946/add-flag-eval-metrics branch 2 times, most recently from 0963903 to 4726a7d Compare May 7, 2026 21:46

@andrewlock andrewlock left a comment

Member

LGTM, thanks for all the work on this!

catch (Exception ex)
{
// Metrics recording should never break flag evaluation.
System.Diagnostics.Debug.WriteLine($"[Datadog] FlagEvalMetricsHook.FinallyAsync failed: {ex}");

Member

Just so you're aware, we'll basically never know if this is happening in production... There's an enhancement we could/should make which is to have an explicit method in Datadog.Trace.Manual which is intended specifically for reporting errors to our telemetry backend (where the errors are our errors, not errors in API calling).

I'm undecided whether we should do it straight away to catch any issues here, but either way, not for this PR!

Contributor Author

Yep understood. These logs are only observable in dev with debug builds and agreed it would be nice in the future to have visibility in prod.

Appreciate the careful review and all the feedback!

@sameerank sameerank merged commit d34f526 into master May 8, 2026
143 checks passed
@sameerank sameerank deleted the sameerank/FFL-1946/add-flag-eval-metrics branch May 8, 2026 16:33
@github-actions github-actions Bot added this to the vNext-v3 milestone May 8, 2026
Labels

breaking-change, docker_image_artifacts, feature_flags

4 participants