Fix Batching sink tests flakiness.#8081
Conversation
Execution-Time Benchmarks Report ⏱️Execution-time results for samples comparing This PR (8081) and master. ✅ No regressions detected - check the details below Full Metrics ComparisonFakeDbCommand
HttpMessageHandler
Comparison explanationExecution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are highlighted in **red**. The following thresholds were used for comparing the execution times:
Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard. Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph). Duration chartsFakeDbCommand (.NET Framework 4.8)gantt
title Execution time (ms) FakeDbCommand (.NET Framework 4.8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8081) - mean (69ms) : 67, 70
master - mean (69ms) : 67, 70
section Bailout
This PR (8081) - mean (72ms) : 71, 73
master - mean (72ms) : 71, 73
section CallTarget+Inlining+NGEN
This PR (8081) - mean (1,016ms) : 942, 1090
master - mean (1,024ms) : 938, 1110
FakeDbCommand (.NET Core 3.1)gantt
title Execution time (ms) FakeDbCommand (.NET Core 3.1)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8081) - mean (106ms) : 103, 109
master - mean (106ms) : 104, 109
section Bailout
This PR (8081) - mean (107ms) : 106, 108
master - mean (107ms) : 106, 108
section CallTarget+Inlining+NGEN
This PR (8081) - mean (740ms) : 690, 790
master - mean (738ms) : 685, 792
FakeDbCommand (.NET 6)gantt
title Execution time (ms) FakeDbCommand (.NET 6)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8081) - mean (94ms) : 91, 96
master - mean (93ms) : 91, 95
section Bailout
This PR (8081) - mean (94ms) : 93, 95
master - mean (94ms) : 94, 95
section CallTarget+Inlining+NGEN
This PR (8081) - mean (718ms) : 684, 751
master - mean (711ms) : 667, 755
FakeDbCommand (.NET 8)gantt
title Execution time (ms) FakeDbCommand (.NET 8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8081) - mean (92ms) : 90, 95
master - mean (92ms) : 90, 95
section Bailout
This PR (8081) - mean (93ms) : 92, 94
master - mean (93ms) : 92, 94
section CallTarget+Inlining+NGEN
This PR (8081) - mean (637ms) : 623, 651
master - mean (636ms) : 623, 650
HttpMessageHandler (.NET Framework 4.8)gantt
title Execution time (ms) HttpMessageHandler (.NET Framework 4.8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8081) - mean (194ms) : 190, 199
master - mean (194ms) : 188, 200
section Bailout
This PR (8081) - mean (197ms) : 194, 200
master - mean (198ms) : 195, 201
section CallTarget+Inlining+NGEN
This PR (8081) - mean (1,134ms) : 1061, 1207
master - mean (1,140ms) : 1054, 1227
HttpMessageHandler (.NET Core 3.1)gantt
title Execution time (ms) HttpMessageHandler (.NET Core 3.1)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8081) - mean (278ms) : 272, 285
master - mean (277ms) : 271, 283
section Bailout
This PR (8081) - mean (281ms) : 275, 286
master - mean (277ms) : 274, 280
section CallTarget+Inlining+NGEN
This PR (8081) - mean (935ms) : 891, 978
master - mean (924ms) : 876, 972
HttpMessageHandler (.NET 6)gantt
title Execution time (ms) HttpMessageHandler (.NET 6)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8081) - mean (272ms) : 267, 277
master - mean (271ms) : 265, 277
section Bailout
This PR (8081) - mean (272ms) : 267, 277
master - mean (271ms) : 267, 274
section CallTarget+Inlining+NGEN
This PR (8081) - mean (924ms) : 871, 977
master - mean (928ms) : 877, 979
HttpMessageHandler (.NET 8)gantt
title Execution time (ms) HttpMessageHandler (.NET 8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8081) - mean (271ms) : 265, 277
master - mean (272ms) : 265, 278
section Bailout
This PR (8081) - mean (271ms) : 267, 275
master - mean (270ms) : 267, 273
section CallTarget+Inlining+NGEN
This PR (8081) - mean (837ms) : 808, 866
master - mean (831ms) : 811, 850
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
BenchmarksBenchmark execution time: 2026-01-22 16:18:02 Comparing candidate commit cbe9d4d in PR branch Some scenarios are present only in baseline or only in candidate runs. If you didn't create or remove some scenarios in your branch, this maybe a sign of crashed benchmarks 💥💥💥 Scenarios present only in baseline:
Found 6 performance improvements and 7 performance regressions! Performance is the same for 157 metrics, 16 unstable metrics. scenario:Benchmarks.Trace.ActivityBenchmark.StartStopWithChild net472
scenario:Benchmarks.Trace.ActivityBenchmark.StartStopWithChild net6.0
scenario:Benchmarks.Trace.ActivityBenchmark.StartStopWithChild netcoreapp3.1
scenario:Benchmarks.Trace.AgentWriterBenchmark.WriteAndFlushEnrichedTraces netcoreapp3.1
scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleSimpleBody netcoreapp3.1
scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorMoreComplexBody net6.0
scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorSimpleBody netcoreapp3.1
scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeLegacyArgs netcoreapp3.1
scenario:Benchmarks.Trace.AspNetCoreBenchmark.SendRequest net6.0
scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces net6.0
scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearchAsync netcoreapp3.1
scenario:Benchmarks.Trace.Log4netBenchmark.EnrichedLog netcoreapp3.1
scenario:Benchmarks.Trace.SpanBenchmark.StartFinishScope netcoreapp3.1
|
|
@codex review |
|
To use Codex here, create a Codex account and connect to github. |
bouwkast
left a comment
There was a problem hiding this comment.
We could probably remove one of the tests as they appear to be identical now, but I think we're changing the ordering of what we are testing, but if it simply relies on polling it probably isn't an effective test anymore.
| await sink.FlushAsync(); | ||
| await sink.DisposeAsync(); |
There was a problem hiding this comment.
Granted I think these tests are old and it was noted that they were loosely defined when they came it, but I think the initial ordering was expected.
As they are written now they appear to be the same and we've lost that initial expectation, but I'm unsure how accurate that expectation is/was looking at the FlushAsync/DisposeAsync flow
Maybe we could just remove this test now ?
|
Note: After a small conversation, @bouwkast and I agreed that marking the tests as flaky would be the best approach here. |
Summary of changes
Fixes race conditions in
BatchingSinkTeststhat caused intermittent test failures in CI.Reason for change
Failing tests are WhenRunning_AndAnEventIsQueued_ItIsWrittenToABatch and WhenRunning_AndAnEventIsQueued_ItIsWrittenToABatchOnDispose
Both tests were failing intermittently with:
The tests have been marked as flaky until solved.
Implementation details
Test coverage
Other details