[EL] DEBUG-4879 Support multiple expression compilations #7994
dudikeleti merged 3 commits into master from
Conversation
Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing this PR (7994) and master.

✅ No regressions detected - check the details below

Full Metrics Comparison

FakeDbCommand
HttpMessageHandler
Comparison explanation

Execution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure one-off costs. Cases where the execution-time results for the PR are worse than the latest master results are highlighted in **red**. The following thresholds were used for comparing the execution times:
Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

Duration charts

FakeDbCommand (.NET Framework 4.8)
gantt
title Execution time (ms) FakeDbCommand (.NET Framework 4.8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (7994) - mean (69ms) : 67, 70
master - mean (69ms) : 67, 70
section Bailout
This PR (7994) - mean (73ms) : 72, 73
master - mean (72ms) : 71, 74
section CallTarget+Inlining+NGEN
This PR (7994) - mean (1,010ms) : 961, 1059
master - mean (1,017ms) : 941, 1094
FakeDbCommand (.NET Core 3.1)
gantt
title Execution time (ms) FakeDbCommand (.NET Core 3.1)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (7994) - mean (107ms) : 104, 110
master - mean (107ms) : 104, 109
section Bailout
This PR (7994) - mean (107ms) : 106, 109
master - mean (108ms) : 106, 109
section CallTarget+Inlining+NGEN
This PR (7994) - mean (750ms) : 689, 812
master - mean (749ms) : 707, 791
FakeDbCommand (.NET 6)
gantt
title Execution time (ms) FakeDbCommand (.NET 6)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (7994) - mean (94ms) : 91, 98
master - mean (94ms) : 92, 96
section Bailout
This PR (7994) - mean (95ms) : 94, 96
master - mean (95ms) : 94, 96
section CallTarget+Inlining+NGEN
This PR (7994) - mean (722ms) : 663, 781
master - mean (721ms) : 684, 759
FakeDbCommand (.NET 8)
gantt
title Execution time (ms) FakeDbCommand (.NET 8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (7994) - mean (93ms) : 90, 97
master - mean (93ms) : 91, 95
section Bailout
This PR (7994) - mean (94ms) : 93, 96
master - mean (94ms) : 92, 96
section CallTarget+Inlining+NGEN
This PR (7994) - mean (640ms) : 623, 656
master - mean (637ms) : 626, 647
HttpMessageHandler (.NET Framework 4.8)
gantt
title Execution time (ms) HttpMessageHandler (.NET Framework 4.8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (7994) - mean (194ms) : 190, 198
master - mean (195ms) : 188, 201
section Bailout
This PR (7994) - mean (198ms) : 195, 200
master - mean (198ms) : 195, 201
section CallTarget+Inlining+NGEN
This PR (7994) - mean (1,131ms) : 1067, 1195
master - mean (1,129ms) : 1074, 1185
HttpMessageHandler (.NET Core 3.1)
gantt
title Execution time (ms) HttpMessageHandler (.NET Core 3.1)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (7994) - mean (278ms) : 271, 286
master - mean (279ms) : 273, 284
section Bailout
This PR (7994) - mean (280ms) : 275, 285
master - mean (280ms) : 275, 284
section CallTarget+Inlining+NGEN
This PR (7994) - mean (938ms) : 901, 975
master - mean (938ms) : 894, 981
HttpMessageHandler (.NET 6)
gantt
title Execution time (ms) HttpMessageHandler (.NET 6)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (7994) - mean (271ms) : 266, 276
master - mean (273ms) : 267, 278
section Bailout
This PR (7994) - mean (271ms) : 267, 276
master - mean (273ms) : 266, 279
section CallTarget+Inlining+NGEN
This PR (7994) - mean (933ms) : 890, 976
master - mean (930ms) : 873, 986
HttpMessageHandler (.NET 8)
gantt
title Execution time (ms) HttpMessageHandler (.NET 8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (7994) - mean (272ms) : 265, 278
master - mean (271ms) : 266, 277
section Bailout
This PR (7994) - mean (271ms) : 267, 275
master - mean (271ms) : 267, 276
section CallTarget+Inlining+NGEN
This PR (7994) - mean (838ms) : 817, 860
master - mean (836ms) : 818, 854
Force-pushed from f4cc27e to 85c6fe1
Benchmarks

Benchmark execution time: 2026-01-21 19:09:30

Comparing candidate commit 005d3eb in PR branch. Found 11 performance improvements and 9 performance regressions! Performance is the same for 159 metrics, 13 unstable metrics.

scenario:Benchmarks.Trace.AgentWriterBenchmark.WriteAndFlushEnrichedTraces net6.0
scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleMoreComplexBody net6.0
scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.AllCycleMoreComplexBody netcoreapp3.1
scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorSimpleBody net6.0
scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorSimpleBody netcoreapp3.1
scenario:Benchmarks.Trace.Asm.AppSecEncoderBenchmark.EncodeArgs netcoreapp3.1
scenario:Benchmarks.Trace.AspNetCoreBenchmark.SendRequest net6.0
scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces net6.0
scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSliceWithPool net6.0
scenario:Benchmarks.Trace.CharSliceBenchmark.OriginalCharSlice net6.0
scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearchAsync net472
scenario:Benchmarks.Trace.ElasticsearchBenchmark.CallElasticsearchAsync net6.0
scenario:Benchmarks.Trace.GraphQLBenchmark.ExecuteAsync netcoreapp3.1
scenario:Benchmarks.Trace.Log4netBenchmark.EnrichedLog net6.0
scenario:Benchmarks.Trace.RedisBenchmark.SendReceive netcoreapp3.1
scenario:Benchmarks.Trace.SingleSpanAspNetCoreBenchmark.SingleSpanAspNetCore net6.0
scenario:Benchmarks.Trace.SpanBenchmark.StartFinishTwoScopes netcoreapp3.1
GreenMatan left a comment:
Left a few comments around allocation and multithreading
```csharp
// CRITICAL: Always use Value.GetType() for runtime type. If Value is null,
// we use the declared Type, but this can cause cache collisions when different
// runtime types share the same declared type (e.g., all inherit from Object).
var thisRuntimeType = invocationTarget.Value?.GetType();
```
invocationTarget can be null?
I mean, could we end up with an NRE if we dereference invocationTarget.X throughout this method?
```csharp
if (members != null && members.Length > 0)
{
    var types = new Type?[members.Length];
```
Can we minimize this allocation by reusing arrays from a type pool? I'm worried we'd incur additional short-lived allocations that might result in GC pressure if the method executes many times.
The problem is that we can't pool them if we want to use them as a long-lived cache key. I made a few changes to the implementation to reduce allocations in most cases.
The creation of ExpressionCacheKey (and the Type array in it) happens every time we try to evaluate an expression - in other words, whenever the probe executes. Only the ExpressionCacheKey that ends up being saved in the dictionary is kept in memory for long. We could think about an approach that takes the array from the pool for the lookup, and materializes it as a heap-allocated array only when the key doesn't exist (the first time we encounter a specific type combination), while all subsequent executions take and return the array from the pool. My concern is that we may induce GC pressure from short-lived arrays.
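A rough sketch of that pooled-lookup idea (hypothetical names like `TypeKey` and `CompilationCache`; an illustration of the pattern, not code from this PR):

```csharp
using System;
using System.Buffers;
using System.Collections.Concurrent;

internal readonly struct TypeKey : IEquatable<TypeKey>
{
    private readonly Type[] _types;
    private readonly int _count;

    public TypeKey(Type[] types, int count) { _types = types; _count = count; }

    public bool Equals(TypeKey other)
    {
        if (_count != other._count) return false;
        for (var i = 0; i < _count; i++)
        {
            if (_types[i] != other._types[i]) return false;
        }
        return true;
    }

    public override int GetHashCode()
    {
        var hash = new HashCode();
        for (var i = 0; i < _count; i++) hash.Add(_types[i]);
        return hash.ToHashCode();
    }
}

internal static class CompilationCache<T>
{
    private static readonly ConcurrentDictionary<TypeKey, T> Cache = new();

    public static T GetOrCompile(ReadOnlySpan<Type> runtimeTypes, Func<Type[], T> compile)
    {
        // Rent a short-lived array for the lookup only.
        var rented = ArrayPool<Type>.Shared.Rent(runtimeTypes.Length);
        runtimeTypes.CopyTo(rented);
        try
        {
            if (Cache.TryGetValue(new TypeKey(rented, runtimeTypes.Length), out var cached))
            {
                return cached; // hot path: no heap allocation
            }
        }
        finally
        {
            ArrayPool<Type>.Shared.Return(rented, clearArray: true);
        }

        // Miss: materialize a long-lived key that the dictionary owns.
        var owned = runtimeTypes.ToArray();
        return Cache.GetOrAdd(new TypeKey(owned, owned.Length), _ => compile(owned));
    }
}
```

The trade-off is exactly the one raised above: hits allocate nothing, but every call still rents and returns an array, and misses allocate the owned copy.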
It turns out that I didn’t push my last commit. ExpressionCacheKey no longer exists.
```csharp
private readonly ConcurrentDictionary<ExpressionCacheKey, CompiledExpression<string>[]?> _compiledTemplatesByType = new();
private readonly ConcurrentDictionary<ExpressionCacheKey, CompiledExpression<bool>?> _compiledConditionByType = new();
private readonly ConcurrentDictionary<ExpressionCacheKey, CompiledExpression<double>?> _compiledMetricByType = new();
private readonly ConcurrentDictionary<ExpressionCacheKey, KeyValuePair<CompiledExpression<bool>, KeyValuePair<string?, CompiledExpression<string>[]>[]>[]?> _compiledDecorationsByType = new();
```
Can we improve this slow path of querying a ConcurrentDictionary, perhaps by using a reader/writer (slim) lock or another lock-free approach?
We touch them in hot paths.
I made a few changes to the implementation to reduce dictionary access, but overall I don’t think it’s worth it.
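For reference, a common pattern for keeping such a cache cheap on the hot path (a sketch with hypothetical names, not the PR's code): ConcurrentDictionary reads are already lock-free, so a reader/writer lock would mostly add overhead; the avoidable cost is the closure allocation that `GetOrAdd` with an inline lambda can incur on every call.

```csharp
using System;
using System.Collections.Concurrent;

internal sealed class ExpressionCache<TKey, TValue> where TKey : notnull
{
    private readonly ConcurrentDictionary<TKey, TValue> _cache = new();
    private readonly Func<TKey, TValue> _factory;

    public ExpressionCache(Func<TKey, TValue> factory) => _factory = factory;

    public TValue GetOrAdd(TKey key)
    {
        // Fast path: lock-free read, no delegate invocation, no allocation.
        if (_cache.TryGetValue(key, out var value))
        {
            return value;
        }

        // Slow path: _factory is a cached field, so no closure is allocated here.
        return _cache.GetOrAdd(key, _factory);
    }
}
```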
Force-pushed from f9a9774 to 37c5583
- Replace 4 per-feature ConcurrentDictionary caches with a single bucketed cache keyed by (thisType, returnType, memberCount)
- Avoid per-evaluation allocations/hashing by matching member runtime types inside the bucket
- Ensure a single compilation per type combination with per-entry compile gating
- Add a monomorphic fast path in the bucket (_last) and reduce allocations on misses
- Expose MethodScopeMembers.MemberCount to avoid scanning pooled member arrays
- Lazy-init the span-decoration result list
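The monomorphic fast path mentioned above can be sketched as follows (hypothetical names such as `CompilationBucket`; the real cache keys on the full type combination rather than a single `Type`):

```csharp
using System;
using System.Collections.Concurrent;

internal sealed class CompilationBucket<TValue> where TValue : class
{
    private sealed class Entry
    {
        public readonly Type Key;
        public readonly TValue Value;
        public Entry(Type key, TValue value) { Key = key; Value = value; }
    }

    private readonly ConcurrentDictionary<Type, TValue> _entries = new();
    private volatile Entry? _last; // immutable pair, safe to publish via volatile write

    public TValue GetOrAdd(Type runtimeType, Func<Type, TValue> compile)
    {
        // Monomorphic fast path: most probes see a single type combination,
        // so the previous hit is very likely to match again.
        var last = _last;
        if (last != null && last.Key == runtimeType)
        {
            return last.Value;
        }

        var value = _entries.GetOrAdd(runtimeType, compile);
        _last = new Entry(runtimeType, value);
        return value;
    }
}
```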
Force-pushed from ad2ab06 to 005d3eb
Summary of changes
- Changed `ProbeExpressionEvaluator` to cache compiled expressions per runtime type combination (invocation target, return value, and scope members), instead of a single `Lazy<>` instance shared across all invocations.
- Introduced `ExpressionCacheKey` to uniquely identify a compilation context using runtime types (`Value.GetType()` when available) and member runtime types, to support polymorphic calls.
- Added `ConcurrentDictionary<ExpressionCacheKey, …>` caches for:
  - templates (`CompiledExpression<string>[]?`)
  - conditions (`CompiledExpression<bool>?`)
  - metrics (`CompiledExpression<double>?`)

Reason for change
The previous implementation compiled expressions once (on first invocation) and reused them for all subsequent calls. This fails in polymorphic scenarios where members are declared with a base type (e.g. `object`) but carry different concrete runtime types across invocations. In these cases, delegates compiled against the first observed type can fail later (typically invalid casts / expression-binding mismatches), causing evaluation errors and incorrect probe results.
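As an illustration of the failure mode (a self-contained repro with made-up values, not taken from the probe code), a delegate compiled with a cast to the first observed runtime type throws once the declared slot carries a different concrete type:

```csharp
using System;
using System.Linq.Expressions;

var declared = Expression.Parameter(typeof(object), "member");

// Compiled on first invocation, where the member happened to be a string:
var compiled = Expression.Lambda<Func<object, int>>(
    Expression.Property(Expression.Convert(declared, typeof(string)), nameof(string.Length)),
    declared).Compile();

Console.WriteLine(compiled("hello")); // 5
Console.WriteLine(compiled(42));      // throws InvalidCastException: Int32 is not a String
```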
Implementation details
On each `Evaluate()` call, the runtime types are captured:

- `thisType`: `InvocationTarget.Value?.GetType()` (fallback to the declared type, then `typeof(object)`)
- `returnType`: `Return.Value?.GetType()` (fallback to the declared type)
- `members`: runtime types captured from `ScopeMember.Value?.GetType()` (fallback to the declared `ScopeMember.Type`)

Compiled expressions are stored in `ConcurrentDictionary<ExpressionCacheKey, ...>` caches via `TryAdd`.

The `ExpressionCacheKey` struct holds `thisType`, `returnType`, and the member runtime types; equality compares `ThisType`, `ReturnType`, and the captured member type sequence (the hash is used as a fast path).

Test coverage
`DebuggerExpressionLanguageTests` remain supported via the `CompiledTemplates`/`CompiledCondition` accessors (now returning the first cached entry).

Dependencies
Other details