Add container tags hash to DBM queries (if enabled)#8061
Add container tags hash to DBM queries (if enabled)#8061
Conversation
## Summary of changes
Replaced custom mutex guard with `std::lock_guard`, using
`std::recursive_mutex` instead of `CRITICAL_SECTION` in windows and
`std::mutex` with railings in Linux
## Reason for change
Some locks have been spotted in smoke test wich could be cause by the
lack of thread recursive lock in the `std::mutex`
## Implementation details
## Test coverage
## Other details
<!-- Fixes #{issue} -->
<!-- ⚠️ Note:
Where possible, please obtain 2 approvals prior to merging. Unless
CODEOWNERS specifies otherwise, for external teams it is typically best
to have one review from a team member, and one review from apm-dotnet.
Trivial changes do not require 2 reviews.
MergeQueue is NOT enabled in this repository. If you have write access
to the repo, the PR has 1-2 approvals (see above), and all of the
required checks have passed, you can use the Squash and Merge button to
merge the PR. If you don't have write access, or you need help, reach
out in the #apm-dotnet channel in Slack.
-->
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4fd01fab6f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| else | ||
| { | ||
| // PropagateDataViaComment (service) - this injects varius trace information as a comment in the query | ||
| if (tracer.Settings.InjectSqlBasehash && !string.IsNullOrEmpty(baseHash)) | ||
| { | ||
| tags.BaseHash = baseHash; |
There was a problem hiding this comment.
Set BaseHash even when DBM comment already present
This new BaseHash tagging only happens in the else branch when the command text is not already DBM-injected. In the cached‑command scenario (or when users pre‑inject DBM comments), alreadyInjected is true, so _dd.propagated_hash is never set on subsequent spans even though the query still carries ddsh in the SQL comment. If DBM looks up container tags by scanning recent spans for that hash, later queries can’t be enriched once the first span ages out. Consider setting tags.BaseHash whenever the feature is enabled (and baseHash is non‑empty), regardless of the alreadyInjected branch.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
hmm, yes, that's an interesting point, but I'm not sure we care, we only need one span with the hash to get the values, so we don't really need to tag all spans. I think in practice it works well like this.
BenchmarksBenchmark execution time: 2026-03-10 10:14:54 Comparing candidate commit 92e4c6e in PR branch Found 4 performance improvements and 12 performance regressions! Performance is the same for 157 metrics, 19 unstable metrics. scenario:Benchmarks.Trace.AgentWriterBenchmark.WriteAndFlushEnrichedTraces net6.0
scenario:Benchmarks.Trace.AgentWriterBenchmark.WriteAndFlushEnrichedTraces netcoreapp3.1
scenario:Benchmarks.Trace.Asm.AppSecBodyBenchmark.ObjectExtractorSimpleBody net6.0
scenario:Benchmarks.Trace.AspNetCoreBenchmark.SendRequest netcoreapp3.1
scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces net472
scenario:Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark.WriteAndFlushEnrichedTraces netcoreapp3.1
scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSliceWithPool net472
scenario:Benchmarks.Trace.CharSliceBenchmark.OptimizedCharSliceWithPool net6.0
scenario:Benchmarks.Trace.CharSliceBenchmark.OriginalCharSlice net472
scenario:Benchmarks.Trace.DbCommandBenchmark.ExecuteNonQuery net472
scenario:Benchmarks.Trace.Log4netBenchmark.EnrichedLog netcoreapp3.1
scenario:Benchmarks.Trace.SpanBenchmark.StartFinishSpan net6.0
|
|
I just realized I need to put process tags in there too |
bouwkast
left a comment
There was a problem hiding this comment.
I think the main question that I have is that it appears this is correctly following the RFC in how we propagate the hash, but the merged Python implementation recomputes the hash.
But the RFC isn't precise enough in describing the hash and expected behavior / requirements for me to know which is correct really
| if (!string.IsNullOrEmpty(baseHash)) | ||
| { | ||
| propagatorStringBuilder.Append(',').Append(SqlCommentBaseHash).Append("='").Append(Uri.EscapeDataString(baseHash)).Append('\''); | ||
| } |
There was a problem hiding this comment.
If I understand the Python implementation correctly they appear to be recomputing the hash:
https://github.com/DataDog/dd-trace-py/blob/b5008005cd637a07e5c249f8d60ed07ee8328dc5/ddtrace/internal/process_tags/__init__.py#L92-L94
I think based on the linked document that the method here of just passing in the has as given from the agent is the correct approach, but seeing that Python has been merged I'm wondering if the approach has been changed outside of the RFC? If not, I think that the Python implementation must not work.
From the RFC:
The agent would compute a hash on low cardinality container tags that can be used to identify a workload
The agent will then propagate that hash to the tracing libraries that can then use it to complete outbound communications with container information
Could we validate this?
There was a problem hiding this comment.
the hash is recomputed here because it's only injected from the instrumentation site, where we get it fresh from the global instance, so if the global instance changes, we'd pick up the latest one.
However, here I just implemented container tags, but I realized as you were reviewing that I need to add proces tags in there too (like in python), so there will be some computation to do.
There was a problem hiding this comment.
basically, the agent computes a hash that's ready to use based on container tags, but we also need to pack process tags in that hash, which is why there is a re-hashing step needed in the tracer (then)
There was a problem hiding this comment.
Okay 👍
I also noticed we were putting it as a string compared to python, but also noticed that the RFC actually seems to point to that Python PR as being the source of truth so I think that answers both this comment and the one I just left here
There was a problem hiding this comment.
after some investigation and discussion (here), we agreed that storing the hash in b64 was the best compromise, and python will align on that.
I amended the RFC to specify that the value should be in meta (as a tag), and the python code will be updated.
Execution-Time Benchmarks Report ⏱️Execution-time results for samples comparing This PR (8061) and master. ✅ No regressions detected - check the details below Full Metrics ComparisonFakeDbCommand
HttpMessageHandler
Comparison explanationExecution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are highlighted in **red**. The following thresholds were used for comparing the execution times:
Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard. Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph). Duration chartsFakeDbCommand (.NET Framework 4.8)gantt
title Execution time (ms) FakeDbCommand (.NET Framework 4.8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8061) - mean (75ms) : 72, 77
master - mean (75ms) : 73, 77
section Bailout
This PR (8061) - mean (78ms) : 77, 80
master - mean (80ms) : 78, 82
section CallTarget+Inlining+NGEN
This PR (8061) - mean (1,083ms) : 1043, 1122
master - mean (1,083ms) : 1029, 1136
FakeDbCommand (.NET Core 3.1)gantt
title Execution time (ms) FakeDbCommand (.NET Core 3.1)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8061) - mean (115ms) : 112, 119
master - mean (116ms) : 111, 121
section Bailout
This PR (8061) - mean (116ms) : 114, 119
master - mean (117ms) : 115, 119
section CallTarget+Inlining+NGEN
This PR (8061) - mean (773ms) : 714, 832
master - mean (765ms) : 696, 833
FakeDbCommand (.NET 6)gantt
title Execution time (ms) FakeDbCommand (.NET 6)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8061) - mean (104ms) : 101, 106
master - mean (102ms) : 100, 105
section Bailout
This PR (8061) - mean (104ms) : 102, 106
master - mean (105ms) : 102, 107
section CallTarget+Inlining+NGEN
This PR (8061) - mean (758ms) : 699, 817
master - mean (757ms) : 694, 820
FakeDbCommand (.NET 8)gantt
title Execution time (ms) FakeDbCommand (.NET 8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8061) - mean (102ms) : 99, 104
master - mean (101ms) : 98, 104
section Bailout
This PR (8061) - mean (103ms) : 101, 104
master - mean (103ms) : 100, 105
section CallTarget+Inlining+NGEN
This PR (8061) - mean (673ms) : 657, 689
master - mean (679ms) : 659, 700
HttpMessageHandler (.NET Framework 4.8)gantt
title Execution time (ms) HttpMessageHandler (.NET Framework 4.8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8061) - mean (194ms) : 188, 200
master - mean (194ms) : 189, 198
section Bailout
This PR (8061) - mean (197ms) : 195, 200
master - mean (198ms) : 194, 202
section CallTarget+Inlining+NGEN
This PR (8061) - mean (1,160ms) : 1101, 1219
master - mean (1,152ms) : 1098, 1206
HttpMessageHandler (.NET Core 3.1)gantt
title Execution time (ms) HttpMessageHandler (.NET Core 3.1)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8061) - mean (278ms) : 272, 283
master - mean (280ms) : 270, 290
section Bailout
This PR (8061) - mean (280ms) : 273, 286
master - mean (282ms) : 275, 288
section CallTarget+Inlining+NGEN
This PR (8061) - mean (950ms) : 913, 986
master - mean (947ms) : 902, 991
HttpMessageHandler (.NET 6)gantt
title Execution time (ms) HttpMessageHandler (.NET 6)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8061) - mean (271ms) : 266, 276
master - mean (271ms) : 267, 276
section Bailout
This PR (8061) - mean (272ms) : 266, 278
master - mean (273ms) : 266, 279
section CallTarget+Inlining+NGEN
This PR (8061) - mean (936ms) : 907, 964
master - mean (937ms) : 908, 966
HttpMessageHandler (.NET 8)gantt
title Execution time (ms) HttpMessageHandler (.NET 8)
dateFormat x
axisFormat %Q
todayMarker off
section Baseline
This PR (8061) - mean (271ms) : 266, 276
master - mean (270ms) : 264, 276
section Bailout
This PR (8061) - mean (270ms) : 264, 276
master - mean (270ms) : 265, 276
section CallTarget+Inlining+NGEN
This PR (8061) - mean (834ms) : 814, 854
master - mean (833ms) : 807, 859
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8ace897557
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Summary of changes
Add the ability to write the container tags hash to DBM queries + to the related span.
The goal is that DBM would then query the spans bearing that hash, and then use the container tags on this (those) spans(s) to enrich the queries with it.
This is controlled by a setting that is disabled by default, and would be enabled if propagation mode is "service" or greater
see RFC: https://docs.google.com/document/d/15GtNOKGBCt6Dc-HsDNnMmCdZwhewFQx8yUlI9in5n3M
related PR in python: DataDog/dd-trace-py#15293
Reason for change
Implementation details
Test coverage
Adding a test in DbScopeFactoryTests.cs forced me to inject the value from pretty high, which I find a bit "dirty", but at least we don't have to rely on global static instance in tests.
Other details