[DSM] - Fixes for IbmMq instrumentation #5271

Merged: piochelepiotr merged 16 commits into master from kr-igor/ibmmq-fixes on Mar 11, 2024

kr-igor (Contributor) commented Mar 5, 2024

Summary of changes

Injecting context headers as text can reset the message encoding, which may make the message invalid for downstream consumers. This PR changes how we store headers, from string to sbyte[], to avoid this issue.
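
As a rough illustration of the string-to-sbyte[] change described above, the sketch below encodes a header value as UTF-8 and copies the bytes into an sbyte[], then reverses the process. The `HeaderBytes` class and its method names are hypothetical, the choice of UTF-8 is an assumption, and this is not the tracer's actual code. It uses `Buffer.BlockCopy` for the copy; the commit list in this PR mentions later switching from BlockCopy to Unsafe.As, which is not shown here.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only (not part of dd-trace-dotnet).
internal static class HeaderBytes
{
    // Encode a header value as UTF-8 (an assumption) and copy it into an sbyte[].
    public static sbyte[] ToSignedBytes(string value)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(value);
        sbyte[] result = new sbyte[bytes.Length];
        // byte and sbyte are both one-byte primitives, so a raw block copy
        // preserves the bit pattern of every element.
        Buffer.BlockCopy(bytes, 0, result, 0, bytes.Length);
        return result;
    }

    // Reverse of ToSignedBytes: copy back into a byte[] and decode as UTF-8.
    public static string FromSignedBytes(sbyte[] value)
    {
        byte[] bytes = new byte[value.Length];
        Buffer.BlockCopy(value, 0, bytes, 0, value.Length);
        return Encoding.UTF8.GetString(bytes);
    }
}
```

For example, `HeaderBytes.FromSignedBytes(HeaderBytes.ToSignedBytes("trace-id"))` round-trips back to `"trace-id"`. Because the value travels as raw bytes rather than as a text property, the sketch never asks the message to interpret it in a character set.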

Additionally, a new option, DD_IBM_MQ_CTX_PROPAGATION_DISABLED, is introduced to allow disabling context propagation for the IbmMq integration.

Reason for change

Several issues were raised by customers, including this one.

Test coverage

DD_IBM_MQ_CTX_PROPAGATION_DISABLED is covered by a basic test. There are no tests for the IbmMq changes themselves, since we can't deploy the infrastructure.

datadog-ddstaging bot commented Mar 5, 2024

Datadog Report

Branch report: kr-igor/ibmmq-fixes
Commit report: efb4d8f
Test service: dd-trace-dotnet

✅ 0 Failed, 326897 Passed, 1558 Skipped, 39m 18.15s Wall Time

andrewlock (Member) commented Mar 5, 2024

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing the following branches/commits:

Execution-time benchmarks measure the whole time it takes to execute a program and are intended to capture one-off costs. Cases where the execution-time results for the PR are worse than the latest master results are shown in red. The following thresholds were used for comparing the execution times:

  • Welch's t-test at a 5% significance level
  • Only results indicating a difference greater than 5% and 5 ms are considered.

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (5271) - mean (75ms)  : 64, 85
     .   : milestone, 75,
    master - mean (74ms)  : 66, 83
     .   : milestone, 74,

    section CallTarget+Inlining+NGEN
    This PR (5271) - mean (984ms)  : 960, 1009
     .   : milestone, 984,
    master - mean (996ms)  : 972, 1020
     .   : milestone, 996,

gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (5271) - mean (110ms)  : 107, 114
     .   : milestone, 110,
    master - mean (111ms)  : 108, 115
     .   : milestone, 111,

    section CallTarget+Inlining+NGEN
    This PR (5271) - mean (716ms)  : 691, 740
     .   : milestone, 716,
    master - mean (717ms)  : 686, 749
     .   : milestone, 717,

gantt
    title Execution time (ms) FakeDbCommand (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (5271) - mean (94ms)  : 91, 97
     .   : milestone, 94,
    master - mean (94ms)  : 91, 97
     .   : milestone, 94,

    section CallTarget+Inlining+NGEN
    This PR (5271) - mean (668ms)  : 644, 693
     .   : milestone, 668,
    master - mean (676ms)  : 641, 712
     .   : milestone, 676,

gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (5271) - mean (188ms)  : 185, 191
     .   : milestone, 188,
    master - mean (188ms)  : 186, 190
     .   : milestone, 188,

    section CallTarget+Inlining+NGEN
    This PR (5271) - mean (1,067ms)  : 1040, 1093
     .   : milestone, 1067,
    master - mean (1,064ms)  : 1037, 1091
     .   : milestone, 1064,

gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (5271) - mean (270ms)  : 265, 275
     .   : milestone, 270,
    master - mean (271ms)  : 268, 275
     .   : milestone, 271,

    section CallTarget+Inlining+NGEN
    This PR (5271) - mean (864ms)  : 845, 883
     .   : milestone, 864,
    master - mean (871ms)  : 843, 899
     .   : milestone, 871,

gantt
    title Execution time (ms) HttpMessageHandler (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (5271) - mean (260ms)  : 256, 264
     .   : milestone, 260,
    master - mean (261ms)  : 257, 265
     .   : milestone, 261,

    section CallTarget+Inlining+NGEN
    This PR (5271) - mean (852ms)  : 821, 882
     .   : milestone, 852,
    master - mean (857ms)  : 833, 881
     .   : milestone, 857,


@kr-igor kr-igor marked this pull request as ready for review March 6, 2024 20:37
@kr-igor kr-igor requested review from a team as code owners March 6, 2024 20:37
andrewlock (Member) commented Mar 7, 2024

Throughput/Crank Report ⚡

Throughput results for AspNetCoreSimpleController comparing the following branches/commits:

Cases where the throughput results for the PR are worse than latest master (a drop of 5% or greater) are shown in red.

Note that these results are based on a single point-in-time result for each branch. For full results, see one of the many, many dashboards!

gantt
    title Throughput Linux x64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (5271) (11.044M)   : 0, 11043544
    master (11.007M)   : 0, 11006674
    benchmarks/2.9.0 (11.269M)   : 0, 11269489

    section Automatic
    This PR (5271) (7.724M)   : 0, 7723902
    master (7.556M)   : 0, 7555579
    benchmarks/2.9.0 (8.237M)   : 0, 8236748

    section Trace stats
    This PR (5271) (7.937M)   : 0, 7936577
    master (7.855M)   : 0, 7855147

    section Manual
    This PR (5271) (9.774M)   : 0, 9773659
    master (9.556M)   : 0, 9555551

    section Manual + Automatic
    This PR (5271) (7.355M)   : 0, 7355341
    master (7.141M)   : 0, 7140945

    section Version Conflict
    This PR (5271) (6.647M)   : 0, 6646837
    master (6.452M)   : 0, 6451971

gantt
    title Throughput Linux arm64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (5271) (9.572M)   : 0, 9571934
    master (9.496M)   : 0, 9495623
    benchmarks/2.9.0 (9.335M)   : 0, 9334561

    section Automatic
    This PR (5271) (6.497M)   : 0, 6496789
    master (6.550M)   : 0, 6550088

    section Trace stats
    This PR (5271) (7.003M)   : 0, 7003398
    master (6.930M)   : 0, 6929825

    section Manual
    This PR (5271) (8.288M)   : 0, 8287678
    master (8.294M)   : 0, 8293659

    section Manual + Automatic
    This PR (5271) (6.290M)   : 0, 6290196
    master (6.274M)   : 0, 6273873

    section Version Conflict
    This PR (5271) (5.630M)   : 0, 5630008
    master (5.748M)   : 0, 5748183

gantt
    title Throughput Windows x64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (5271) (10.421M)   : 0, 10421195
    master (10.120M)   : 0, 10120352
    benchmarks/2.9.0 (10.334M)   : 0, 10333846

    section Automatic
    This PR (5271) (7.361M)   : 0, 7361326
    master (7.343M)   : 0, 7343410
    benchmarks/2.9.0 (7.377M)   : 0, 7377326

    section Trace stats
    This PR (5271) (6.831M)   : crit ,0, 6831367
    master (7.630M)   : 0, 7629966

    section Manual
    This PR (5271) (9.259M)   : 0, 9259021
    master (8.884M)   : 0, 8883719

    section Manual + Automatic
    This PR (5271) (7.088M)   : 0, 7088219
    master (6.989M)   : 0, 6989119

    section Version Conflict
    This PR (5271) (6.379M)   : 0, 6379293
    master (6.322M)   : 0, 6322239

gantt
    title Throughput Linux x64 (ASM) (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    master (7.454M)   : 0, 7454237
    benchmarks/2.9.0 (7.886M)   : 0, 7885996

    section No attack
    master (1.841M)   : 0, 1841297
    benchmarks/2.9.0 (3.256M)   : 0, 3255652

    section Attack
    master (1.454M)   : 0, 1454025
    benchmarks/2.9.0 (2.490M)   : 0, 2489577

    section Blocking
    master (3.195M)   : 0, 3194705

    section IAST default
    master (6.463M)   : 0, 6462928

    section IAST full
    master (5.595M)   : 0, 5594918

    section Base vuln
    master (0.946M)   : 0, 945501

    section IAST vuln
    master (0.849M)   : 0, 849226


andrewlock (Member) left a comment

Thanks for fixing the IBM bug, but I don't think we should do the context propagation disablement. That's a whole separate feature, which has been discussed separately and shouldn't be done in an ad-hoc way for integrations IMO.

<CopyLocalLockFileAssemblies>true</CopyLocalLockFileAssemblies>
<RootNamespace>Datadog.Trace.Tools.dd_dotnet</RootNamespace>
<AllowUnsafeBlocks>true</AllowUnsafeBlocks>
<ErrorOnDuplicatePublishOutputFiles>false</ErrorOnDuplicatePublishOutputFiles>
Member commented:

😕 I don't know why this would be required, but I don't think we would want this?

Contributor (Author) commented:

Without this, the DDTool build fails on macOS.

Member commented:

> Without this build for DDTool fails on MacOS

Yes, it should - Native AOT isn't supported on macOS yet (by .NET) 🙂

namespace Datadog.Trace.ClrProfiler.AutoInstrumentation.IbmMq;

- internal readonly struct IbmMqHeadersAdapter : IHeadersCollection
+ internal readonly struct IbmMqHeadersAdapter(IMqMessage message) : IHeadersCollection
Member commented:

Please don't do this - this implicitly captures the variable which changes allocation and layout characteristics. We're adding an analyzer to block this here: #5276
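
For readers unfamiliar with the alternative, here is a minimal sketch of the explicit form: a conventional constructor that assigns the parameter to a declared readonly field, rather than a primary constructor parameter. All type and member names below are hypothetical stand-ins and do not reflect the tracer's real `IMqMessage` or `IHeadersCollection` definitions.

```csharp
// Hypothetical stand-in for the wrapped message type.
internal interface IExampleMessage
{
    string Id { get; }
}

// Hypothetical implementation used only to exercise the adapter.
internal sealed class ExampleMessage : IExampleMessage
{
    public string Id => "msg-1";
}

// Explicit constructor and explicit field: the stored state is visible
// in the struct declaration instead of being implied by a captured parameter.
internal readonly struct ExampleHeadersAdapter
{
    private readonly IExampleMessage _message;

    public ExampleHeadersAdapter(IExampleMessage message)
    {
        _message = message;
    }

    public string MessageId => _message.Id;
}
```

With this shape, `new ExampleHeadersAdapter(new ExampleMessage()).MessageId` reads the value through the declared field, and the struct's layout is exactly what the field list shows.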

kr-igor and others added 5 commits March 7, 2024 10:37
…/IbmMqHeadersAdapter.cs


Removed debug code

Co-authored-by: Andrew Lock <andrew.lock@datadoghq.com>
…/IbmMqHeadersAdapter.cs


Using Unsafe.As instead of BlockCopy

Co-authored-by: Andrew Lock <andrew.lock@datadoghq.com>
…/IbmMqHeadersAdapter.cs


Use Unsafe.As instead of BlockCopy

Co-authored-by: Andrew Lock <andrew.lock@datadoghq.com>
@piochelepiotr piochelepiotr merged commit 0d511f9 into master Mar 11, 2024
@piochelepiotr piochelepiotr deleted the kr-igor/ibmmq-fixes branch March 11, 2024 14:35
@github-actions github-actions bot added this to the vNext milestone Mar 11, 2024
link04 added a commit that referenced this pull request Mar 12, 2024
commit 832de4b
Author: Flavien Darche <11708575+e-n-0@users.noreply.github.com>
Date:   Tue Mar 12 20:24:21 2024 +0000

    [ASM][IAST] Configure maximum IAST Ranges (#5292)

    * Add configuration key

    * Use a RangeList in some case to not exceed the max number

    * Revert some code + implem correct merge

    * Fix + Add unit and integration tests

    * Usual macos fix for snapshot

    * Fix snapshots hashs

    * Update snapshots (remove other tests as they can't apply different env var values in same run)

    * Apply comment

    * Re-integrate integration tests with multiple processes (new fixture)

    * Add test case for setting MaxRangeCount to zero

commit 83f6ab1
Author: Tony Redondo <tony.redondo@datadoghq.com>
Date:   Tue Mar 12 21:20:39 2024 +0100

    [CI Visibility] - Enable snapshot testing of current testing framework implementations (#5226)

commit 233695a
Author: Daniel Romano <108014683+daniel-romano-DD@users.noreply.github.com>
Date:   Tue Mar 12 17:06:06 2024 +0100

    [IAST] Vulnerability and Evidence truncation (#5302)

    * Initial implementation

    * Updated test bundle

    * Fix test compilation error

    * Fix snapshot (from rebase)

    * Fix typo in config value. Updated tests

    * Fix typo

    * Refactor converters initialization

commit ea31cf5
Author: Anna <anna.yafi@datadoghq.com>
Date:   Tue Mar 12 16:39:09 2024 +0100

    Deactivate benchmark for legacy encoder (#5299)

commit d0d713a
Author: NachoEchevarria <53266532+NachoEchevarria@users.noreply.github.com>
Date:   Tue Mar 12 09:25:27 2024 +0100

    Set big regex timeouts for tests (#5297)

commit d5388d6
Author: Lucas Pimentel <lucas.pimentel@datadoghq.com>
Date:   Mon Mar 11 15:20:58 2024 -0400

    [Tracing] Support configuring `DD_TRACE_ENABLED` remotely (#5181)

    * add support for remote TraceEnabled setting

    * fix unrelated typo

    * add ApmTracingEnabled capability 19

    * add missing RCM capability 18

    * add mapping

    * add unit test

    * add comments to unit test

    * rename property to match RCM constant

    * include config in integration tests

    * fix test json

    * rewrite tests to use raw values instead of strings

commit 2b95f46
Author: Flavien Darche <11708575+e-n-0@users.noreply.github.com>
Date:   Mon Mar 11 17:47:55 2024 +0100

    [ASM][IAST] Support manual JSON deserialisation (Newtonsoft.Json) (#5238)

    * Add Newtonsoft.Json (non -working yet)

    * Refactor the tainting proces + add tests

    * Add the JToken Parse aspect + test

    * Rename Aspects class + Duck orignal method call

    * Add integration test

    * Fix nullability

    * Fix compilation issue for netfx

    * Change JSON formatting in ParseTests

    * Fix a test json format

    * Refactor NewtonsoftJsonAspects to static constructor

commit 0d511f9
Author: Igor Kravchenko <21974069+kr-igor@users.noreply.github.com>
Date:   Mon Mar 11 09:35:23 2024 -0500

    [DSM] - Fixes for IbmMq instrumentation (#5271)

    * Use byte properties instead of strings

    * Fixed nullability files

    * Added some debug info

    * Fixed lint issues

    * Added a bit more logs

    * Using slow byte->sbyte conversion

    * Added noop headers adapter

    * Fixed nullability files

    * Added more logs

    * Cleaned up debug logs

    * Removed symlink

    * Update tracer/src/Datadog.Trace/ClrProfiler/AutoInstrumentation/IbmMq/IbmMqHeadersAdapter.cs

    Removed debug code

    Co-authored-by: Andrew Lock <andrew.lock@datadoghq.com>

    * Update tracer/src/Datadog.Trace/ClrProfiler/AutoInstrumentation/IbmMq/IbmMqHeadersAdapter.cs

    Using Unsafe.As instead of BlockCopy

    Co-authored-by: Andrew Lock <andrew.lock@datadoghq.com>

    * Update tracer/src/Datadog.Trace/ClrProfiler/AutoInstrumentation/IbmMq/IbmMqHeadersAdapter.cs

    Use Unsafe.As instead of BlockCopy

    Co-authored-by: Andrew Lock <andrew.lock@datadoghq.com>

    * Addressed some of the comments

    * Removed context propagation options

    ---------

    Co-authored-by: Andrew Lock <andrew.lock@datadoghq.com>

commit 5684a72
Author: Zach Montoya <zach.montoya@datadoghq.com>
Date:   Fri Mar 8 20:56:30 2024 -0800

    [Tracing] Update instrumentation point for DD_TRACE_DELAY_WCF_INSTRUMENTATION_ENABLED=true (#5206)

    Updates the instrumentation point for `DD_TRACE_DELAY_WCF_INSTRUMENTATION_ENABLED=true` so that now a server span is created immediately before IDispatchMessageInspector implementations are run, so application code can access the root span from inside a IDispatchMessageInspector.AfterReceiveRequest callback.

    This PR also does some cleanup to remove unused Wcf files and it makes the entire Wcf instrumentation use nullable reference types.

commit ca1bb6e
Author: Andrew Lock <andrew.lock@datadoghq.com>
Date:   Fri Mar 8 17:43:57 2024 +0000

    Fix errors identified from telemetry (#5279)

    * Try to avoid MongoDb exception

    We're seeing exceptions like this:
    ```
    System.FieldAccessException
       at REDACTED
       at Datadog.Trace.ClrProfiler.AutoInstrumentation.MongoDb.MongoDbIntegration.CreateScope[TConnection](Object wireProtocol, TConnection connection)
       at REDACTED
       at MongoDB.Driver.Core.WireProtocol.CommandWireProtocol`1.ExecuteAsync(IConnection connection, CancellationToken cancellationToken)
    ```

    and the only explanation I can think of is a duck-chaining issue, so stopped doing duck chaining and being explicit instead

    * Add local functions to try to isolate problems

    * Fix ArgumentNullException in AWS SQS integration

3 participants