refactor(sampling)!: use sampling from libdatadog [APMSP-3021]#154

Merged: gh-worker-dd-mergequeue-cf854d[bot] merged 3 commits into main from ban/sampling-extraction-4 on May 21, 2026

@bantonsson bantonsson commented Feb 17, 2026

Collaborator

Moves the sampling code into a separate crate, and uses the sampling code that was moved to libdatadog in DataDog/libdatadog#1927.

These are the benchmark numbers after DataDog/libdatadog#1977 landed.

OTel Sampling Benchmarks vs main

Allocations

| Benchmark | Allocated | Change |
|---|---|---|
| rule_all_spans_only_rate | 0.732 KB | ±0.0% (no change) |
| service_rule_matching | 0.756 KB | -92.1% 🟢 |
| service_rule_not_matching | 0.434 KB | -2.9% 🟢 |
| name_pattern_rule_matching | 0.746 KB | -92.2% 🟢 |
| name_pattern_rule_not_matching | 0.460 KB | -7.8% 🟢 |
| resource_pattern_rule_matching | 0.502 KB | -7.2% 🟢 |
| resource_pattern_rule_not_matching | 0.500 KB | -6.7% 🟢 |
| tag_rule_matching | 1.021 KB | -89.9% 🟢 |
| tag_rule_not_matching | 0.460 KB | -6.5% 🟢 |
| complex_rule_matching | 0.477 KB | -9.1% 🟢 |
| complex_rule_partial_match | 0.470 KB | -8.2% 🟢 |
| multiple_rules_first_match | 0.756 KB | -92.1% 🟢 |
| multiple_rules_last_match | 0.758 KB | -3.3% 🟢 |
| many_attributes | 1.012 KB | -90.0% 🟢 |
| parent_sampled_short_circuit | ~0 B | no change |
| parent_not_sampled_short_circuit | ~0 B | no change |
| unicode_rule_matching | 0.496 KB | ±0.0% (no change) |

Wall Time

| Benchmark | Time | Change |
|---|---|---|
| rule_all_spans_only_rate | 329 ns | -2.9% 🟢 |
| service_rule_matching | 461 ns | -14.8% 🟢 |
| service_rule_not_matching | 244 ns | -7.6% 🟢 |
| name_pattern_rule_matching | 363 ns | -18.0% 🟢 |
| name_pattern_rule_not_matching | 177 ns | -14.1% 🟢 |
| resource_pattern_rule_matching | 252 ns | -11.0% 🟢 |
| resource_pattern_rule_not_matching | 256 ns | -10.7% 🟢 |
| tag_rule_matching | 386 ns | -17.5% 🟢 |
| tag_rule_not_matching | 204 ns | -13.8% 🟢 |
| complex_rule_matching | 386 ns | -10.9% 🟢 |
| complex_rule_partial_match | 385 ns | -11.1% 🟢 |
| multiple_rules_first_match | 376 ns | -17.6% 🟢 |
| multiple_rules_last_match | 510 ns | -6.8% 🟢 |
| many_attributes | 392 ns | -16.9% 🟢 |
| parent_sampled_short_circuit | 9.2 ns | ~-0.8% (noise) |
| parent_not_sampled_short_circuit | 9.2 ns | ~-0.7% (noise) |
| unicode_rule_matching | 251 ns | -1.8% 🟢 |

@bantonsson bantonsson force-pushed the ban/sampling-extraction-4 branch 2 times, most recently from 61d3be1 to d7f48f3 Compare February 19, 2026 16:14
@bantonsson bantonsson force-pushed the ban/sampling-extraction-3 branch from 6326df5 to ed96d8c Compare March 9, 2026 13:53
@bantonsson bantonsson force-pushed the ban/sampling-extraction-4 branch from d7f48f3 to 90e61b2 Compare March 9, 2026 13:58
@bantonsson bantonsson force-pushed the ban/sampling-extraction-3 branch from ed96d8c to f6d60d3 Compare March 9, 2026 15:34
@bantonsson bantonsson force-pushed the ban/sampling-extraction-4 branch from 90e61b2 to 2a70bfc Compare March 9, 2026 15:40
Comment thread datadog-opentelemetry/src/sampling/otel_mappings.rs
Comment thread libdd-sampling/src/lib.rs Outdated

@ekump ekump left a comment

Collaborator

a couple of non-blocking comments. LGTM

@bantonsson bantonsson force-pushed the ban/sampling-extraction-3 branch from f6d60d3 to 6a653dc Compare March 16, 2026 14:11
@bantonsson bantonsson force-pushed the ban/sampling-extraction-4 branch from 2a70bfc to 5003a05 Compare March 16, 2026 14:56
@bantonsson bantonsson force-pushed the ban/sampling-extraction-3 branch 2 times, most recently from 4235c71 to 2e27e2a Compare March 17, 2026 13:21
@bantonsson bantonsson force-pushed the ban/sampling-extraction-4 branch from 5003a05 to c443b13 Compare March 17, 2026 13:23
Comment thread datadog-opentelemetry/src/core/configuration/configuration.rs Outdated
@bantonsson bantonsson force-pushed the ban/sampling-extraction-3 branch from 2e27e2a to 8338fa2 Compare March 20, 2026 10:54
@bantonsson bantonsson force-pushed the ban/sampling-extraction-4 branch from c443b13 to de4f688 Compare March 20, 2026 16:41
@bantonsson bantonsson force-pushed the ban/sampling-extraction-3 branch 3 times, most recently from 4593286 to f3f7041 Compare March 31, 2026 13:02
Base automatically changed from ban/sampling-extraction-3 to main March 31, 2026 14:26
@bantonsson bantonsson force-pushed the ban/sampling-extraction-4 branch 2 times, most recently from 71bd063 to d5c0f6e Compare April 29, 2026 11:16
@bantonsson bantonsson changed the title [WIP] refactor(sampling): move sampling to its own crate [WIP] refactor(sampling): use sampling from libdatadog [APMSP-3021] May 15, 2026
@bantonsson bantonsson force-pushed the ban/sampling-extraction-4 branch 2 times, most recently from 5c949ee to 9f857ea Compare May 18, 2026 13:43
@bantonsson bantonsson changed the title [WIP] refactor(sampling): use sampling from libdatadog [APMSP-3021] refactor(sampling): use sampling from libdatadog [APMSP-3021] May 18, 2026
@bantonsson bantonsson force-pushed the ban/sampling-extraction-4 branch from 9f857ea to e27e124 Compare May 20, 2026 09:27
@bantonsson bantonsson changed the title refactor(sampling): use sampling from libdatadog [APMSP-3021] refactor(sampling)!: use sampling from libdatadog [APMSP-3021] May 20, 2026
@datadog-prod-us1-6

datadog-prod-us1-6 Bot commented May 20, 2026

Tests

🎉 All green!

🧪 All tests passed
❄️ No new flaky tests detected

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 0a4e479

@bantonsson bantonsson marked this pull request as ready for review May 20, 2026 09:29
@bantonsson bantonsson requested a review from a team as a code owner May 20, 2026 09:29
@bantonsson bantonsson force-pushed the ban/sampling-extraction-4 branch from e27e124 to 263d58b Compare May 20, 2026 14:16
@bantonsson

Collaborator Author

/merge

@gh-worker-devflow-routing-ef8351

gh-worker-devflow-routing-ef8351 Bot commented May 20, 2026

View all feedbacks in Devflow UI.

2026-05-20 14:16:45 UTC ℹ️ Start processing command /merge


2026-05-20 14:16:51 UTC ℹ️ MergeQueue: waiting for PR to be ready

This pull request is not mergeable according to GitHub. Common reasons include pending required checks, missing approvals, or merge conflicts — but it could also be blocked by other repository rules or settings.
It will be added to the queue as soon as checks pass and/or get approvals. View in MergeQueue UI.
Note: if you pushed new commits since the last approval, you may need additional approval.
You can remove it from the waiting list with /remove command.


2026-05-20 18:25:17 UTC ⚠️ MergeQueue: This merge request was unqueued

devflow unqueued this merge request: It did not become mergeable within the expected time

@iunanua

iunanua commented May 20, 2026

Collaborator

To prevent crates with unsupported msrv from slipping in, could we define in .cargo/config.toml something like

[resolver]
incompatible-rust-versions = "fallback"

?

@bantonsson bantonsson force-pushed the ban/sampling-extraction-4 branch from 5504211 to 0a4e479 Compare May 21, 2026 09:09
@bantonsson

Collaborator Author

@iunanua I have no idea. If you know that incompatible-rust-versions = "fallback" is a good idea, please say so. There are too many knobs and levers on cargo.

@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot merged commit 2dc6e67 into main May 21, 2026
35 checks passed
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot deleted the ban/sampling-extraction-4 branch May 21, 2026 09:26
gh-worker-dd-mergequeue-cf854d Bot pushed a commit that referenced this pull request Jun 1, 2026
)

# What does this PR do?

Fixes adaptive-sampling Remote Config in `datadog-opentelemetry`, rebuilt on top of PR #154. Also adds `DD_TRACE_SAMPLE_RATE` support.

Before this PR, four things were broken:

1. **`tracing_sampling_rate` from RC was ignored.** The handler only acted on `tracing_sampling_rules`; a rate-only payload installed nothing.
2. **RC list-shape `tags` were rejected.** RC encodes `tags` as `[{"key": "env", "value_glob": "prod"}]`, but `libdd_sampling::SamplingRuleConfig::tags` only accepted the map shape, so the parse errored and the whole update was dropped.
3. **Env `DD_TRACE_SAMPLING_RULES` were wiped on every RC update.** `update_sampling_rules_from_remote` does a full override, so any RC change replaced env rules, even when RC only sent a global rate.
4. **`DD_TRACE_SAMPLE_RATE` had no effect.** No binding existed.

# Motivation

End-to-end adaptive sampling didn't work.

# What changed

**Composition.** `ApmTracingHandler::process_config` now follows the multi-source precedence model:

| env rules | env `DD_TRACE_SAMPLE_RATE` | RC `tracing_sampling_rules` | RC `tracing_sampling_rate` | Effective rule chain |
|---|---|---|---|---|
| present | unset | absent / null | absent / null | `env_rules` |
| present | unset | absent / null | present | `env_rules + catch_all(rc_rate)` |
| present | set | absent / null | absent / null | `env_rules + catch_all(env_rate)` |
| present | set | absent / null | present | `env_rules + catch_all(rc_rate)` |
| any | any | non-empty array | absent / null | `rc_rules + catch_all(env_rate)` if env rate is set, otherwise `rc_rules` |
| any | any | non-empty array | present | `rc_rules + catch_all(rc_rate)` |

The synthetic catch-all uses libdatadog's default provenance, mapping to DM `-3` (LOCAL_USER). See the "legacy behavior" comment in [test_trace_sampling_rules_override_rate](https://github.com/DataDog/system-tests/blob/main/tests/parametric/test_dynamic_configuration.py#L872).
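
The precedence table can be read as a small pure function. What follows is a minimal sketch under stated assumptions, not the code in `ApmTracingHandler::process_config`: the `Rule` enum, the `compose` function and its parameter names are hypothetical and exist only for illustration. It assumes that an empty RC rules array is treated like an absent one (the table does not cover that case), and that an env rate with no RC input adds `catch_all(env_rate)`, as in the third row.

```rust
/// Hypothetical rule type: a named rule or the synthetic catch-all.
#[derive(Debug, Clone, PartialEq)]
pub enum Rule {
    Named(String),
    CatchAll(f64),
}

/// Compose the effective rule chain from the four inputs in the table.
/// Illustrative only; names and shapes are assumptions, not the real API.
pub fn compose(
    env_rules: &[Rule],
    env_rate: Option<f64>,
    rc_rules: Option<&[Rule]>,
    rc_rate: Option<f64>,
) -> Vec<Rule> {
    // A non-empty RC rule array replaces the env rules entirely.
    // Assumption: an empty array falls back to env rules, like absent/null.
    let base: &[Rule] = match rc_rules {
        Some(rules) if !rules.is_empty() => rules,
        _ => env_rules,
    };
    let mut chain = base.to_vec();
    // The RC rate takes precedence over the env rate; with neither set,
    // no catch-all is appended.
    if let Some(rate) = rc_rate.or(env_rate) {
        chain.push(Rule::CatchAll(rate));
    }
    chain
}
```

Keeping the composition free of side effects means every row of the table can be checked with a plain equality assertion, independent of Remote Config plumbing.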

**`DD_TRACE_SAMPLE_RATE`.** New `Config::trace_sample_rate(): Option<f64>`. When set, the sampler installs an implicit catch-all so `DD_TRACE_RATE_LIMIT` applies. Unset means no catch-all (libdatadog's no-rule path samples at 100%).

**Tag normalization.** RC encodes `tags` as the list shape `[{"key", "value_glob"}]`. This is parsed natively by `libdd-sampling` ≥ 2.1.0 (DataDog/libdatadog#2033), so this PR bumps `libdd-sampling` 1.0.0 → 2.1.0 (pulling `libdd-common` → 4.2.0) and no in-tracer normalization is needed. An earlier revision carried a `normalize_rc_tags` shim for this; it has been removed now that the upstream release is available. Regression coverage: `test_handler_rc_rules_with_list_tags_applied` (list-shape tags apply, tags preserved as a map) and `test_handler_malformed_tags_rejects_update` (malformed list entries still rejected wholesale, not silently broadened).
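
To make the two tag shapes concrete, here is a hedged sketch of turning the list shape into the map shape while rejecting the whole list on any malformed entry. It is not the `libdd-sampling` parser and not the removed `normalize_rc_tags` shim; `RcTag` and `tags_from_list` are hypothetical names, and the input is assumed to be already decoded from JSON, with optional fields standing in for entries that lack a key or a glob.

```rust
use std::collections::HashMap;

/// Hypothetical, already-decoded list entry. Fields are optional so that a
/// malformed entry (missing `key` or `value_glob`) can be represented.
pub struct RcTag {
    pub key: Option<String>,
    pub value_glob: Option<String>,
}

/// Convert list-shape tags into the map shape. Any malformed entry rejects
/// the whole list instead of being skipped, so a rule is never silently
/// broadened by dropping one of its tag constraints.
pub fn tags_from_list(list: &[RcTag]) -> Result<HashMap<String, String>, String> {
    let mut map = HashMap::with_capacity(list.len());
    for (i, tag) in list.iter().enumerate() {
        match (&tag.key, &tag.value_glob) {
            (Some(k), Some(v)) => {
                map.insert(k.clone(), v.clone());
            }
            _ => return Err(format!("malformed tag entry at index {i}")),
        }
    }
    Ok(map)
}
```

Returning `Err` on the first bad entry mirrors the intent of `test_handler_malformed_tags_rejects_update`: a partially parsed tag list would match more spans than the rule author asked for.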

**Fail-closed behavior.** When libdatadog rejects an update (malformed tags, out-of-range rate), `process_config` returns `Err` so the RC dispatcher reports `apply_state=3` and the prior policy survives. Out-of-range RC `tracing_sampling_rate` (outside `[0.0, 1.0]`) and non-numeric values are rejected up-front.
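
The fail-closed property comes down to validating before mutating. The sketch below is illustrative only: `Policy`, `Sampler` and `apply_rc_rate` are hypothetical names, and only a global rate is modelled. It shows how an `Err` return leaves the previously installed policy untouched, which is what lets the dispatcher report `apply_state=3` without changing sampling.

```rust
/// Hypothetical sampling policy; only the global rate is modelled.
#[derive(Debug, Clone, PartialEq)]
pub struct Policy {
    pub rate: f64,
}

/// Hypothetical holder for the currently installed policy.
pub struct Sampler {
    current: Policy,
}

impl Sampler {
    pub fn new(initial: Policy) -> Self {
        Self { current: initial }
    }

    pub fn current(&self) -> &Policy {
        &self.current
    }

    /// Validate first, install second: when this returns `Err`, `current`
    /// has not been modified, so the prior policy survives.
    pub fn apply_rc_rate(&mut self, rc_rate: f64) -> Result<(), String> {
        if !rc_rate.is_finite() || !(0.0..=1.0).contains(&rc_rate) {
            return Err(format!(
                "tracing_sampling_rate {rc_rate} is outside [0.0, 1.0]"
            ));
        }
        self.current = Policy { rate: rc_rate };
        Ok(())
    }
}
```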

**Env/code rate validation.** `DD_TRACE_SAMPLE_RATE` (and the programmatic `set_trace_sample_rate`) get the same range check: only finite values in `[0.0, 1.0]` are honored; out-of-range values are logged and treated as unset rather than installed as a catch-all rule that libdatadog would clamp (negative ⇒ drop all, > 1.0 ⇒ keep all).
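
A range check of this kind can be sketched in a few lines. The function name `parse_sample_rate` is hypothetical, and treating unparsable input the same as out-of-range input is an assumption of this sketch. The real implementation also logs the rejected value, which is omitted here.

```rust
/// Parse a raw `DD_TRACE_SAMPLE_RATE` value.
/// Returns `None` when the value should be treated as unset.
/// Hypothetical helper; unparsable input is assumed to behave like
/// out-of-range input.
pub fn parse_sample_rate(raw: &str) -> Option<f64> {
    let rate: f64 = raw.trim().parse().ok()?;
    // `is_finite` excludes NaN and the infinities, which `str::parse`
    // accepts as valid `f64` literals.
    if rate.is_finite() && (0.0..=1.0).contains(&rate) {
        Some(rate)
    } else {
        // The PR logs here; the sketch only reports the value as unset.
        None
    }
}
```

Returning `Option<f64>` instead of clamping keeps a typo such as `15` (meant as `0.15`) from turning into "keep all".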

**Target check.** A config's `service_target` is honored before applying: a payload whose specific (non-`*`) `service`/`env` doesn't match this tracer — primary service or an advertised extra service, compared case-insensitively — is ignored, so a mistargeted RC delivery can never change this service's sampling. Mirrors dd-trace-py/go.
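
The target check can be pictured as two wildcard-aware comparisons. This is a sketch under assumptions, not the tracer's code: `ServiceTarget`, `field_matches` and `target_matches` are hypothetical names, and the case-insensitive comparison is assumed to be ASCII-only.

```rust
/// Hypothetical view of an RC payload's `service_target`.
pub struct ServiceTarget<'a> {
    pub service: &'a str,
    pub env: &'a str,
}

/// `*` matches anything; otherwise the target must equal one of the
/// candidates, ignoring ASCII case (an assumption of this sketch).
fn field_matches(target: &str, candidates: &[&str]) -> bool {
    target == "*" || candidates.iter().any(|c| c.eq_ignore_ascii_case(target))
}

/// Returns true when the payload is addressed to this tracer: the service
/// must match the primary service or an advertised extra service, and the
/// env must match the tracer's env.
pub fn target_matches(
    target: &ServiceTarget,
    primary_service: &str,
    extra_services: &[&str],
    env: &str,
) -> bool {
    let mut services = Vec::with_capacity(extra_services.len() + 1);
    services.push(primary_service);
    services.extend_from_slice(extra_services);
    field_matches(target.service, &services) && field_matches(target.env, &[env])
}
```

A payload for which `target_matches` returns false would simply be ignored, leaving the current sampling configuration in place.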

# Additional Notes

- Four parametric tests unblocked by this PR. Companion PR DataDog/system-tests#7007.
- Coordinated with @iunanua's PR #222 (libdatadog RC client wiring). #227 lands first; #222 rebases on top.



Co-authored-by: brian.marks <brian.marks@datadoghq.com>

4 participants