refactor(sampling)!: use sampling from libdatadog [APMSP-3021]#154
Conversation
61d3be1 to
d7f48f3
Compare
6326df5 to
ed96d8c
Compare
d7f48f3 to
90e61b2
Compare
ed96d8c to
f6d60d3
Compare
90e61b2 to
2a70bfc
Compare
ekump
left a comment
There was a problem hiding this comment.
a couple of non-blocking comments. LGTM
f6d60d3 to
6a653dc
Compare
2a70bfc to
5003a05
Compare
4235c71 to
2e27e2a
Compare
5003a05 to
c443b13
Compare
2e27e2a to
8338fa2
Compare
c443b13 to
de4f688
Compare
4593286 to
f3f7041
Compare
71bd063 to
d5c0f6e
Compare
5c949ee to
9f857ea
Compare
9f857ea to
e27e124
Compare
🎉 All green!🧪 All tests passed 🔗 Commit SHA: 0a4e479 | Docs | Datadog PR Page | Give us feedback! |
e27e124 to
263d58b
Compare
|
/merge |
|
View all feedbacks in Devflow UI.
This pull request is not mergeable according to GitHub. Common reasons include pending required checks, missing approvals, or merge conflicts — but it could also be blocked by other repository rules or settings.
devflow unqueued this merge request: It did not become mergeable within the expected time |
263d58b to
5504211
Compare
|
To prevent crates with unsupported msrv from slipping in, could we define in ? |
5504211 to
0a4e479
Compare
|
@iunanua I have no idea. If you know that |
) # What does this PR do? Fixes adaptive-sampling Remote Config in `datadog-opentelemetry`, rebuilt on top of PR #154. Also adds `DD_TRACE_SAMPLE_RATE` support. Before this PR, four things were broken: 1. **`tracing_sampling_rate` from RC was ignored.** The handler only acted on `tracing_sampling_rules`; a rate-only payload installed nothing. 2. **RC list-shape `tags` were rejected.** RC encodes `tags` as `[{"key": "env", "value_glob": "prod"}]`, but `libdd_sampling::SamplingRuleConfig::tags` only accepted the map shape, so the parse errored and the whole update was dropped. 3. **Env `DD_TRACE_SAMPLING_RULES` were wiped on every RC update.** `update_sampling_rules_from_remote` does a full override, so any RC change replaced env rules, even when RC only sent a global rate. 4. **`DD_TRACE_SAMPLE_RATE` had no effect.** No binding existed. # Motivation End-to-end adaptive sampling didn't work. # What changed **Composition.** `ApmTracingHandler::process_config` now follows the multi-source precedence model: | env rules | env `DD_TRACE_SAMPLE_RATE` | RC `tracing_sampling_rules` | RC `tracing_sampling_rate` | Effective rule chain | |---|---|---|---|---| | present | any | absent / null | absent / null | `env_rules` | | present | unset | absent / null | present | `env_rules + catch_all(rc_rate)` | | present | set | absent / null | absent / null | `env_rules + catch_all(env_rate)` | | present | set | absent / null | present | `env_rules + catch_all(rc_rate)` | | any | any | non-empty array | absent / null | `rc_rules + catch_all(env_rate)` if env_rate set | | any | any | non-empty array | present | `rc_rules + catch_all(rc_rate)` | The synthetic catch-all uses libdatadog's default provenance, mapping to DM `-3` (LOCAL_USER). See the "legacy behavior" comment in [test_trace_sampling_rules_override_rate](https://github.com/DataDog/system-tests/blob/main/tests/parametric/test_dynamic_configuration.py#L872). **`DD_TRACE_SAMPLE_RATE`.** New `Config::trace_sample_rate(): Option<f64>`. When set, the sampler installs an implicit catch-all so `DD_TRACE_RATE_LIMIT` applies. Unset means no catch-all (libdatadog's no-rule path samples at 100%). **Tag normalization.** RC encodes `tags` as the list shape `[{"key", "value_glob"}]`. This is parsed natively by `libdd-sampling` ≥ 2.1.0 (DataDog/libdatadog#2033), so this PR bumps `libdd-sampling` 1.0.0 → 2.1.0 (pulling `libdd-common` → 4.2.0) and no in-tracer normalization is needed. An earlier revision carried a `normalize_rc_tags` shim for this; it has been removed now that the upstream release is available. Regression coverage: `test_handler_rc_rules_with_list_tags_applied` (list-shape tags apply, tags preserved as a map) and `test_handler_malformed_tags_rejects_update` (malformed list entries still rejected wholesale, not silently broadened). **Fail-closed behavior.** When libdatadog rejects an update (malformed tags, out-of-range rate), `process_config` returns `Err` so the RC dispatcher reports `apply_state=3` and the prior policy survives. Out-of-range RC `tracing_sampling_rate` (outside `[0.0, 1.0]`) and non-numeric values are rejected up-front. **Env/code rate validation.** `DD_TRACE_SAMPLE_RATE` (and the programmatic `set_trace_sample_rate`) get the same range check: only finite values in `[0.0, 1.0]` are honored; out-of-range values are logged and treated as unset rather than installed as a catch-all rule that libdatadog would clamp (negative ⇒ drop all, > 1.0 ⇒ keep all). **Target check.** A config's `service_target` is honored before applying: a payload whose specific (non-`*`) `service`/`env` doesn't match this tracer — primary service or an advertised extra service, compared case-insensitively — is ignored, so a mistargeted RC delivery can never change this service's sampling. Mirrors dd-trace-py/go. # Additional Notes - Four parametric tests unblocked by this PR. Companion PR DataDog/system-tests#7007. - Coordinated with @iunanua's PR #222 (libdatadog RC client wiring). #227 lands first; #222 rebases on top. Co-authored-by: brian.marks <brian.marks@datadoghq.com>
Moves the sampling code into a separate crate, and uses the moved sampling code from
libdatadogDataDog/libdatadog#1927These are the benchmark numbers after DataDog/libdatadog#1977 landed.
OTel Sampling Benchmarks vs
mainAllocations
rule_all_spans_only_rateservice_rule_matchingservice_rule_not_matchingname_pattern_rule_matchingname_pattern_rule_not_matchingresource_pattern_rule_matchingresource_pattern_rule_not_matchingtag_rule_matchingtag_rule_not_matchingcomplex_rule_matchingcomplex_rule_partial_matchmultiple_rules_first_matchmultiple_rules_last_matchmany_attributesparent_sampled_short_circuitparent_not_sampled_short_circuitunicode_rule_matchingWall Time
rule_all_spans_only_rateservice_rule_matchingservice_rule_not_matchingname_pattern_rule_matchingname_pattern_rule_not_matchingresource_pattern_rule_matchingresource_pattern_rule_not_matchingtag_rule_matchingtag_rule_not_matchingcomplex_rule_matchingcomplex_rule_partial_matchmultiple_rules_first_matchmultiple_rules_last_matchmany_attributesparent_sampled_short_circuitparent_not_sampled_short_circuitunicode_rule_matching