Add per-decorator `sampling_ratio_override` parameter to `@mlflow.trace` by harupy · Pull Request #19784 · mlflow/mlflow

harupy · 2026-01-07T03:34:27Z

🛠 DevTools 🛠

Install mlflow from this PR

# mlflow
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/19784/merge
# mlflow-skinny
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/19784/merge#subdirectory=libs/skinny

For Databricks, use the following command:

%sh curl -LsSf https://raw.githubusercontent.com/mlflow/mlflow/HEAD/dev/install-skinny.sh | sh -s pull/19784/merge

Related Issues/PRs

#xxx

What changes are proposed in this pull request?

Add a sampling_ratio_override parameter to the @mlflow.trace decorator, allowing per-function control over trace sampling that overrides the global MLFLOW_TRACE_SAMPLING_RATIO setting.

@mlflow.trace(sampling_ratio_override=0.1)  # 10% sampling for high-volume
def high_volume_endpoint():
    ...

@mlflow.trace(sampling_ratio_override=1.0)  # 100% sampling for critical
def critical_transaction():
    ...

How it works:

A custom OTel sampler (_MlflowSampler) wraps TraceIdRatioBased and checks a _SAMPLING_RATIO_OVERRIDE context variable. The decorator sets this context var before span creation, and the sampler uses it instead of the default ratio. This keeps all sampling logic inside the OTel sampler and leverages OTel's native parent-child sampling inheritance.

Key behaviors:

sampling_ratio_override overrides the global MLFLOW_TRACE_SAMPLING_RATIO for that function
sampling_ratio_override=None (default) defers to the global sampler
Must be a float between 0.0 and 1.0

How is this PR tested?

Unit tests covering validation, sampling at 0%/50%/100%, nested functions, global override (subprocess), generators, and async functions

Does this PR require documentation update?

Release Notes

Is this a user-facing change?

Yes. Give a description of this change to be included in the release notes for MLflow users.

Add sampling_ratio_override parameter to @mlflow.trace decorator for per-function sampling control.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

area/tracing: MLflow Tracing features, tracing APIs, and LLM tracing functionality

How should the PR be classified in the release notes? Choose one:

rn/feature - A new user-facing feature worth mentioning in the release notes

Should this PR be included in the next patch release?

Yes (this PR will be cherry-picked and included in the next patch release)
No (this PR will be included in the next minor release)

🤖 Generated with Claude Code

Copilot

Pull request overview

This PR adds a sampling_ratio parameter to the @mlflow.trace decorator, enabling per-function control over trace sampling that overrides the global MLFLOW_TRACE_SAMPLING_RATIO environment variable. The feature allows different functions to have different sampling rates (e.g., 10% for high-volume endpoints, 100% for critical transactions).

Key changes:

Added sampling_ratio parameter to decorator and internal wrapping functions
Implemented _should_sample() function to determine sampling decisions
Added context variable _TRACE_SAMPLING_SKIPPED to propagate sampling decisions to nested calls
Created manual test script with comprehensive test scenarios

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File	Description
test_sampling_ratio.py	Manual test script demonstrating the new sampling_ratio feature with various scenarios (marked for removal before merging)
mlflow/tracing/fluent.py	Core implementation: adds sampling_ratio parameter to @mlflow.trace decorator, implements sampling decision logic, and propagates sampling state to nested calls

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

test_sampling_ratio.py

mlflow/tracing/fluent.py

github-actions · 2026-01-07T03:43:37Z

Documentation preview for df7908a is available at:

https://pr-19784--mlflow-docs-preview.netlify.app/docs/latest/

Changed Pages (1)

genai/tracing/prod-tracing (modified)

More info

Ignore this comment if this PR does not change the documentation.
The preview is updated when a new commit is pushed to this PR.
This comment was created by this workflow run.
The documentation was built by this workflow run.

test_sampling_ratio.py

mlflow/tracing/fluent.py

test_sampling_ratio.py

harupy · 2026-01-27T12:01:35Z

mlflow/tracing/provider.py

+_FORCE_SAMPLE = contextvars.ContextVar("force_sample", default=False)
+
+
+class _OverridableSampler(Sampler):


This is hacky, but could not find any other approach to bypass the global sampling via MLFLOW_TRACE_SAMPLING_RATIO

Does the user request supporting both global and endpoint-level sampling at the same time? Otherwise we can consider block using both together and simplify the logic?

Good point. Let me ask the reporter

Add `sampling_ratio` parameter to the `@mlflow.trace` decorator to allow per-function control over trace sampling, overriding the global `MLFLOW_TRACE_SAMPLING_RATIO` setting. Key behaviors: - Decorator value overrides global env var - Child spans follow parent's sampling decision - `sampling_ratio=None` (default) defers to global sampler Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

Move manual tests from test_sampling_ratio.py to the proper test location and convert them to pytest format following MLflow test conventions. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

Document the new sampling_ratio parameter on @mlflow.trace decorator in the production tracing docs, including use cases and key behaviors. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

harupy · 2026-02-02T07:45:40Z

Tagging a few more folks to have more eyes

harupy · 2026-02-02T09:48:06Z

/resolve address comments from @serena-ruan

🚀 Resolve workflow started

✅ Workflow completed successfully.

Claude Output

Perfect! I've successfully addressed all the PR review comments from @serena-ruan:

Summary of Changes

Renamed _force_sampler to _force_sample (line 82)
- Updated the function name for better consistency
- Updated all 4 call sites throughout the file
Renamed _should_sample to _should_skip_sample and inverted the logic (line 260)
- Changed the function name to better align with _TRACE_SAMPLING_SKIPPED and _skip_tracing()
- Inverted all boolean logic for consistency:
  - return False → return True (when skipping)
  - return True → return False (when not skipping)
  - random.random() < sampling_ratio → random.random() >= sampling_ratio
- Updated all 4 call sites to remove the not operator

The changes have been committed with DCO sign-off and all pre-commit hooks passed successfully. The commit is ready but has not been pushed as per the instructions.

🤖 Generated with Claude Code Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: mlflow-app[bot] <mlflow-app[bot]@users.noreply.github.com>

Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

The test_trace_decorator_sampling_ratio_overrides_global test runs a subprocess that imports full mlflow, which is not available when only the lightweight mlflow-tracing SDK is installed. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

mlflow/tracing/fluent.py

tests/tracing/test_fluent.py

Copilot · 2026-02-03T10:17:52Z

mlflow/tracing/sampling.py

+    def get_description(self) -> str:
+        return f"OverridableSampler(ratio={self._ratio_sampler.rate})"


_OverridableSampler.get_description() reaches into the underlying TraceIdRatioBased instance via self._ratio_sampler.rate. This depends on a non-public attribute/property that may not exist across opentelemetry-sdk versions and can raise at runtime when OTel calls get_description(). Prefer storing the ratio in _OverridableSampler at init time or delegating to self._ratio_sampler.get_description() to avoid relying on internal fields.

docs/docs/genai/tracing/prod-tracing.mdx

serena-ruan

LGTM! We can discuss the documentation expression offline :)

B-Step62

LGTM, #19784 (comment) is not a blocker, we can follow-up if they told us they don't need both.

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

- Fix bug where nested @mlflow.trace could re-sample when parent was not sampled by global sampler (NonRecordingSpan case) - Store ratio in _OverridableSampler to avoid accessing internal OTel attribute (_ratio_sampler.rate) Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

tests/tracing/test_fluent.py

mlflow/tracing/fluent.py

tests/tracing/test_fluent.py

Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

…io_override Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

Replace the previous approach (_FORCE_SAMPLE, _skip_tracing, _should_skip_sample, random.random()) with a cleaner design where the _MlflowSampler reads a _SAMPLING_RATIO_OVERRIDE context var and delegates to TraceIdRatioBased. The decorator just sets the context var and lets OTel handle sampling decisions natively. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

…ted first Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

harupy marked this pull request as ready for review January 7, 2026 03:34

Copilot AI review requested due to automatic review settings January 7, 2026 03:34

github-actions bot added area/tracing MLflow Tracing and its integrations rn/feature Mention under Features in Changelogs. labels Jan 7, 2026

harupy marked this pull request as draft January 7, 2026 03:35

Copilot started reviewing on behalf of harupy January 7, 2026 03:35 View session

Copilot AI reviewed Jan 7, 2026

View reviewed changes

B-Step62 added the v3.9.0 label Jan 14, 2026

harupy removed the v3.9.0 label Jan 15, 2026

harupy marked this pull request as ready for review January 15, 2026 06:16