mlflow.genai.evaluate(): handle case where root span is unavailable#19220
BenWilson2 merged 5 commits into mlflow:master
Add None check before accessing root span inputs/outputs to prevent AttributeError. Warn when traces are missing root spans but keep them in the dataset for downstream processing. Signed-off-by: dbczumar <corey.zumar@databricks.com>
    assert transformed_data["inputs"].isna().all()


def test_convert_to_eval_set_with_missing_root_span():
On master, this fails with:
Traceback (most recent call last):
  File "/Users/corey.zumar/mlflowrepos/mlflow4/find_jarvis_issues.py", line 112, in <module>
    main()
  File "/Users/corey.zumar/mlflowrepos/mlflow4/find_jarvis_issues.py", line 86, in main
    issues = find_issues(
  File "/Users/corey.zumar/mlflowrepos/mlflow4/insights.py", line 431, in find_issues
    mlflow.genai.evaluate(
  File "/Users/corey.zumar/mlflowrepos/mlflow4/mlflow/genai/evaluation/base.py", line 248, in evaluate
    df = _convert_to_eval_set(data)
  File "/Users/corey.zumar/mlflowrepos/mlflow4/mlflow/genai/evaluation/utils.py", line 121, in _convert_to_eval_set
    .pipe(_extract_request_response_from_trace)
  File "/Users/corey.zumar/miniconda3/envs/mlflow/lib/python3.10/site-packages/pandas/core/generic.py", line 6253, in pipe
    return common.pipe(self, func, *args, **kwargs)
  File "/Users/corey.zumar/miniconda3/envs/mlflow/lib/python3.10/site-packages/pandas/core/common.py", line 502, in pipe
    return func(obj, *args, **kwargs)
  File "/Users/corey.zumar/mlflowrepos/mlflow4/mlflow/genai/evaluation/utils.py", line 185, in _extract_request_response_from_trace
    df["inputs"] = df["trace"].apply(lambda trace: trace.data._get_root_span().inputs)
  File "/Users/corey.zumar/miniconda3/envs/mlflow/lib/python3.10/site-packages/pandas/core/series.py", line 4943, in apply
    ).apply()
  File "/Users/corey.zumar/miniconda3/envs/mlflow/lib/python3.10/site-packages/pandas/core/apply.py", line 1422, in apply
    return self.apply_standard()
  File "/Users/corey.zumar/miniconda3/envs/mlflow/lib/python3.10/site-packages/pandas/core/apply.py", line 1502, in apply_standard
    mapped = obj._map_values(
  File "/Users/corey.zumar/miniconda3/envs/mlflow/lib/python3.10/site-packages/pandas/core/base.py", line 925, in _map_values
    return algorithms.map_array(arr, mapper, na_action=na_action, convert=convert)
  File "/Users/corey.zumar/miniconda3/envs/mlflow/lib/python3.10/site-packages/pandas/core/algorithms.py", line 1743, in map_array
    return lib.map_infer(values, mapper, convert=convert)
  File "pandas/_libs/lib.pyx", line 2999, in pandas._libs.lib.map_infer
  File "/Users/corey.zumar/mlflowrepos/mlflow4/mlflow/genai/evaluation/utils.py", line 185, in <lambda>
    df["inputs"] = df["trace"].apply(lambda trace: trace.data._get_root_span().inputs)
AttributeError: 'NoneType' object has no attribute 'inputs'
Pull request overview
This PR adds defensive handling for cases where the root span is unavailable in mlflow.genai.evaluate(), preventing potential AttributeErrors when traces are fetched without spans (e.g., using search_traces(..., include_spans=False)).
- Introduces a _safe_extract_from_root_span helper function that safely checks for None root spans before accessing attributes
- Adds a warning when traces without root spans are detected to inform users about the issue
- Includes comprehensive test coverage for the missing root span scenario
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| mlflow/genai/evaluation/utils.py | Adds _safe_extract_from_root_span helper function to safely handle None root spans and logs warnings when traces lack root spans |
| tests/genai/evaluate/test_utils.py | Adds test case test_convert_to_eval_set_with_missing_root_span to verify correct handling of traces without root spans |
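
The None check described above can be sketched roughly as follows. This is an illustrative sketch only, not the code merged in this PR: the FakeSpan, FakeTraceData, and FakeTrace classes are hypothetical stand-ins for MLflow's trace objects, and safe_extract_from_root_span is a guess at the shape of the helper, assuming it takes a trace and an attribute name.

```python
from dataclasses import dataclass
from typing import Any, Optional


# Hypothetical stand-ins for MLflow trace objects (assumptions for illustration).
@dataclass
class FakeSpan:
    inputs: Any = None
    outputs: Any = None


@dataclass
class FakeTraceData:
    root: Optional[FakeSpan] = None

    def _get_root_span(self) -> Optional[FakeSpan]:
        # Returns None when the trace was loaded without spans.
        return self.root


@dataclass
class FakeTrace:
    data: FakeTraceData


def safe_extract_from_root_span(trace: FakeTrace, attribute: str) -> Any:
    """Return the named root span attribute, or None if there is no root span."""
    root_span = trace.data._get_root_span()
    if root_span is None:
        return None
    return getattr(root_span, attribute, None)


with_root = FakeTrace(FakeTraceData(FakeSpan(inputs={"question": "hi"}, outputs="hello")))
without_root = FakeTrace(FakeTraceData())

print(safe_extract_from_root_span(with_root, "inputs"))
print(safe_extract_from_root_span(without_root, "inputs"))
```

Returning None instead of raising keeps the row in the dataset, which matches the commit message's intent of leaving such traces available for downstream processing.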
mlflow/genai/evaluation/utils.py
Outdated
    )

    # Warn about traces that don't have a root span (where inputs/outputs are None)
    missing_count = df[["inputs", "outputs"]].isna().any(axis=1).sum()
The warning logic counts rows where either inputs or outputs is None, but the warning message suggests it's specifically about missing root spans. When a root span is missing, both inputs and outputs will be None. Consider using .all(axis=1) instead of .any(axis=1) to only count rows where both are None, which would more accurately identify traces without root spans:
missing_count = df[["inputs", "outputs"]].isna().all(axis=1).sum()
This would avoid false positives where a trace has a root span but only one of inputs or outputs is None.
Suggested change:
- missing_count = df[["inputs", "outputs"]].isna().any(axis=1).sum()
+ missing_count = df[["inputs", "outputs"]].isna().all(axis=1).sum()
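
The difference between the two expressions in this review comment can be checked on a small DataFrame. The data below is made up for illustration: one complete row, one row with both columns missing (as for a trace without a root span), and one row where only inputs is missing.

```python
import pandas as pd

# Made-up rows: complete, both missing, and only "inputs" missing.
df = pd.DataFrame(
    {
        "inputs": [{"question": "hi"}, None, None],
        "outputs": ["hello", None, "partial answer"],
    }
)

missing = df[["inputs", "outputs"]].isna()

# Counts rows where at least one of the two columns is missing.
any_count = int(missing.any(axis=1).sum())

# Counts only rows where both columns are missing.
all_count = int(missing.all(axis=1).sum())

print(any_count, all_count)
```

With .any(axis=1) the third row is also counted, even though it could come from a trace whose root span simply has no inputs.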
…lflow#19220) Signed-off-by: dbczumar <corey.zumar@databricks.com>
…19220) Signed-off-by: dbczumar <corey.zumar@databricks.com>
Related Issues/PRs
#xxx
What changes are proposed in this pull request?
mlflow.genai.evaluate(): handle case where root span is unavailable. This can happen for partial third-party OTel traces or if the user calls search_traces(include_spans=False) and then tries to pass the resulting traces to eval.
How is this PR tested?
Does this PR require documentation update?
Release Notes
Is this a user-facing change?
What component(s), interfaces, languages, and integrations does this PR affect?
Components
- area/tracking: Tracking Service, tracking client APIs, autologging
- area/models: MLmodel format, model serialization/deserialization, flavors
- area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
- area/scoring: MLflow Model server, model deployment tools, Spark UDFs
- area/evaluation: MLflow model evaluation features, evaluation metrics, and evaluation workflows
- area/gateway: MLflow AI Gateway client APIs, server, and third-party integrations
- area/prompts: MLflow prompt engineering features, prompt templates, and prompt management
- area/tracing: MLflow Tracing features, tracing APIs, and LLM tracing functionality
- area/projects: MLproject format, project running backends
- area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
- area/build: Build and test infrastructure for MLflow
- area/docs: MLflow documentation pages
How should the PR be classified in the release notes? Choose one:
- rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
- rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
- rn/feature - A new user-facing feature worth mentioning in the release notes
- rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
- rn/documentation - A user-facing documentation change worth mentioning in the release notes
Should this PR be included in the next patch release?
Yes should be selected for bug fixes, documentation updates, and other small changes. No should be selected for new features and larger changes. If you're unsure about the release classification of this PR, leave this unchecked to let the maintainers decide.
What is a minor/patch release?
Bug fixes, doc updates and new features usually go into minor releases.
Bug fixes and doc updates usually go into patch releases.