[ML-59816][[2/n] Add new fields to the genai_evaluation event telemetry#19040

Closed
xsh310 wants to merge 4 commits into mlflow:master from xsh310:stack/ML-59816-part-2

Conversation

@xsh310 xsh310 commented Nov 25, 2025

🥞 Stacked PR

Use this link to review incremental changes.


Related Issues/PRs

#xxx

What changes are proposed in this pull request?

How is this PR tested?

  • Existing unit/integration tests
  • New unit/integration tests
  • Manual tests

Does this PR require documentation update?

  • No. You can skip the rest of this section.
  • Yes. I've updated:
    • Examples
    • API references
    • Instructions

Release Notes

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/tracking: Tracking Service, tracking client APIs, autologging
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/evaluation: MLflow model evaluation features, evaluation metrics, and evaluation workflows
  • area/gateway: MLflow AI Gateway client APIs, server, and third-party integrations
  • area/prompts: MLflow prompt engineering features, prompt templates, and prompt management
  • area/tracing: MLflow Tracing features, tracing APIs, and LLM tracing functionality
  • area/projects: MLproject format, project running backends
  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages

How should the PR be classified in the release notes? Choose one:

  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

Should this PR be included in the next patch release?

Yes should be selected for bug fixes, documentation updates, and other small changes. No should be selected for new features and larger changes. If you're unsure about the release classification of this PR, leave this unchecked to let the maintainers decide.

What is a minor/patch release?
  • Minor release: a release that increments the second part of the version number (e.g., 1.2.0 -> 1.3.0).
    Bug fixes, doc updates and new features usually go into minor releases.
  • Patch release: a release that increments the third part of the version number (e.g., 1.2.0 -> 1.2.1).
    Bug fixes and doc updates usually go into patch releases.
  • Yes (this PR will be cherry-picked and included in the next patch release)
  • No (this PR will be included in the next minor release)

xsh310 force-pushed the stack/ML-59816-part-2 branch from 59ba0d5 to 63a5a08 on November 25, 2025 at 23:58
xsh310 changed the title from "[ML-59816] Add eval_data_size and eval_data_type fields to genai_evaluate event logging" to "[ML-59816][[2/n] Add new fields to the genai_evaluation event telemetry" on Nov 26, 2025
Signed-off-by: Xiang Shen <xshen.shc@gmail.com>
…_evaluate event logging

Signed-off-by: Xiang Shen <xshen.shc@gmail.com>
…uate event logging

Signed-off-by: Xiang Shen <xshen.shc@gmail.com>
…t logging

Signed-off-by: Xiang Shen <xshen.shc@gmail.com>
xsh310 force-pushed the stack/ML-59816-part-2 branch from 63a5a08 to 6f9390c on November 26, 2025 at 05:59
xsh310 requested a review from serena-ruan on November 26, 2025 at 07:45
This function is not thread-safe. Please do not use it in multi-threaded
environments.
"""
is_managed_dataset = isinstance(data, (EvaluationDataset, EntityEvaluationDataset))
Collaborator Author
#19018 (comment)

It's not efficient to convert data to df twice (line 246 and inside _run_harness), can you refactor?

@serena-ruan Do you think it's possible to move all the logic in the evaluate function that runs before _run_harness into _run_harness?

I want to make sure that

data = data if is_managed_dataset else df

happens inside _run_harness, so that we can preserve the original input type when logging the event.
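A minimal sketch of that ordering, assuming illustrative names throughout (the function body, eval_data_type, eval_data_size, and the plain-Python normalization are stand-ins, not MLflow internals):

```python
def run_harness(data):
    """Illustrative harness: capture telemetry fields from the raw input
    before any conversion, so the original input type is preserved for
    event logging. All names here are hypothetical."""
    # Record the caller-facing type before normalizing the input.
    eval_data_type = type(data).__name__

    # Normalize to a list of rows exactly once (stand-in for the
    # pandas-DataFrame conversion in the real code).
    rows = list(data)
    eval_data_size = len(rows)

    # ... evaluation would run on `rows` here ...
    return {"eval_data_type": eval_data_type, "eval_data_size": eval_data_size}
```

Because the type is read before the `data = data if is_managed_dataset else df` style normalization, the logged field reflects what the caller actually passed in.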

Collaborator
I move all the logic in the evaluate function before the _run_harness into _run_harness?

Yes you can do that, I don't think there's any difference

Collaborator Author
@serena-ruan I think one potential downside of moving the logic into _run_harness is that there might be a sudden increase in genai_evaluate event volume, since some of the invalid inputs that are currently early-returned would now log the genai_evaluate event.

# Check if records are loaded to avoid triggering an expensive load
if data.has_records():
    df = data.to_df()
    eval_data_size = len(df)
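A self-contained sketch of that guard, using a hypothetical stand-in class (LazyDataset is not the real MLflow EvaluationDataset; it only models the has_records check):

```python
class LazyDataset:
    """Hypothetical stand-in for a managed evaluation dataset whose
    records may not be loaded into memory yet."""

    def __init__(self, records=None):
        self._records = records

    def has_records(self):
        # True only when the records are already in memory.
        return self._records is not None

    def to_records(self):
        return list(self._records)


def eval_data_size_or_none(data):
    # Only compute the size when records are already loaded, so the
    # telemetry path never triggers an expensive remote fetch.
    if data.has_records():
        return len(data.to_records())
    return None
```

Returning None for an unloaded dataset keeps telemetry best-effort: the field is simply omitted rather than paid for.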
Collaborator Author
#19018 (comment)

Computing the dataset size by converting it to a dataframe is apparently too heavy. Reading the code, this data is converted to a pandas dataframe during evaluation anyway; can we calculate the dataset size after that?

@serena-ruan, I'm not sure whether my understanding is correct, but I don't think we will have access to the pandas df, since the dataset is converted in the middle of _run_harness and the event logging only has access to the input params of _run_harness?

I think one workaround is to add the df as another return value of _run_harness and use parse_result to populate these fields from the df.

Collaborator
You don't have to use @record_usage_event on the function; you can use _record_event to record the event with its parameters directly :) The decorator is just a simplified approach for general purposes.

We can collect all the needed parameters within _run_harness to avoid the duplicate data-size computation, then send the record once. While _record_event doesn't handle function exceptions the way record_usage_event does, feel free to wrap it in try/finally if necessary.
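A sketch of that suggestion with a hypothetical _record_event stub (the real telemetry client's signature and the harness body are assumptions; only the collect-once, try/finally pattern is the point):

```python
events = []


def _record_event(name, params):
    """Hypothetical stand-in for the telemetry client's _record_event."""
    events.append((name, dict(params)))


def run_harness(rows):
    """Collect all telemetry parameters inside the harness and send a
    single record, using try/finally so the event is still emitted
    (with a status) when evaluation raises."""
    params = {"eval_data_size": len(rows), "status": "unknown"}
    try:
        result = [r["score"] * 2 for r in rows]  # placeholder for real work
        params["status"] = "success"
        return result
    except Exception:
        params["status"] = "failure"
        raise
    finally:
        # Runs on both the success and failure paths: exactly one event.
        _record_event("genai_evaluate", params)
```

Because the event is emitted only here, the @record_usage_event decorator is dropped from the function, avoiding the duplicate-event problem discussed below.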

Collaborator Author
@serena-ruan Thanks for the explanation! A quick question though: since, per the logging spec, the goal is to log the new fields into genai_evaluate, if we add another _record_event call in the middle of _run_harness, will it create a duplicate genai_evaluate event, or is there a way to merge the params with the event created by @record_usage_event?

Collaborator
You should use either the decorator or _record_event, but not both; otherwise two events will be created :)

