Throw exception for session-level judge align API at the base judge level instead of built-in judge level by xsh310 · Pull Request #19045 · mlflow/mlflow

xsh310 · 2025-11-26T07:09:37Z

🥞 Stacked PR

Use this link to review incremental changes.

- stack/P0_builtin_judges_stack_ML_59674 [Files changed]
- stack/multi_turn_judge_align_handling_fix [Files changed]

What changes are proposed in this pull request?

Currently, the NotImplementedError for the align API on session-level judges is raised only from the newly created BuiltInScorers. The same exception should also be raised for session-level judges created via make_judge. This PR moves the check to the base judge class so that it covers both BuiltInScorer and InstructionsJudge instances created via make_judge.
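For reviewers, here is a minimal sketch of the idea, not the actual MLflow code. The classes below are toy stand-ins that only borrow the names used in this description; the `is_session_level` attribute, the `_align` hook, and the `align(traces)` signature are assumptions made for illustration. It shows how placing the guard in the base class makes every subclass inherit it.

```python
class Judge:
    """Toy stand-in for the base judge class (not MLflow's implementation)."""

    def __init__(self, name, is_session_level=False):
        self.name = name
        # Hypothetical flag: True when the judge evaluates a whole session.
        self.is_session_level = is_session_level

    def align(self, traces):
        # The guard lives in the base class, so every subclass inherits it.
        if self.is_session_level:
            raise NotImplementedError(
                f"Alignment is not supported for session-level judge '{self.name}'."
            )
        return self._align(traces)

    def _align(self, traces):
        # Placeholder for real alignment logic; returns the judge unchanged.
        return self


class BuiltInScorer(Judge):
    """Toy stand-in for a built-in judge."""


class InstructionsJudge(Judge):
    """Toy stand-in for a judge created via make_judge."""


if __name__ == "__main__":
    for cls in (BuiltInScorer, InstructionsJudge):
        judge = cls("example", is_session_level=True)
        try:
            judge.align(traces=[])
        except NotImplementedError as exc:
            print(f"{cls.__name__}: {exc}")
```

With the guard in one place, neither subclass needs its own override, and a judge type added later would get the same behavior by default.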

How is this PR tested?

Existing unit/integration tests
New unit/integration tests
Manual tests

Test Plan

Passed new unit test: test_session_level_scorer_alignment_raises_error

Does this PR require documentation update?

Release Notes

Is this a user-facing change?

No. You can skip the rest of this section.
Yes. Give a description of this change to be included in the release notes for MLflow users.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

How should the PR be classified in the release notes? Choose one:

rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
rn/feature - A new user-facing feature worth mentioning in the release notes
rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
rn/documentation - A user-facing documentation change worth mentioning in the release notes

Should this PR be included in the next patch release?

Yes should be selected for bug fixes, documentation updates, and other small changes. No should be selected for new features and larger changes. If you're unsure about the release classification of this PR, leave this unchecked to let the maintainers decide.

What is a minor/patch release?

Minor release: a release that increments the second part of the version number (e.g., 1.2.0 -> 1.3.0).
Bug fixes, doc updates and new features usually go into minor releases.
Patch release: a release that increments the third part of the version number (e.g., 1.2.0 -> 1.2.1).
Bug fixes and doc updates usually go into patch releases.

Yes (this PR will be cherry-picked and included in the next patch release)
No (this PR will be included in the next minor release)

Signed-off-by: Xiang Shen <xshen.shc@gmail.com>


github-actions · 2025-11-26T07:31:55Z

Documentation preview for 24ba45a is available at:

https://pr-19045--mlflow-docs-preview.netlify.app/docs/latest/

More info

Ignore this comment if this PR does not change the documentation.
The preview is updated when a new commit is pushed to this PR.
This comment was created by this workflow run.
The documentation was built by this workflow run.

serena-ruan

LGTM!

This was referenced Nov 26, 2025

[ML-59678] Create new session-level built-in judge ConversationCompleteness #18967

Merged

[ML-59674] Create new single turn built-in judge Completeness #18968

Merged

[ML-59305] Create new session-level builtin judge UserFrustration #18966

Merged

xsh310 requested review from AveshCSingh, B-Step62, alkispoly-db, serena-ruan and smoorjani November 26, 2025 07:17

xsh310 marked this pull request as ready for review November 26, 2025 07:18

xsh310 added 4 commits November 25, 2025 23:22

[ML-59674] Create new single turn built-in judge Completeness

6025a21

Signed-off-by: Xiang Shen <xshen.shc@gmail.com>

[ML-59674] Make single turn completeness judge prompt more concise

ae4bde0

Signed-off-by: Xiang Shen <xshen.shc@gmail.com>

[ML-59674] Fix nits for single-turn Completeness judge

c3d10bb

Signed-off-by: Xiang Shen <xshen.shc@gmail.com>

Throw exception for session-level judge align API at the base judge level instead of built-in judge level

24ba45a

Signed-off-by: Xiang Shen <xshen.shc@gmail.com>

xsh310 force-pushed the stack/multi_turn_judge_align_handling_fix branch from 26ccac3 to 24ba45a Compare November 26, 2025 07:23

github-actions bot added the v3.6.1, area/evaluation (MLflow Evaluation), and rn/none (List under Small Changes in Changelogs) labels Nov 26, 2025

smoorjani approved these changes Nov 26, 2025

View reviewed changes

serena-ruan approved these changes Nov 26, 2025

View reviewed changes

xsh310 enabled auto-merge November 26, 2025 08:23

xsh310 added this pull request to the merge queue Nov 26, 2025

Merged via the queue into mlflow:master with commit e3e7b89 Nov 26, 2025
69 of 75 checks passed

xsh310 deleted the stack/multi_turn_judge_align_handling_fix branch November 26, 2025 09:22

xsh310 merged 4 commits into mlflow:master from xsh310:stack/multi_turn_judge_align_handling_fix
