Skip to content

get_trace support for stores#18556

Merged
serena-ruan merged 9 commits intomlflow:masterfrom
serena-ruan:stack/get_trace_stores
Nov 19, 2025
Merged

get_trace support for stores#18556
serena-ruan merged 9 commits intomlflow:masterfrom
serena-ruan:stack/get_trace_stores

Conversation

@serena-ruan
Copy link
Collaborator

@serena-ruan serena-ruan commented Oct 28, 2025

🥞 Stacked PR

Use this link to review incremental changes.


Related Issues/PRs

#xxx

What changes are proposed in this pull request?

How is this PR tested?

  • Existing unit/integration tests
  • New unit/integration tests
  • Manual tests

Does this PR require documentation update?

  • No. You can skip the rest of this section.
  • Yes. I've updated:
    • Examples
    • API references
    • Instructions

Release Notes

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/tracking: Tracking Service, tracking client APIs, autologging
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/evaluation: MLflow model evaluation features, evaluation metrics, and evaluation workflows
  • area/gateway: MLflow AI Gateway client APIs, server, and third-party integrations
  • area/prompts: MLflow prompt engineering features, prompt templates, and prompt management
  • area/tracing: MLflow Tracing features, tracing APIs, and LLM tracing functionality
  • area/projects: MLproject format, project running backends
  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages

How should the PR be classified in the release notes? Choose one:

  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

Should this PR be included in the next patch release?

Yes should be selected for bug fixes, documentation updates, and other small changes. No should be selected for new features and larger changes. If you're unsure about the release classification of this PR, leave this unchecked to let the maintainers decide.

What is a minor/patch release?
  • Minor release: a release that increments the second part of the version number (e.g., 1.2.0 -> 1.3.0).
    Bug fixes, doc updates and new features usually go into minor releases.
  • Patch release: a release that increments the third part of the version number (e.g., 1.2.0 -> 1.2.1).
    Bug fixes and doc updates usually go into patch releases.
  • Yes (this PR will be cherry-picked and included in the next patch release)
  • No (this PR will be included in the next minor release)

@serena-ruan serena-ruan mentioned this pull request Oct 28, 2025
29 tasks
@serena-ruan serena-ruan force-pushed the stack/get_trace_stores branch 2 times, most recently from 6743be6 to 51e815f Compare November 10, 2025 06:40
@serena-ruan serena-ruan marked this pull request as ready for review November 10, 2025 06:43
@github-actions github-actions bot added the rn/none List under Small Changes in Changelogs. label Nov 10, 2025
@github-actions
Copy link
Contributor

github-actions bot commented Nov 10, 2025

Documentation preview for 5d8cc32 is available at:

More info
  • Ignore this comment if this PR does not change the documentation.
  • The preview is updated when a new commit is pushed to this PR.
  • This comment was created by this workflow run.
  • The documentation was built by this workflow run.

@serena-ruan serena-ruan force-pushed the stack/get_trace_stores branch 2 times, most recently from 41211da to e023a53 Compare November 11, 2025 10:22
@serena-ruan serena-ruan mentioned this pull request Nov 11, 2025
29 tasks
// Whether to allow partial traces. Default to True.
optional bool allow_partial = 2 [default = true];;
// Whether to allow partial traces. Default to False.
optional bool allow_partial = 2 [default = false];;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updating the default value to False since after second thought I think that the only case that users may want to fetch in-progress trace is through MLflow UI. There's no specific reason user may want to get in-progress traces using SDK, and mlflow.get_trace is depended by lots of places like eval so we should retry to fetch the trace by default.

@harupy
Copy link
Member

harupy commented Nov 12, 2025

Running into a network issue. Cannot join the standup.

@serena-ruan
Copy link
Collaborator Author

serena-ruan commented Nov 12, 2025

Running into a network issue. Cannot join the standup

@harupy Haha we're not having standup today, you can leave async updates in slack afterwards

Signed-off-by: Serena Ruan <serena.rxy@gmail.com>
Signed-off-by: Serena Ruan <serena.rxy@gmail.com>
Signed-off-by: Serena Ruan <serena.rxy@gmail.com>
@serena-ruan serena-ruan force-pushed the stack/get_trace_stores branch from 7edfb9e to f074bcd Compare November 13, 2025 03:38
Signed-off-by: Serena Ruan <serena.rxy@gmail.com>
Signed-off-by: Serena Ruan <serena.rxy@gmail.com>
@harupy
Copy link
Member

harupy commented Nov 13, 2025

/review


🚀 Review workflow started


✅ Review completed.

Review Output

PR Review Complete

I've reviewed PR #18556 "get_trace support for stores" and found 6 issues that need attention:

Critical Issues (Bugs):

  1. Missing time import in sqlalchemy_store.py (line 3548) - This will cause a runtime NameError

  2. Incorrect return type annotation (sqlalchemy_store.py:3542) - The function is annotated as Trace | None but should be Trace since it raises an exception instead of returning None for missing traces

  3. Potential bug with None return (sqlalchemy_store.py:3542-3550) - When allow_partial=False and the trace isn't fully exported after retries, the function returns None, which will cause an exception when the handler tries to call .to_proto() on it

Style Guide Violations:

  1. Try-catch scope too broad (handlers.py:3167-3169) - The safe operation trace_data = trace.data.to_dict() should be moved outside the try block or into an else block

  2. Try-catch scope too broad (handlers.py:3176-3183) - Similar issue with exception handling scope

  3. Mock setup pattern (test_handlers.py:2006+) - While this is a minor style preference and may not be applicable due to fixture usage, the guide recommends setting return_value in the patch declaration when possible

The most critical issues are #1 (missing import) and #3 (None return bug) as they will cause runtime failures. I've added detailed review comments with suggestions for each issue on the PR.

Signed-off-by: Serena Ruan <serena.rxy@gmail.com>
Signed-off-by: Serena Ruan <serena.rxy@gmail.com>
@serena-ruan serena-ruan requested a review from harupy November 18, 2025 07:05
Copy link
Member

@harupy harupy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few more comments, otherwise LGTM :)

Signed-off-by: Serena Ruan <serena.rxy@gmail.com>
Signed-off-by: Serena Ruan <serena.rxy@gmail.com>
Comment on lines +3167 to +3187
# allow partial so the front end can render in-progress traces
trace = _get_tracking_store().get_trace(request_id, allow_partial=True)
except MlflowTracingException:
pass
except MlflowNotImplementedException:
# default to batch get traces if get_trace is not implemented
try:
traces = _get_tracking_store().batch_get_traces([request_id], None)
# For stores that don't support batch get traces, or if the trace data is not stored in the
# tracking store, we need to get the trace data from the artifact repository.
except (MlflowTracingException, MlflowNotImplementedException):
pass
else:
if len(traces) != 1:
raise MlflowException(
f"Trace with id={request_id} not found.",
error_code=RESOURCE_DOES_NOT_EXIST,
)
trace_data = traces[0].data.to_dict()
else:
trace_data = trace.data.to_dict()
Copy link
Member

@harupy harupy Nov 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A helper function doesn't help simplify the code?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Emm we can move this part to a helper function, but I don't think it'll be simpler.

@serena-ruan serena-ruan added this pull request to the merge queue Nov 19, 2025
Merged via the queue into mlflow:master with commit 36649f4 Nov 19, 2025
77 of 79 checks passed
@serena-ruan serena-ruan deleted the stack/get_trace_stores branch November 19, 2025 06:06
mprahl pushed a commit to opendatahub-io/mlflow that referenced this pull request Nov 21, 2025
Signed-off-by: Serena Ruan <serena.rxy@gmail.com>
Tian-Sky-Lan pushed a commit to Tian-Sky-Lan/mlflow that referenced this pull request Nov 24, 2025
Signed-off-by: Serena Ruan <serena.rxy@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

rn/none List under Small Changes in Changelogs. team-review Trigger a team review request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants