[MLflow Demo] Base implementation for demo framework #19994

Merged
BenWilson2 merged 1 commit into mlflow:master from BenWilson2:stack/demo/scaffold
Jan 26, 2026

Conversation

Member

@BenWilson2 BenWilson2 commented Jan 14, 2026

🥞 Stacked PR

Use this link to review incremental changes.


Related Issues/PRs

#xxx

What changes are proposed in this pull request?

Adds the scaffolding framework for MLflow in-product demos.
Going with a template-based ABC approach here to make additions, modifications, and fixes more straightforward when maintaining and extending these demos.
CI configuration with the first demo data generation (for traces) is added in #19995

How is this PR tested?

  • Existing unit/integration tests
  • New unit/integration tests
  • Manual tests

Does this PR require documentation update?

  • No. You can skip the rest of this section.
  • Yes. I've updated:
    • Examples
    • API references
    • Instructions

Release Notes

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/tracking: Tracking Service, tracking client APIs, autologging
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/evaluation: MLflow model evaluation features, evaluation metrics, and evaluation workflows
  • area/gateway: MLflow AI Gateway client APIs, server, and third-party integrations
  • area/prompts: MLflow prompt engineering features, prompt templates, and prompt management
  • area/tracing: MLflow Tracing features, tracing APIs, and LLM tracing functionality
  • area/projects: MLproject format, project running backends
  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages

How should the PR be classified in the release notes? Choose one:

  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

Should this PR be included in the next patch release?

Yes should be selected for bug fixes, documentation updates, and other small changes. No should be selected for new features and larger changes. If you're unsure about the release classification of this PR, leave this unchecked to let the maintainers decide.

What is a minor/patch release?
  • Minor release: a release that increments the second part of the version number (e.g., 1.2.0 -> 1.3.0).
    Bug fixes, doc updates and new features usually go into minor releases.
  • Patch release: a release that increments the third part of the version number (e.g., 1.2.0 -> 1.2.1).
    Bug fixes and doc updates usually go into patch releases.
  • Yes (this PR will be cherry-picked and included in the next patch release)
  • No (this PR will be included in the next minor release)

Copilot AI review requested due to automatic review settings January 14, 2026 21:55
Contributor

Copilot AI left a comment

Pull request overview

This PR introduces a base implementation for a demo framework that allows generating demo data for MLflow features. The framework provides a registry pattern for demo generators with versioning support to automatically regenerate demo data when the schema changes.

Changes:

  • Adds base classes (BaseDemoGenerator, DemoResult) for implementing demo data generators
  • Implements a registry pattern (DemoRegistry) for managing multiple demo generators
  • Includes version tracking to handle demo data migration on updates

Reviewed changes

Copilot reviewed 8 out of 10 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `mlflow/demo/base.py` | Core framework with abstract base class for generators, dataclass for results, and version management |
| `mlflow/demo/registry.py` | Registry implementation for discovering and managing demo generators |
| `mlflow/demo/__init__.py` | Public API with `generate_all_demos()` function |
| `mlflow/demo/README.md` | Documentation on design principles, creating generators, and versioning |
| `mlflow/demo/generators/__init__.py` | Empty module for future generator implementations |
| `tests/demo/conftest.py` | Test fixtures with stub generator implementations |
| `tests/demo/test_base.py` | Tests for base class validation, versioning, and data existence checks |
| `tests/demo/test_registry.py` | Tests for registry operations (register, get, list, contains) |
| `tests/demo/test_generate.py` | Tests for the main generation flow with version management |
| `tests/demo/__init__.py` | Empty test module initialization |
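
As a rough illustration of the registry operations the test summary lists (register, get, list, contains), here is a minimal sketch. It is not the code from this PR: the `name` class attribute, the `list_generators` method name, and the duplicate-registration error are all assumptions made for the example.

```python
class DemoRegistry:
    """Hypothetical sketch: maps a generator's name to its class."""

    def __init__(self) -> None:
        self._generators: dict[str, type] = {}

    def register(self, generator_cls: type) -> type:
        # Assumption: each generator class exposes a unique `name` attribute.
        name = generator_cls.name
        if name in self._generators:
            raise ValueError(f"Generator {name!r} is already registered")
        self._generators[name] = generator_cls
        # Returning the class lets register() also be used as a decorator.
        return generator_cls

    def get(self, name: str) -> type:
        if name not in self._generators:
            raise KeyError(f"No demo generator registered as {name!r}")
        return self._generators[name]

    def list_generators(self) -> list[str]:
        # Sorted so the listing is stable regardless of registration order.
        return sorted(self._generators)

    def __contains__(self, name: str) -> bool:
        return name in self._generators


registry = DemoRegistry()


@registry.register
class TracesDemo:  # stand-in; a real generator would subclass the base class
    name = "traces"
```

With this shape, `"traces" in registry` is true and `registry.get("traces")` returns the `TracesDemo` class; registering the same name twice raises `ValueError`.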

@github-actions github-actions bot added the area/tracking (Tracking service, tracking client APIs, autologging) and rn/feature (Mention under Features in Changelogs.) labels Jan 14, 2026
Contributor

github-actions bot commented Jan 14, 2026

Documentation preview for 6ced72e is available at:

More info
  • Ignore this comment if this PR does not change the documentation.
  • The preview is updated when a new commit is pushed to this PR.
  • This comment was created by this workflow run.
  • The documentation was built by this workflow run.

@BenWilson2 BenWilson2 added the team-review (Trigger a team review request) label Jan 14, 2026
@github-actions github-actions bot requested review from TomeHirata and harupy January 14, 2026 22:35
@BenWilson2 BenWilson2 requested a review from B-Step62 January 15, 2026 01:51

## Design Principles

1. **Auto-generated on startup** - Demo data is created when `mlflow server` starts, requiring no user action.
Member

@harupy harupy Jan 15, 2026

I think users don't want this in prod. We could provide an option to disable demo data generation but that's inconvenient since they have to set it.

Collaborator

+1, I thought we would introduce a UI or CLI hook like `mlflow demo generate` instead of generating demo data by default.

Member

@harupy harupy Jan 15, 2026

The worst-case scenario is that the data generation step has a bug and blocks users.

Member Author

Yep, agreed. I'm going to update that internal README with the two entry-point paths (CLI and UI-based AJAX call). It will also explain why we want versioning: existing users who generate demo data in their tracking server and then upgrade to a newer version of the tracking server could otherwise end up with a broken demo experience.

navigation_url: URL path to navigate to view the demo data in the UI.
"""

feature: str
Collaborator

Can we define an enum for the feature field?

Member Author

good call on keeping this cleaner

"""

feature: str
entities_created: list[str]
Collaborator

Do we want to return an identifier instead of the actual entity? Then can we rename this field?

Member Author

yep! Changed to entity_ids to be more accurate.
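
A minimal sketch of what the result type might look like after the two suggestions in these threads (an enum for `feature`, and `entity_ids` in place of `entities_created`). The enum name `DemoFeature`, its members, the field defaults, and the sample URL are assumptions for illustration, not the merged code.

```python
from dataclasses import dataclass, field
from enum import Enum


class DemoFeature(str, Enum):
    """Hypothetical enum; the member list is illustrative only."""

    TRACES = "traces"
    EVALUATION = "evaluation"
    PROMPTS = "prompts"


@dataclass
class DemoResult:
    # Which feature this demo data belongs to (an enum instead of a bare str).
    feature: DemoFeature
    # Identifiers of the created entities, not the entities themselves.
    entity_ids: list[str] = field(default_factory=list)
    # URL path to navigate to view the demo data in the UI.
    navigation_url: str = ""


result = DemoResult(
    feature=DemoFeature.TRACES,
    entity_ids=["trace-1", "trace-2"],  # made-up IDs
    navigation_url="/experiments/1/traces",  # made-up path
)
```

An enum makes a misspelled feature fail fast: `DemoFeature("typo")` raises `ValueError`, whereas a plain string would pass through silently.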

from mlflow.demo.base import DEMO_EXPERIMENT_NAME, DEMO_PROMPT_PREFIX
```

## Versioning
Member

Do we really need this? Can you explain a scenario where this is useful?

Member Author

Added some explanation to the README for why we will likely want this functionality.

@BenWilson2 BenWilson2 changed the title Base implementation for demo framework [MLflow Demo] Base implementation for demo framework Jan 16, 2026
Comment on lines +70 to +77
### When to Bump Version

Bump the version when making changes to demo data that require regeneration:

- Changing the structure of generated traces/spans
- Adding new required fields to assessments or evaluations
- Modifying prompt templates
- Any change that makes old demo data incompatible with the current UI
Member

Suppose we have a typo in the demo data. Do we need to bump the version to fix the typo?

Member

@harupy harupy Jan 16, 2026

It appears we always need to bump the version to refresh the demo data on a user's machine.

Member Author

I think we should, to ensure that if the demo data is already on a running tracking server, we hot reload only on version mismatches. I think this is important because of the latency involved in generating trace linkages (writing association table mappings for trace linking takes several seconds), and because forcing a reload of the contents violates the goal of idempotency in the data generation.
Hopefully we won't have typos though, since lint rules are pretty solid.
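
The regenerate-only-on-mismatch behavior described in this thread could look roughly like the following. This is a sketch under assumptions, not the PR's implementation: `InMemoryStore` stands in for wherever the stored version actually lives, and `ensure_demo` is a hypothetical helper name.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryStore:
    """Stand-in for the tracking server state (assumption for this sketch)."""

    versions: dict = field(default_factory=dict)
    generate_calls: int = 0


def ensure_demo(store: InMemoryStore, name: str, current_version: int) -> bool:
    """Generate demo data only if it is missing or stale.

    Returns True when data was (re)generated, False when nothing was done.
    """
    if store.versions.get(name) == current_version:
        # Same version already present: do nothing, which keeps calls idempotent.
        return False
    # Missing or outdated: a real generator would delete the old data here,
    # then write fresh data. The counter just records that work happened.
    store.generate_calls += 1
    store.versions[name] = current_version
    return True


store = InMemoryStore()
first = ensure_demo(store, "traces", 1)   # no data yet, so it generates
second = ensure_demo(store, "traces", 1)  # same version, so it skips
third = ensure_demo(store, "traces", 2)   # version bumped, so it regenerates
```

Under this scheme a typo fix in the demo data reaches existing servers only when the version is bumped, which is the trade-off raised in the review.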

def _data_exists(self) -> bool:
"""Check if demo data exists (regardless of version)."""

def delete_demo(self) -> None:
Collaborator

@TomeHirata TomeHirata Jan 19, 2026

Should we use `@abstractmethod` here too?

Member Author

I didn't want to force demos that don't need direct cleanup of data (transitive demos that might do something with data that another demo generates to showcase functionality) to have to create a concrete implementation that is a no-op. It's purely to reduce boilerplate.
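
To illustrate the design choice, here is a minimal sketch of an ABC where `delete_demo` is a concrete no-op while other methods stay abstract. The `generate` method, its return type, and the `TransitiveDemo` subclass are assumptions for the example; only `_data_exists` and `delete_demo` appear in the diff excerpt above.

```python
from abc import ABC, abstractmethod


class BaseDemoGenerator(ABC):
    @abstractmethod
    def generate(self) -> list[str]:
        """Create demo data (hypothetical signature)."""

    @abstractmethod
    def _data_exists(self) -> bool:
        """Check if demo data exists (regardless of version)."""

    def delete_demo(self) -> None:
        """Default no-op: subclasses override only if they own data."""
        return None


class TransitiveDemo(BaseDemoGenerator):
    """A demo that reuses another demo's data and has nothing to clean up."""

    def generate(self) -> list[str]:
        return []

    def _data_exists(self) -> bool:
        return True

    # No delete_demo override needed; the inherited no-op applies.


demo = TransitiveDemo()
demo.delete_demo()  # safe to call, does nothing
```

Had `delete_demo` been abstract, `TransitiveDemo` could not be instantiated without adding an empty override, which is the boilerplate the author wanted to avoid.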

- Modifying prompt templates
- Any change that makes old demo data incompatible with the current UI

## Creating a New Generator
Collaborator

Do we expect users to implement a custom demo generator?

Member Author

Not at all. The README is guidance for maintainers and contributors on how to add new demos. Added statements to this doc to make that clear.

- Creates a temporary, self-contained environment (SQLite in temp directory)
- Generates demo data automatically on startup
- Opens browser directly to the MLflow Demo experiment
- Auto-cleanup on exit
Member

@harupy harupy Jan 19, 2026

Is there a reason we don't cache the generated data? This would be painful if the demo data generation is slow (e.g., 10 seconds).

Member Author

The full demo data takes less than 1s to generate. It is faster than most loading spinners for other pages that have even modest amounts of data.
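
The "temporary, self-contained environment (SQLite in temp directory)" with "auto-cleanup on exit" from the README excerpt can be pictured with the standard library alone. This is a sketch, not the PR's code: the helper name `temporary_demo_backend`, the file name, and the marker table are hypothetical.

```python
import os
import sqlite3
import tempfile
from contextlib import closing, contextmanager


@contextmanager
def temporary_demo_backend():
    """Yield the path of a throwaway SQLite file; remove it on exit."""
    with tempfile.TemporaryDirectory(prefix="mlflow-demo-") as tmp_dir:
        db_path = os.path.join(tmp_dir, "demo.db")
        # Write something so the database file exists on disk.
        with closing(sqlite3.connect(db_path)) as conn:
            conn.execute("CREATE TABLE demo_marker (id INTEGER)")
            conn.commit()
        # A server could be pointed at this file via a sqlite:/// URI.
        yield db_path
    # Leaving the `with` block deletes tmp_dir and everything in it.


with temporary_demo_backend() as db_path:
    existed_during = os.path.exists(db_path)
existed_after = os.path.exists(db_path)
```

Because nothing persists after exit, each launch regenerates the data, which is why the sub-second generation time mentioned above matters.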

Comment on lines +31 to +33
### 2. Launch Demo Button (Home Page)

For users who start `mlflow server` normally:
Member

Do we need another entrypoint?

Member Author

For testing out the functionality in corporate environments where the server and associated commands are inaccessible to users, having the ability to generate this data silently from within the UI is critical.

from dataclasses import dataclass
from enum import Enum

DEMO_EXPERIMENT_NAME = "MLflow Demo"
Member

@harupy harupy Jan 19, 2026

Is this experiment deletable? What if a user accidentally removes it and wants to restore, or a user deletes it after trying the demo, then another user attempts to do the demo?

Member Author

Yes, updated the README with all pertinent details

@BenWilson2 BenWilson2 force-pushed the stack/demo/scaffold branch 2 times, most recently from 1a722f7 to ba7f215 January 22, 2026 17:19
@BenWilson2 BenWilson2 mentioned this pull request Jan 23, 2026
29 tasks
Comment on lines +271 to +277
### Test Structure

```
tests/demo/
├── conftest.py # Fixtures (tracking_uri for isolated environments)
├── test_base.py # BaseDemoGenerator tests
├── test_registry.py # DemoRegistry tests
Member

Let's remove this. Agents can figure it out without this.

Member

@harupy harupy Jan 23, 2026

I think this README.md is too detailed. That makes it easy for it to become outdated and hard to keep in sync with the codebase. Let's keep only the essentials.

Member

@harupy harupy left a comment

LGTM

@BenWilson2 BenWilson2 mentioned this pull request Jan 23, 2026
29 tasks
Signed-off-by: Ben Wilson <benjamin.wilson@databricks.com>
@BenWilson2 BenWilson2 added this pull request to the merge queue Jan 26, 2026
Merged via the queue into mlflow:master with commit 71495e5 Jan 26, 2026
52 checks passed
@BenWilson2 BenWilson2 deleted the stack/demo/scaffold branch January 26, 2026 14:59