Skip to content

Fix the demo generation and deletion when workspaces are enabled#20733

Merged
B-Step62 merged 2 commits intomlflow:masterfrom
mprahl:workspace-demo
Feb 14, 2026
Merged

Fix the demo generation and deletion when workspaces are enabled#20733
B-Step62 merged 2 commits intomlflow:masterfrom
mprahl:workspace-demo

Conversation

@mprahl
Copy link
Collaborator

@mprahl mprahl commented Feb 11, 2026

Related Issues/PRs

#xxx

What changes are proposed in this pull request?

  • Add workspace awareness to the generate demo endpoint. The issue is that some threads were created which didn't keep the workspace context var. By using the WorkspaceContext context manager, it sets the workspace env var temporarily and then unsets it.
  • Add workspace awareness to the Settings page since now that this page has a button to clear the demo data, it needs to know the workspace it's in.

How is this PR tested?

  • Existing unit/integration tests
  • New unit/integration tests
  • Manual tests

Does this PR require documentation update?

  • No. You can skip the rest of this section.
  • Yes. I've updated:
    • Examples
    • API references
    • Instructions

Does this PR require updating the MLflow Skills repository?

  • No. You can skip the rest of this section.
  • Yes. Please link the corresponding PR or explain how you plan to update it.

Release Notes

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/tracking: Tracking Service, tracking client APIs, autologging
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/evaluation: MLflow model evaluation features, evaluation metrics, and evaluation workflows
  • area/gateway: MLflow AI Gateway client APIs, server, and third-party integrations
  • area/prompts: MLflow prompt engineering features, prompt templates, and prompt management
  • area/tracing: MLflow Tracing features, tracing APIs, and LLM tracing functionality
  • area/projects: MLproject format, project running backends
  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages

How should the PR be classified in the release notes? Choose one:

  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

Should this PR be included in the next patch release?

Yes should be selected for bug fixes, documentation updates, and other small changes. No should be selected for new features and larger changes. If you're unsure about the release classification of this PR, leave this unchecked to let the maintainers decide.

What is a minor/patch release?
  • Minor release: a release that increments the second part of the version number (e.g., 1.2.0 -> 1.3.0).
    Bug fixes, doc updates and new features usually go into minor releases.
  • Patch release: a release that increments the third part of the version number (e.g., 1.2.0 -> 1.2.1).
    Bug fixes and doc updates usually go into patch releases.
  • Yes (this PR will be cherry-picked and included in the next patch release)
  • No (this PR will be included in the next minor release)

Copilot AI review requested due to automatic review settings February 11, 2026 21:13
@github-actions
Copy link
Contributor

🛠 DevTools 🛠

Install mlflow from this PR

# mlflow
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/20733/merge
# mlflow-skinny
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/20733/merge#subdirectory=libs/skinny

For Databricks, use the following command:

%sh curl -LsSf https://raw.githubusercontent.com/mlflow/mlflow/HEAD/dev/install-skinny.sh | sh -s pull/20733/merge

@github-actions github-actions bot added area/tracking Tracking service, tracking client APIs, autologging rn/none List under Small Changes in Changelogs. labels Feb 11, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR makes demo generation/deletion and the Settings page workspace-aware so that demo operations correctly target the active workspace even when child threads are spawned during demo generation.

Changes:

  • Wrap generate_all_demos() in WorkspaceContext(get_request_workspace()) and add a regression test ensuring workspace propagation to child threads.
  • Make /settings workspace-scoped in routing utilities and navigation (no longer treated as a global route).
  • Update Settings “Clear all demo data” action to use fetchEndpointRaw (so workspace headers are included) and add a UI test.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
mlflow/demo/__init__.py Wrap demo generation in WorkspaceContext to propagate workspace to child threads.
tests/demo/test_generate.py Add test verifying workspace visibility inside threads spawned by demo generators.
mlflow/server/js/src/workspaces/utils/WorkspaceUtils.ts Treat settings as workspace-scoped by removing it from “always global” routes.
mlflow/server/js/src/workspaces/utils/WorkspaceUtils.test.ts Update routing tests to reflect settings being workspace-scoped.
mlflow/server/js/src/shared/web-shared/model-trace-explorer/RoutingUtils.tsx Align shared routing logic so settings is not treated as always-global.
mlflow/server/js/src/shared/web-shared/genai-traces-table/utils/RoutingUtils.tsx Align shared routing logic so settings is not treated as always-global.
mlflow/server/js/src/settings/SettingsPage.tsx Use fetchEndpointRaw for demo delete so workspace header is included.
mlflow/server/js/src/settings/SettingsPage.test.tsx Add test asserting demo delete uses fetchEndpointRaw with correct endpoint/method.
mlflow/server/js/src/experiment-tracking/route-defs.ts Remove the globalRoute designation for settings.
mlflow/server/js/src/common/components/MlflowSidebar.tsx Allow workspace prefixing for the Settings link.

Comment on lines +30 to +35
# Propagate the workspace to the environment so that child threads spawned during
# demo generation (e.g. by the evaluation harness's ThreadPoolExecutor) can resolve
# the active workspace via the MLFLOW_WORKSPACE env-var fallback. The ContextVar
# set by the server middleware is thread-local and is invisible to new threads.
with WorkspaceContext(get_request_workspace()):
for name in generator_names:
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WorkspaceContext mutates the process-wide MLFLOW_WORKSPACE env var for the entire duration of demo generation. Since generate_all_demos() is used by the /mlflow/demo/generate handler and demo generation can be long-running, concurrent requests in the same process could observe the wrong workspace and create/delete demo data in an unintended workspace. Consider avoiding env-var mutation for request handling (e.g., propagate the ContextVar into spawned threads via contextvars.copy_context() / executor wrappers, or serialize demo generation with a lock/queue so only one workspace can be active at a time in-process).

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The env var race isn't a real concern as uvicorn workers are separate processes with isolated environments. Within a single worker, requests are handled sequentially so the with WorkspaceContext(...) block completes before another request can mutate the env var.

contextvars.copy_context() wouldn't help here either, since ThreadPoolExecutor.submit() doesn't propagate the calling context into worker threads. That's the exact problem this change solves. You'd need to wrap every executor.submit() call across all downstream code (evaluation harness, etc.), which is far more invasive than the single-line WorkspaceContext wrapper.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that context var doesn't help here. But I think we still have concurrency problem when FastAPI mounts the Flask handler - it runs the sync handler within a thread pool. Shall we add a simple lock mechanism here to prevent demo experiment is generated concurrently within a single process? It should not affect user experience a lot.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Added a threading lock on the call in the handler.

@github-actions
Copy link
Contributor

github-actions bot commented Feb 11, 2026

Documentation preview for 989a0d4 is available at:

More info
  • Ignore this comment if this PR does not change the documentation.
  • The preview is updated when a new commit is pushed to this PR.
  • This comment was created by this workflow run.
  • The documentation was built by this workflow run.

refresh: bool = False,
features: list[str] | None = None,
) -> list[DemoResult]:
from mlflow.utils.workspace_context import WorkspaceContext, get_request_workspace
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this import to the module level?



def test_generate_all_demos_propagates_workspace_to_child_threads():
"""Verify that the workspace set by server middleware is visible in child threads.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the expectation is obvious from the test title, can we remove/shorten the doc string?

Copy link
Collaborator

@TomeHirata TomeHirata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left two nits, otherwise LGTM

Copy link
Collaborator

@B-Step62 B-Step62 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One clarification question about concurrency problem here: https://github.com/mlflow/mlflow/pull/20733/changes#r2797550436

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

Comment on lines +84 to +88
class ThreadSpawningGenerator(BaseDemoGenerator):
name = DemoFeature.TRACES

def generate(self) -> DemoResult:
def _worker():
Copy link

Copilot AI Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generate_all_demos() always calls generator.store_version(). Since this test generator subclasses BaseDemoGenerator, that will execute BaseDemoGenerator.store_version() and instantiate a real tracking store, which can create ./mlruns and introduce filesystem side effects in a unit test. Consider overriding store_version() on ThreadSpawningGenerator (no-op) or patching the tracking store / setting a temp tracking URI within the test.

Copilot uses AI. Check for mistakes.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed the comment.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated no new comments.

The issue is that some threads were created which didn't keep the
workspace context var. By using the WorkspaceContext context manager, it
sets the workspace env var temporarily and then unsets it.

Signed-off-by: mprahl <mprahl@users.noreply.github.com>
Now that this page has a button to clear the demo data, it needs to know
the workspace it's in.

Signed-off-by: mprahl <mprahl@users.noreply.github.com>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

Comment on lines +5268 to +5272
# Serialize demo generation so concurrent requests (e.g. FastAPI running Flask
# handlers in a thread pool) cannot race on the process-wide MLFLOW_WORKSPACE
# env var that WorkspaceContext temporarily sets during generate_all_demos.
_demo_generate_lock = threading.Lock()

Copy link

Copilot AI Feb 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_demo_generate_lock serializes demo generation, but /mlflow/demo/delete can still run concurrently with /mlflow/demo/generate in the same process, which can lead to partially-generated or partially-deleted demo data if a delete happens mid-generation. Consider reusing this lock in _delete_demo() as well (or using a shared lock around both endpoints) so generate/delete are mutually exclusive per process.

Copilot uses AI. Check for mistakes.
Copy link
Collaborator

@B-Step62 B-Step62 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

</span>
</MlflowSidebarLink>
<MlflowSidebarLink
disableWorkspacePrefix
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[not in the scope of this PR] The current setting page is mostly "user" level. We should think how to support different scope of settings (user, workspace, admin).

@B-Step62 B-Step62 added this pull request to the merge queue Feb 14, 2026
Merged via the queue into mlflow:master with commit b147981 Feb 14, 2026
57 of 58 checks passed
daniellok-db pushed a commit to daniellok-db/mlflow that referenced this pull request Mar 5, 2026
…low#20733)

Signed-off-by: mprahl <mprahl@users.noreply.github.com>
daniellok-db pushed a commit to daniellok-db/mlflow that referenced this pull request Mar 5, 2026
…low#20733)

Signed-off-by: mprahl <mprahl@users.noreply.github.com>
daniellok-db pushed a commit that referenced this pull request Mar 5, 2026
)

Signed-off-by: mprahl <mprahl@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/tracking Tracking service, tracking client APIs, autologging rn/none List under Small Changes in Changelogs. size/M v3.10.1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants