[UI Telemetry] Update documentation page with details on UI telemetry #19427
daniellok-db merged 7 commits into mlflow:master
db81eb3 to b264a48
Pull request overview
This PR adds comprehensive UI telemetry functionality to MLflow, allowing the collection of usage data from user interactions with the MLflow UI. The implementation uses a SharedWorker architecture to batch and send telemetry events efficiently across multiple browser tabs.
Key Changes
- Implements a SharedWorker-based telemetry client that batches UI interaction events and sends them to the server
- Adds server-side handlers for receiving and processing UI telemetry data with configurable opt-in/opt-out controls
- Adds a Settings page in the UI where users can control their telemetry preferences
- Updates documentation to describe what UI telemetry data is collected and how to opt out
Reviewed changes
Copilot reviewed 26 out of 26 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `tests/telemetry/test_client.py` | Adds test for batch processing of telemetry records |
| `tests/server/test_handlers.py` | Adds comprehensive tests for UI telemetry GET/POST handlers |
| `tests/server/conftest.py` | Adds test fixtures for telemetry testing |
| `mlflow/telemetry/client.py` | Implements add_records() method for batching multiple telemetry records |
| `mlflow/server/handlers.py` | Adds GET/POST handlers for UI telemetry with caching and configuration management |
| `mlflow/server/__init__.py` | Registers new UI telemetry API endpoints |
| `mlflow/server/js/tsconfig.json` | Adds WebWorker library support for TypeScript |
| `mlflow/server/js/src/telemetry/worker/*.ts` | Implements SharedWorker for telemetry logging with queue management |
| `mlflow/server/js/src/telemetry/TelemetryClient.ts` | Client-side API for logging UI events to the worker |
| `mlflow/server/js/src/telemetry/TelemetryInfoAlert.tsx` | Alert component informing users about telemetry |
| `mlflow/server/js/src/telemetry/README.md` | Technical documentation for the telemetry architecture |
| `mlflow/server/js/src/settings/SettingsPage.tsx` | Settings UI for managing telemetry preferences |
| `mlflow/server/js/src/experiment-tracking/routes.ts` | Adds routing for the settings page |
| `mlflow/server/js/src/common/components/MlflowSidebar.tsx` | Adds Settings link to the navigation sidebar |
| `mlflow/server/js/src/app.tsx` | Integrates telemetry client with DesignSystemEventProvider |
| `mlflow/server/js/craco.config.js` | Configures webpack to bundle the telemetry worker separately |
| `mlflow/server/js/src/lang/default/en.json` | Adds translations for telemetry-related UI text |
| `mlflow/server/js/src/home/HomePage.tsx` | Displays telemetry info alert on the home page |
| `docs/docs/community/usage-tracking.mdx` | Documents UI telemetry data collection and opt-out procedures |
| </tr> | ||
| <tr> | ||
| <td>Timestamp</td> | ||
| <td>The client-side timestam of when the interaction occurred</td> |
Spelling error: "timestam" should be "timestamp"
| <td>The client-side timestam of when the interaction occurred</td> | |
| <td>The client-side timestamp of when the interaction occurred</td> |
|
|
||
| **LogQueue.ts**: | ||
|
|
||
| - Simple class that batches logs and uploads them to the server every 15s. |
There is a documentation inconsistency between the README and the actual LogQueue implementation. The README states logs are batched and uploaded "every 15s", but the implementation in LogQueue.ts uses FLUSH_INTERVAL_MS = 30000 (30 seconds). Update the documentation to match the implementation.
| - Simple class that batches logs and uploads them to the server every 15s. | |
| - Simple class that batches logs and uploads them to the server every 30s. |
| /** | ||
| * LogQueue for batching and uploading telemetry events | ||
| * | ||
| * Maintains a queue of telemetry records and flushes them every 15 seconds |
The documentation comment incorrectly states the flush interval is 15 seconds, but the constant FLUSH_INTERVAL_MS is set to 30000 (30 seconds). Update the comment to accurately reflect the actual flush interval.
| * Maintains a queue of telemetry records and flushes them every 15 seconds | |
| * Maintains a queue of telemetry records and flushes them every 30 seconds |
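The interval-based batching discussed in the two comments above can be sketched as follows. This is only an illustration in Python, not the TypeScript code in LogQueue.ts: the class name `BatchQueue`, its methods, and the injectable `clock` are hypothetical, and only the `FLUSH_INTERVAL_MS = 30000` value comes from the review.

```python
import time

# Value quoted in the review comment; everything else here is hypothetical.
FLUSH_INTERVAL_MS = 30_000


class BatchQueue:
    """Illustrative stand-in for a batching log queue (not MLflow's LogQueue.ts)."""

    def __init__(self, upload, clock=time.monotonic):
        self._upload = upload          # callable that receives a list of records
        self._clock = clock            # injectable so the interval can be tested
        self._pending = []
        self._last_flush = clock()

    def enqueue(self, record):
        """Buffer a record until the next flush."""
        self._pending.append(record)

    def tick(self):
        """Flush if the interval has elapsed; return True when a batch was sent."""
        elapsed_ms = (self._clock() - self._last_flush) * 1000
        if elapsed_ms < FLUSH_INTERVAL_MS:
            return False
        self._last_flush = self._clock()
        if not self._pending:
            return False
        batch, self._pending = self._pending, []
        self._upload(batch)
        return True
```

With a fake clock, a record enqueued at time 0 is still buffered at 15 seconds and is uploaded on the first `tick()` at or after 30 seconds, which is the behavior the corrected comment describes.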
| # Cache for telemetry config with 3 hour TTL | ||
| _telemetry_config_cache = TTLCache(maxsize=1, ttl=10800) |
The cache TTL is hardcoded to 10800 seconds (3 hours). Consider extracting this as a named constant with a descriptive name like TELEMETRY_CONFIG_CACHE_TTL_SECONDS to improve maintainability and make it easier to adjust this value in the future.
| _telemetry_config_cache = TTLCache(maxsize=1, ttl=10800) | |
| TELEMETRY_CONFIG_CACHE_TTL_SECONDS = 10800 | |
| _telemetry_config_cache = TTLCache(maxsize=1, ttl=TELEMETRY_CONFIG_CACHE_TTL_SECONDS) |
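To make the effect of the named constant concrete, here is a minimal standard-library sketch of a single-entry cache with a time-to-live. The handler in the PR uses `TTLCache`; the `SingleValueTTLCache` class below is a hypothetical stand-in written only to show the expiry behavior, and the `loader` and `clock` parameters are assumptions for illustration.

```python
import time

# Named constant as proposed in the suggestion: 3 hours expressed in seconds.
TELEMETRY_CONFIG_CACHE_TTL_SECONDS = 3 * 60 * 60  # 10800


class SingleValueTTLCache:
    """Hypothetical one-slot TTL cache; not the TTLCache used in handlers.py."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so expiry can be tested
        self._value = None
        self._expires_at = None        # None means nothing is cached yet

    def get(self, loader):
        """Return the cached value, calling loader() again once the TTL has passed."""
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = loader()
            self._expires_at = now + self._ttl
        return self._value


config_cache = SingleValueTTLCache(TELEMETRY_CONFIG_CACHE_TTL_SECONDS)
```

Writing the TTL as `3 * 60 * 60` next to a descriptive name lets a reader confirm the "3 hour" claim without a separate comment.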
Documentation preview for 85a0c7a is available at: Changed Pages (1)
| <tr> | ||
| <td>Component ID of interactive UI elements</td> | ||
| <td>An ID string of an interactive element (e.g. button, switch, link, input field) in the UI. A log is generated upon clicking, typing, or otherwise interacting with such elements. A comprehensive list of component ID values can be found by [this search query](https://github.com/search?q=repo%3Amlflow%2Fmlflow%20componentId%3D&type=code).</td> | ||
| <td>`mlflow.prompts.list.create` (identifier for the "Create prompt" button on the prompts page)</td> | ||
| <td>To understand the usage patterns of different UI pages and features</td> | ||
| </tr> | ||
| <tr> | ||
| <td>Metadata associated with UI interactions</td> | ||
| <td>See [below table](#ui-interaction-metadata) for metadata associated with UI interaction logs.</td> | ||
| <td>`{ "isRemote": true, "browserFamily": "Chrome", "isMobile": false, "eventType": "onClick", "componentViewId": "88fc9edd-5e9e-4a17-abd2-c543f505b8eb", "componentId": "mlflow.prompts.list.create", "componentType": "button", timestamp_ns: 1765784028467000000 }`</td> | ||
| <td>To understand UI usage patterns in greater depth, and to inform prioritization of different platforms, browsers, and user flows.</td> | ||
| </tr> |
I thought these are included in parameters field?
they are but i think it's not an important distinction (user probably doesn't care about the structure of the payload, just what data is contained inside), i also felt that including all those params into the existing table would be a bit noisy since they only apply to UI events
Can we include them in Tracked Events table?
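For readers who want to see the shape of the metadata under discussion, the sketch below checks a UI event record against the field names that appear in the example payload quoted in the diff. The function name `check_ui_event` and the expected types are assumptions inferred from that one example; the real payload may differ.

```python
import json

# Field names are copied from the example payload in the diff; the types are
# inferred from that single example and are an assumption.
EXPECTED_FIELDS = {
    "isRemote": bool,
    "browserFamily": str,
    "isMobile": bool,
    "eventType": str,
    "componentViewId": str,
    "componentId": str,
    "componentType": str,
    "timestamp_ns": int,
}


def check_ui_event(raw: str) -> list[str]:
    """Return a list of problems found in a JSON-encoded UI event record."""
    record = json.loads(raw)
    problems = []
    for name, expected in EXPECTED_FIELDS.items():
        if name not in record:
            problems.append(f"missing {name}")
        elif type(record[name]) is not expected:
            problems.append(f"{name} should be {expected.__name__}")
    return problems


SAMPLE = json.dumps({
    "isRemote": True,
    "browserFamily": "Chrome",
    "isMobile": False,
    "eventType": "onClick",
    "componentViewId": "88fc9edd-5e9e-4a17-abd2-c543f505b8eb",
    "componentId": "mlflow.prompts.list.create",
    "componentType": "button",
    "timestamp_ns": 1765784028467000000,
})
```

Calling `check_ui_event(SAMPLE)` returns an empty list, while a record that omits a field or gives it the wrong type is reported by name.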
| Organizations can disable telemetry by blocking network access to `https://config.mlflow-telemetry.io`. When this endpoint is unreachable, MLflow automatically disables telemetry for the SDK. | ||
|
|
||
| To opt out of UI telemetry, you can block network access to `https://d139nb52glx00z.cloudfront.net` from the MLflow server. Similar to above, if this endpoint is unreachable, UI telemetry will be disabled. |
i wonder if we should make it easy and just say that you can disable all traffic to the mlflow-telemetry.io domain, since that covers both config and ingestion
Yes, I think we don't need this section, especially the specific mention of https://d139nb52glx00z.cloudfront.net. If we later change it to the same domain as the SDK, people won't check this page again to disable it, and I feel users who read this section may just opt out of telemetry entirely.
done, simplified the section
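The domain-level blocking idea raised in this thread can be illustrated with a small suffix match on the hostname. This is a sketch under assumptions: `is_blocked` is a hypothetical helper, and a real firewall or proxy may match hosts differently.

```python
from urllib.parse import urlsplit

# Domain named in the thread above; the helper itself is hypothetical.
BLOCKED_DOMAIN = "mlflow-telemetry.io"


def is_blocked(url: str, blocked_domain: str = BLOCKED_DOMAIN) -> bool:
    """Return True if the URL's host is the blocked domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return host == blocked_domain or host.endswith("." + blocked_domain)


# The two endpoints quoted in the diff under review.
print(is_blocked("https://config.mlflow-telemetry.io"))
print(is_blocked("https://d139nb52glx00z.cloudfront.net"))
```

Under this rule the config endpoint matches, while the CloudFront host quoted in the diff does not, so a domain-level block only reaches an endpoint that is served from that domain.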
2f0a68f to dd01b3d
| <td>Browser family</td> | ||
| <td>An enumerated string describing the browser family</td> | ||
| <td>`Chrome`, `Safari`, `Firefox`, `Other`</td> | ||
| </tr> | ||
| <tr> | ||
| <td>Mobile status</td> | ||
| <td>A boolean indicating whether or not the event took place on a mobile device</td> | ||
| <td>`false`</td> | ||
| </tr> | ||
| <tr> |
just to note: these do not actually exist in the logs right now but including it here so we can add them in the future without updating the docs
When do we plan to add them? we can update the doc later when adding them right, people may ask you about this data if seeing this :)
makes sense, removing for now then
b2e82e6 to 43f8e0c
serena-ruan left a comment
LGTM once #19427 (comment) is addressed!
🥞 Stacked PR
Use this link to review incremental changes.
Related Issues/PRs
#xxx
What changes are proposed in this pull request?
This PR updates the usage-tracking page with information about the new UI telemetry.
How is this PR tested?
PR docs preview
Does this PR require documentation update?
Release Notes
Is this a user-facing change?
What component(s), interfaces, languages, and integrations does this PR affect?
Components
- area/tracking: Tracking Service, tracking client APIs, autologging
- area/models: MLmodel format, model serialization/deserialization, flavors
- area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
- area/scoring: MLflow Model server, model deployment tools, Spark UDFs
- area/evaluation: MLflow model evaluation features, evaluation metrics, and evaluation workflows
- area/gateway: MLflow AI Gateway client APIs, server, and third-party integrations
- area/prompts: MLflow prompt engineering features, prompt templates, and prompt management
- area/tracing: MLflow Tracing features, tracing APIs, and LLM tracing functionality
- area/projects: MLproject format, project running backends
- area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
- area/build: Build and test infrastructure for MLflow
- area/docs: MLflow documentation pages
How should the PR be classified in the release notes? Choose one:
- rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
- rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
- rn/feature - A new user-facing feature worth mentioning in the release notes
- rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
- rn/documentation - A user-facing documentation change worth mentioning in the release notes
Should this PR be included in the next patch release?
Yes should be selected for bug fixes, documentation updates, and other small changes. No should be selected for new features and larger changes. If you're unsure about the release classification of this PR, leave this unchecked to let the maintainers decide.
What is a minor/patch release?
Bug fixes, doc updates and new features usually go into minor releases.
Bug fixes and doc updates usually go into patch releases.