[Endpoints] [3/x] Entities base definitions#19004
Conversation
f06dead to
cba8c42
Compare
|
Documentation preview for c017f38 is available at: More info
|
cba8c42 to
58189cc
Compare
e821af0 to
7c62891
Compare
c02afd1 to
384adf3
Compare
mlflow/entities/endpoint.py
Outdated
| last_updated_at: int | ||
| created_by: str | None = None | ||
| last_updated_by: str | None = None | ||
| endpoint_count: int = 0 |
There was a problem hiding this comment.
Can we make it lazy load property? IIUC this requires querying the mapping table, but don't use referenced endpoint count for every use case.
There was a problem hiding this comment.
The endpoint count can be removed - we can infer this with the length of the mappings present in the model binding table. This is safe to delete.
mlflow/entities/endpoint.py
Outdated
| mapping_id: str | ||
| endpoint_id: str | ||
| model_definition_id: str | ||
| model_definition: ModelDefinition | None |
There was a problem hiding this comment.
Why do we need to store this within the mapping entity?
There was a problem hiding this comment.
The goal here is to minimize the query complexity on the front-end. Without having this embedded within this entity (we use it directly in the UI when populating the model information card for an endpoint), we would have N queries for each model that attached to an endpoint and then perform client-side joins to populate the page.
While this wouldn't be my choice in design for SDK / cli APIs, I intentionally went with this approach here for the UI in order to minimize the front end complexity and to reduce page render time.
example: EndpointDetailsPage.tsx directly accesses the result of the return of this entity.
mlflow/entities/endpoint.py
Outdated
| name: str | ||
| created_at: int | ||
| last_updated_at: int | ||
| model_mappings: list[EndpointModelMapping] = field(default_factory=list) |
There was a problem hiding this comment.
I think this should also be queried lazily
There was a problem hiding this comment.
The primary use of this in the UI is to display critical information in order to populate the reference fields that are directly displayed. We don't really have a use case in the UI display for non-eager return of this in the main endpoint listing page.
Where we directly need this:
- Provider filtering (providers are mapped to models, not to endpoints as an endpoint can fallback to using a model from a different provider)
- Displayed model contents for each endpoint for link references to the models details page entries
- Rendering of named models and underlying provider model names
If scale is a concern, I think we can introduce pagination in the future (i.e. users create thousands of endpoints), but the alternative (lazy loading) for the UI would be many queries being issued in order to populate the required aspects of the list page, dramatically increasing the front end complexity and forcing spin-loading elements within the page.
There was a problem hiding this comment.
Thanks for the clarification! It makes sense to include model mapping in the entity to show the model names on UI and allow filter based on it.
mlflow/entities/endpoint.py
Outdated
| last_updated_at: Timestamp (milliseconds) when the binding was last updated. | ||
| created_by: User ID who created the binding. | ||
| last_updated_by: User ID who last updated the binding. | ||
| endpoint_name: Optional endpoint name, populated via JOIN for display. |
There was a problem hiding this comment.
| endpoint_name: Optional endpoint name, populated via JOIN for display. | |
| endpoint_name: Optional endpoint name, populated via JOIN for display. |
Where do we plan to display this?
There was a problem hiding this comment.
The endpoint name is displayed on the main list page, in the model mappings detail pages, and in the key mapping details pages.
mlflow/entities/endpoint.py
Outdated
| created_by: str | None = None | ||
| last_updated_by: str | None = None | ||
| endpoint_name: str | None = None | ||
| model_mappings: list[EndpointModelMapping] = field(default_factory=list) |
There was a problem hiding this comment.
Why is this a part of a binding entity?
There was a problem hiding this comment.
This was originally intended to fulfill the 'filter the models for a single-model use resource_id' but it's overly complicated and doesn't need to be overloaded in this. We don't use this field in the UI and splitting out the resource-based binding information into a separate API is far simpler and more tailored to what is needed. I'm going to simplify this and then add the dedicated API for resources to fetch their binding information.
There was a problem hiding this comment.
Can we name this gateway_endpoint.py or add a docstring at the top explain these models are used within Gateway? The term "endpoint" "model" "resource" is pretty generic without that context.
There was a problem hiding this comment.
yeah I'll preface these with gateway so that it's easier to see their relationships ontologically
mlflow/entities/endpoint.py
Outdated
|
|
||
|
|
||
| @dataclass | ||
| class ModelConfig(_MlflowObject): |
There was a problem hiding this comment.
What is the purpose of these entities? I casually read through the remaining stack but didn't find a usage. If we need these, can we define this somewhere in more limited scope than general entities to avoid confusion with ModelDefinition and Endpoint entities?
There was a problem hiding this comment.
These are the interfaces for the downstream access of decrypted secrets server-side. They won't be exposed in the rest interface (non-public). The sql store will have the implementation that returns these with an internal RPC.
There was a problem hiding this comment.
Can we move these entities into different module? All classes under mlflow/entities are for public interface. Also it doesn't need to be _MlflowObject subclass then cuz we probably want to be stricter about the print and serde method, rather than using its base methods.
There was a problem hiding this comment.
yep! Since these are entirely UI-only I'll migrate them to make it clear that they're isolated. Good call.
There was a problem hiding this comment.
I'm not sure these tests (test_endpoint.py and test_secrets.py) are meaningful.... they are basically only testing default dataclass constructor right...?
There was a problem hiding this comment.
Yeah in this PR they're nonsense stubs. The full test suite is added with the addition of the from_proto() / to_proto() methods in the rest branch.
There was a problem hiding this comment.
Can we remove these tests? I saw rest branch adds more tests, but my point was we don't want to have non-useful tests, because that makes the code lengthy and can hide important details.
14fce99 to
09aba4e
Compare
Signed-off-by: Ben Wilson <benjamin.wilson@databricks.com>
09aba4e to
3c17d6c
Compare
There was a problem hiding this comment.
Pull request overview
This PR introduces base entity definitions for gateway endpoints and secrets management, along with comprehensive cryptographic utilities for secure secret storage using envelope encryption. The implementation adds support for storing encrypted API keys and credentials with KEK/DEK encryption schemes.
Key Changes:
- Adds cryptographic utilities for AES-256-GCM encryption with KEK/DEK envelope encryption
- Introduces entity definitions for gateway endpoints, model definitions, and secrets
- Implements CLI commands for KEK rotation operations
- Provides comprehensive test coverage for cryptography and entity classes
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
mlflow/utils/cryptography.py |
Core cryptographic utilities implementing AES-256-GCM encryption, PBKDF2 key derivation, and envelope encryption with KEK/DEK pattern |
mlflow/entities/gateway_secrets.py |
Entity definition for encrypted secrets with LLM provider credentials |
mlflow/entities/secrets.py |
Duplicate/unused entity file with minor docstring differences |
mlflow/entities/gateway_endpoint.py |
Entity definitions for endpoints, model definitions, mappings, and configurations |
mlflow/entities/__init__.py |
Updated exports to include new gateway and secret entities |
mlflow/cli/cryptography.py |
CLI commands for KEK rotation operations with comprehensive user guidance |
mlflow/cli/__init__.py |
Integration of cryptography CLI commands |
tests/utils/test_cryptography.py |
Comprehensive test coverage for cryptographic functions |
tests/entities/test_gateway_secrets.py |
Test coverage for secret entity creation and attributes |
tests/entities/test_gateway_endpoint.py |
Test coverage for endpoint and model definition entities |
tests/cli/test_cryptography.py |
Test coverage for CLI commands with mocked database operations |
docs/api_reference/api_inventory.txt |
API documentation inventory updates |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
mlflow/entities/secrets.py
Outdated
| from dataclasses import dataclass | ||
|
|
||
| from mlflow.entities._mlflow_object import _MlflowObject | ||
|
|
||
|
|
||
| @dataclass | ||
| class GatewaySecret(_MlflowObject): | ||
| """ | ||
| Represents an encrypted secret for authenticating with LLM providers. | ||
|
|
||
| Store encrypted API keys and authentication credentials using envelope encryption. | ||
| The actual secret value is encrypted with a DEK (Data Encryption Key), which is itself | ||
| encrypted by a KEK (Key Encryption Key) for secure storage. | ||
|
|
||
| Args: | ||
| secret_id: Unique identifier for this secret. | ||
| secret_name: User-friendly name for the secret (must be unique). | ||
| masked_value: Masked version of the secret for display (e.g., "sk-...xyz123"). | ||
| provider: LLM provider this secret is for (e.g., "openai", "anthropic"). | ||
| created_at: Timestamp (milliseconds) when the secret was created. | ||
| last_updated_at: Timestamp (milliseconds) when the secret was last updated. | ||
| created_by: User ID who created the secret. | ||
| last_updated_by: User ID who last updated the secret. | ||
| """ | ||
|
|
||
| secret_id: str | ||
| secret_name: str | ||
| masked_value: str | ||
| created_at: int | ||
| last_updated_at: int | ||
| provider: str | None = None | ||
| created_by: str | None = None | ||
| last_updated_by: str | None = None |
There was a problem hiding this comment.
Minor documentation inconsistency: Line 11 says "Store encrypted API keys..." while the identical class in gateway_secrets.py says "Secrets store encrypted API keys...". While mlflow/entities/secrets.py exists, it's not imported in __init__.py - only gateway_secrets.py is imported. This file appears to be unused and should likely be removed to avoid confusion, or the import should be updated to use this file instead.
| from dataclasses import dataclass | |
| from mlflow.entities._mlflow_object import _MlflowObject | |
| @dataclass | |
| class GatewaySecret(_MlflowObject): | |
| """ | |
| Represents an encrypted secret for authenticating with LLM providers. | |
| Store encrypted API keys and authentication credentials using envelope encryption. | |
| The actual secret value is encrypted with a DEK (Data Encryption Key), which is itself | |
| encrypted by a KEK (Key Encryption Key) for secure storage. | |
| Args: | |
| secret_id: Unique identifier for this secret. | |
| secret_name: User-friendly name for the secret (must be unique). | |
| masked_value: Masked version of the secret for display (e.g., "sk-...xyz123"). | |
| provider: LLM provider this secret is for (e.g., "openai", "anthropic"). | |
| created_at: Timestamp (milliseconds) when the secret was created. | |
| last_updated_at: Timestamp (milliseconds) when the secret was last updated. | |
| created_by: User ID who created the secret. | |
| last_updated_by: User ID who last updated the secret. | |
| """ | |
| secret_id: str | |
| secret_name: str | |
| masked_value: str | |
| created_at: int | |
| last_updated_at: int | |
| provider: str | None = None | |
| created_by: str | None = None | |
| last_updated_by: str | None = None | |
| # FILE REMOVED: This file was unused and duplicated functionality from gateway_secrets.py. |
mlflow/cli/cryptography.py
Outdated
| @click.group("crypto", help="Commands for managing MLflow's cryptographic passphrase.") | ||
| def commands(): | ||
| """ | ||
| MLflow cryptopgraphic management CLI. Allows for the management of the envelope |
There was a problem hiding this comment.
There's a typo in the docstring: "cryptopgraphic" should be "cryptographic".
| MLflow cryptopgraphic management CLI. Allows for the management of the envelope | |
| MLflow cryptographic management CLI. Allows for the management of the envelope |
There was a problem hiding this comment.
Looks good once #19004 (comment) and #19004 (comment) is addressed!
3c17d6c to
c017f38
Compare
| endpoint_id: ID of the endpoint. | ||
| model_definition_id: ID of the model definition. | ||
| model_definition: The full model definition (populated via JOIN). | ||
| weight: Routing weight for traffic distribution (default 1). |
There was a problem hiding this comment.
@BenWilson2 What does the unit of weight here? Is it percentage?
🥞 Stacked PR
Use this link to review incremental changes.
Related Issues/PRs
#xxxWhat changes are proposed in this pull request?
Adds the base entities definitions for secrets and endpoints definitions. Note that proto interfaces are handled in a later PR in this stack.
How is this PR tested?
Does this PR require documentation update?
Release Notes
Is this a user-facing change?
What component(s), interfaces, languages, and integrations does this PR affect?
Components
area/tracking: Tracking Service, tracking client APIs, autologgingarea/models: MLmodel format, model serialization/deserialization, flavorsarea/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registryarea/scoring: MLflow Model server, model deployment tools, Spark UDFsarea/evaluation: MLflow model evaluation features, evaluation metrics, and evaluation workflowsarea/gateway: MLflow AI Gateway client APIs, server, and third-party integrationsarea/prompts: MLflow prompt engineering features, prompt templates, and prompt managementarea/tracing: MLflow Tracing features, tracing APIs, and LLM tracing functionalityarea/projects: MLproject format, project running backendsarea/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev serverarea/build: Build and test infrastructure for MLflowarea/docs: MLflow documentation pagesHow should the PR be classified in the release notes? Choose one:
rn/none- No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" sectionrn/breaking-change- The PR will be mentioned in the "Breaking Changes" sectionrn/feature- A new user-facing feature worth mentioning in the release notesrn/bug-fix- A user-facing bug fix worth mentioning in the release notesrn/documentation- A user-facing documentation change worth mentioning in the release notesShould this PR be included in the next patch release?
Yesshould be selected for bug fixes, documentation updates, and other small changes.Noshould be selected for new features and larger changes. If you're unsure about the release classification of this PR, leave this unchecked to let the maintainers decide.What is a minor/patch release?
Bug fixes, doc updates and new features usually go into minor releases.
Bug fixes and doc updates usually go into patch releases.