Add model_uuid to Java Model#5165
Conversation
|
Fill the PR description. |
|
Have you test on load java model from an model logged by old version mlflow (which model_uuid does not exists) |
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
| } | ||
|
|
||
| /** @return The MLflow model's uuid */ | ||
| public Optional<String> getModelUuid() { |
There was a problem hiding this comment.
nit: <4 Acronym naming convention in Java typically preserves the upper case acronym, particularly if the std library preserves.
public Optional<String> getModelUUID() {There was a problem hiding this comment.
@BenWilson2 Can we use getModelUuid since we already have getRunUuid?
| // Set the model uuid if it's absent. | ||
| if (!model.getModelUuid().isPresent()) { | ||
| String uuid = UUID.randomUUID().toString().replace("-", ""); | ||
| model.setModelUuid(uuid); |
There was a problem hiding this comment.
Shall we set a new ID under this case ? I prefer to keep it empty because the ID should be generated while logging model (and after model logged, the ID should be immutable)
There was a problem hiding this comment.
If we set new generated ID here, suppose we load a old version model twice, we will get 2 in-memory model with different model ID, this break the rule of model ID being immutable
There was a problem hiding this comment.
If we set new generated ID here, suppose we load a old version model twice, we will get 2 in-memory model with different model ID, this break the rule of model ID being immutable.
@WeichenXu123 If so, I think we need to fix the python code.
import tempfile
import os
from mlflow.models import Model
with tempfile.TemporaryDirectory() as tmp_dir:
path = os.path.join(tmp_dir, "MLmodel")
with open(path, "w") as f:
f.write(
"""
artifact_path: model
flavors:
python_function:
env: conda.yaml
loader_module: mlflow.sklearn
model_path: model.pkl
python_version: 3.7.9
sklearn:
pickled_model: model.pkl
serialization_format: cloudpickle
sklearn_version: 0.24.1
"""
)
print(Model.load(path).model_uuid)
print(Model.load(path).model_uuid)output:
cac277097e54410c8d191a717e640cf6
560075c0351f43dca8554f8ab6eb96f4
There was a problem hiding this comment.
Yes. I create a fixing PR for python side:
#5167
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: harupy 17039389+harupy@users.noreply.github.com
What changes are proposed in this pull request?
Add
model_uuidto JavaModelas a follow-up for #5149.How is this patch tested?
Unit tests
Does this PR change the documentation?
ci/circleci: build_doccheck. If it's successful, proceed to thenext step, otherwise fix it.
Detailson the right to open the job page of CircleCI.Artifactstab.docs/build/html/index.html.Release Notes
Is this a user-facing change?
(Details in 1-2 sentences. You can just refer to another PR with a description if this PR is part of a larger change.)
What component(s), interfaces, languages, and integrations does this PR affect?
Components
area/artifacts: Artifact stores and artifact loggingarea/build: Build and test infrastructure for MLflowarea/docs: MLflow documentation pagesarea/examples: Example codearea/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registryarea/models: MLmodel format, model serialization/deserialization, flavorsarea/projects: MLproject format, project running backendsarea/scoring: MLflow Model server, model deployment tools, Spark UDFsarea/server-infra: MLflow Tracking server backendarea/tracking: Tracking Service, tracking client APIs, autologgingInterface
area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev serverarea/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Modelsarea/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registryarea/windows: Windows supportLanguage
language/r: R APIs and clientslanguage/java: Java APIs and clientslanguage/new: Proposals for new client languagesIntegrations
integrations/azure: Azure and Azure ML integrationsintegrations/sagemaker: SageMaker integrationsintegrations/databricks: Databricks integrationsHow should the PR be classified in the release notes? Choose one:
rn/breaking-change- The PR will be mentioned in the "Breaking Changes" sectionrn/none- No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" sectionrn/feature- A new user-facing feature worth mentioning in the release notesrn/bug-fix- A user-facing bug fix worth mentioning in the release notesrn/documentation- A user-facing documentation change worth mentioning in the release notes