MLeap Flavor/Deployment #1: Java Support#307
Conversation
| * Represents an MLeap flavor configuration | ||
| */ | ||
| public class MLeapFlavor implements Flavor { | ||
| public static final String FLAVOR_NAME = "mleap"; |
There was a problem hiding this comment.
Google's Java Style Guide would have us do 2-space indent for all Java files.
There was a problem hiding this comment.
Formatted code using https://github.com/google/google-java-format
| @@ -0,0 +1,50 @@ | |||
| package com.databricks.mlflow.mleap; | |||
| import com.databricks.mlflow.sagemaker.PredictorLoadingException; | ||
| import com.databricks.mlflow.utils.FileUtils; | ||
|
|
||
| import java.util.Optional; |
There was a problem hiding this comment.
There was a problem hiding this comment.
Formatted code using https://github.com/google/google-java-format
| /** | ||
| * Unit tests for the {@link com.databricks.mlflow.utils.SerializationUtils} module | ||
| */ | ||
| public class SerializationUtilsTests { |
There was a problem hiding this comment.
There was a problem hiding this comment.
Updated all tests to end with Test instead of Tests.
| * | ||
| * @param modelRootPath The path to the root directory of the MLFlow model | ||
| */ | ||
| public static Model fromRootPath(String modelRootPath) throws IOException { |
There was a problem hiding this comment.
Can we add a test for this method?
There was a problem hiding this comment.
Added a test called ModelTests#testModelIsLoadedFromYamlUsingRootPathCorrectly
| Assert.assertTrue( | ||
| model.getFlavor(MLeapFlavor.FLAVOR_NAME, MLeapFlavor.class).isPresent()); | ||
| Assert.assertTrue(model.getUtcTimeCreated().isPresent()); | ||
| } catch (IOException e) { |
There was a problem hiding this comment.
nit: I think you're allowed to add throws to a test, and it will fail the test if the exception is thrown. Maybe see if it works.
There was a problem hiding this comment.
The exception handling is deliberate here, in order to provide additional information about the test failure. Added a stack trace dump to ensure that the original trace isn't lost.
| * @param class The class of the Java object that should be produced | ||
| */ | ||
| public static <T> T parseJsonFromFile(String filePath, Class<T> objectClass) | ||
| throws IOException { |
There was a problem hiding this comment.
hypernit: continuation indents are at +4 indentation (compared to default +2) to distinguish from body
There was a problem hiding this comment.
Formatted code using https://github.com/google/google-java-format
| * @param json A string in valid JSON format | ||
| */ | ||
| public static <T> T fromJson(String json) throws IOException { | ||
| return jsonMapper.readValue(json, new TypeReference<T>() {}); |
There was a problem hiding this comment.
I'm not sure I believe that this actually works for complex types. Do we need to define it, btw? Maybe we could remove the methods of SerializationUtils that are not currently used.
There was a problem hiding this comment.
This is used later by MLeap. See https://github.com/mlflow/mlflow/pull/278/files#diff-71d9fc3d16e2e46961159e9efccdf573R52.
Good catch, this didn't work for complex types. Added a Class argument.
Codecov Report
@@ Coverage Diff @@
## master #307 +/- ##
==========================================
+ Coverage 55.81% 55.94% +0.12%
==========================================
Files 108 108
Lines 5531 5554 +23
==========================================
+ Hits 3087 3107 +20
- Misses 2444 2447 +3
Continue to review full report at Codecov.
|
| * Loads an MLFlow model as a generic predictor that can be used for | ||
| * inference | ||
| * | ||
| * @throws {@link com.databricks.mlflow.sagemaker.PredictorLoadingException} if |
There was a problem hiding this comment.
Thanks! I've shortened links throughout the PR that correspond to imported modules.
| * inference | ||
| * | ||
| * @throws {@link com.databricks.mlflow.sagemaker.PredictorLoadingException} if | ||
| * failures are encountered while attempting to load the model from the specified configuration. |
There was a problem hiding this comment.
Thanks for the inline replacement. Done!
| */ | ||
| public Predictor load(String modelRootPath) throws PredictorLoadingException { | ||
| try { | ||
| return load(Model.fromRootPath(modelRootPath)); |
| @@ -0,0 +1,50 @@ | |||
| package com.databricks.mlflow.mleap; | |||
| * | ||
| * @param modelRootPath The path to the root directory of the MLFlow model | ||
| */ | ||
| public static Model fromRootPath(String modelRootPath) throws IOException { |
There was a problem hiding this comment.
Added a test called ModelTests#testModelIsLoadedFromYamlUsingRootPathCorrectly
| /** | ||
| * Unit tests for the {@link com.databricks.mlflow.utils.SerializationUtils} module | ||
| */ | ||
| public class SerializationUtilsTests { |
There was a problem hiding this comment.
Updated all tests to end with Test instead of Tests.
| * @param json A string in valid JSON format | ||
| */ | ||
| public static <T> T fromJson(String json) throws IOException { | ||
| return jsonMapper.readValue(json, new TypeReference<T>() {}); |
There was a problem hiding this comment.
This is used later by MLeap. See https://github.com/mlflow/mlflow/pull/278/files#diff-71d9fc3d16e2e46961159e9efccdf573R52.
Good catch, this didn't work for complex types. Added a Class argument.
| * @param class The class of the Java object that should be produced | ||
| */ | ||
| public static <T> T parseJsonFromFile(String filePath, Class<T> objectClass) | ||
| throws IOException { |
There was a problem hiding this comment.
Formatted code using https://github.com/google/google-java-format
| * Represents an MLeap flavor configuration | ||
| */ | ||
| public class MLeapFlavor implements Flavor { | ||
| public static final String FLAVOR_NAME = "mleap"; |
There was a problem hiding this comment.
Formatted code using https://github.com/google/google-java-format
| import com.databricks.mlflow.sagemaker.PredictorLoadingException; | ||
| import com.databricks.mlflow.utils.FileUtils; | ||
|
|
||
| import java.util.Optional; |
There was a problem hiding this comment.
Formatted code using https://github.com/google/google-java-format
| @@ -0,0 +1,13 @@ | |||
| package org.mlflow; | |||
There was a problem hiding this comment.
Can we remove the .swn files? What are those? Should we add them to a gitignore?
|
LGTM, just one comment |
| <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> | ||
| <modelVersion>4.0.0</modelVersion> | ||
| <groupId>org.mlflow</groupId> | ||
| <artifactId>mlflow-java</artifactId> |
There was a problem hiding this comment.
Kind of a nit but shouldn't we just call it mlflow? I guess we can rename it later too.
In the long run we'll probably want this to be a separate artifact from the Java API client anyway, so we'll need to make a more complicated pom.xml.
There was a problem hiding this comment.
@mateiz Maven's package naming conventions, which are derived from Java conventions, indicate that a package name should correspond to a valid domain that the package owner controls. Since we own mlflow.org, org.mlflow seems to be the correct package name.
Here's the Java reference I found regarding package naming: https://docs.oracle.com/javase/specs/jls/se10/jls10.pdf (Page 137)
There was a problem hiding this comment.
Ah, I wasn't asking about the org.mlflow line, but about the mlflow-java artifact ID. Why not just mlflow.
There was a problem hiding this comment.
Ah, sorry! I'll correct this in the followup PR that adds SageMaker support to the Java library!
This is the first PR in a multi-part series that will introduce support for the MLeap flavor in MLFlow. This PR provides the infrastructure for deserializing and invoking models as generic predictors in Java.