[ML][Inference] Adding inference ingest processor #47859
benwtrent merged 10 commits into elastic:feature/ml-inference
Conversation
Pinging @elastic/ml-core (:ml)
        MIN_DISK_SPACE_OFF_HEAP,
        MlConfigMigrationEligibilityCheck.ENABLE_CONFIG_MIGRATION,
        InferenceProcessor.MAX_INFERENCE_PROCESSORS
There may be one more setting we should add: "maximum loaded models". But I think we don't need that setting until we support loading models outside of processors.
    // How many total inference processors are allowed to be used in the cluster.
    public static final Setting<Integer> MAX_INFERENCE_PROCESSORS = Setting.intSetting("xpack.ml.max_inference_processors",
        50,
This is a "magic" number. Given that we don't have real data on typical model size and performance, I just picked a number that is not too small.
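The setting only defines the cap; it has to be enforced when a pipeline is created. A minimal sketch of the kind of guard this setting enables (hypothetical helper and message text, not the merged code):

```java
// Hypothetical sketch, not the merged implementation: reject a new
// inference processor once the cluster-wide count would reach the
// configured "xpack.ml.max_inference_processors" maximum.
public class InferenceProcessorCap {
    static void checkCap(int currentProcessorCount, int maxAllowed) {
        if (currentProcessorCount >= maxAllowed) {
            throw new IllegalStateException(
                "Max number of inference processors reached, total: " + maxAllowed);
        }
    }
}
```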
    String targetField = ConfigurationUtils.readStringProperty(TYPE, tag, config, TARGET_FIELD);
    Map<String, String> fieldMapping = ConfigurationUtils.readOptionalMap(TYPE, tag, config, FIELD_MAPPINGS);
    InferenceConfig inferenceConfig = inferenceConfigFromMap(ConfigurationUtils.readMap(TYPE, tag, config, INFERENCE_CONFIG));
    String modelInfoField = ConfigurationUtils.readStringProperty(TYPE, tag, config, MODEL_INFO_FIELD, "_model_info");
By default, I think we should add a field that includes the model ID used in the inference step. This should probably append the tag to the end of the default info field to protect against multiple processors in the same pipeline overwriting each other's metadata.
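A minimal sketch of that defaulting rule (hypothetical helper and names, not the merged implementation): append the processor tag only when the user left the field at its default, so two inference processors in one pipeline write their model metadata to distinct fields.

```java
// Hypothetical sketch of the suggested defaulting rule.
public class ModelInfoField {
    static final String DEFAULT_MODEL_INFO_FIELD = "_model_info";

    // Append the processor tag to the default field name so multiple
    // inference processors in one pipeline do not collide.
    static String resolveModelInfoField(String configuredField, String tag) {
        String field = configuredField != null ? configuredField : DEFAULT_MODEL_INFO_FIELD;
        if (configuredField == null && tag != null) {
            field = field + "." + tag;
        }
        return field;
    }
}
```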
    void checkSupportedVersion(InferenceConfig config) {
Initially, this factory builds the pipeline on the master node before storing it in cluster state. So, if we always check the config's minimal supported version against the minimum node version, we can guarantee that PUT pipeline will fail if a user tries to create a pipeline with a processor setting that is not supported by every node.

I think this explanation should go into a code comment on the getMinimalSupportedVersion method.
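A rough sketch of that master-node guard (simplified stand-in types: versions modeled as integers instead of org.elasticsearch.Version, and the message text is illustrative): pipeline creation fails when the oldest node in the cluster cannot run the given inference configuration.

```java
// Simplified sketch of the version guard described above. Fails fast at
// PUT pipeline time when the cluster's oldest node is older than the
// configuration's minimal supported version.
public class VersionGuard {
    static void checkSupportedVersion(int configMinimalSupportedVersion,
                                      int minimumClusterNodeVersion,
                                      String configName) {
        if (configMinimalSupportedVersion > minimumClusterNodeVersion) {
            throw new IllegalArgumentException("Configuration [" + configName
                + "] requires minimum node version [" + configMinimalSupportedVersion
                + "] but the cluster's minimum node version is ["
                + minimumClusterNodeVersion + "]");
        }
    }
}
```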
...e/src/main/java/org/elasticsearch/xpack/core/ml/inference/trainedmodel/RegressionConfig.java (resolved)
    void mutateDocument(InferModelAction.Response response, IngestDocument ingestDocument) {
        response.getInferenceResults().get(0).writeResult(ingestDocument, this.targetField);
Is it guaranteed that response.getInferenceResults() is non-empty?

Good catch, I will add a check to protect us from funkiness.
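The guard could look roughly like this (plain collections as stand-ins for InferModelAction.Response and the result type; exception type and message are assumptions):

```java
import java.util.List;

// Sketch: refuse to mutate the document when the inference response
// carries no results, instead of letting get(0) throw
// IndexOutOfBoundsException.
public class ResultGuard {
    static String firstResultOrThrow(List<String> inferenceResults) {
        if (inferenceResults == null || inferenceResults.isEmpty()) {
            throw new IllegalStateException("Unexpected empty inference response");
        }
        return inferenceResults.get(0);
    }
}
```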
.../plugin/ml/src/main/java/org/elasticsearch/xpack/ml/inference/ingest/InferenceProcessor.java (resolved)
        XContentType.JSON).get().isAcknowledged(), is(true));

    client().prepareIndex("index_for_inference_test", "_doc")
        .setSource(new HashMap<>(){{
I would put the source doc generation into a method, say, generateDocSource().
    public void testSimulate() {
        String source = "{\n" +
[non-actionable] Waiting for multiline raw string literals to be introduced to Java...

No joke! Text blocks ("""...""") cannot be added soon enough!
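For reference, once Java text blocks (JEP 378, standard since Java 15) are available, a concatenated JSON source like the one above could be written without "\n" escapes. The JSON body below is illustrative, not the test's actual payload:

```java
// Illustrative only: a JSON string built with a Java 15+ text block
// instead of string concatenation with embedded "\n".
public class TextBlockExample {
    static final String SOURCE = """
        {
          "pipeline": {
            "processors": [ { "inference": { } } ]
          }
        }
        """;
}
```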
    InferenceConfig inferenceConfigFromMap(Map<String, Object> inferenceConfig) throws IOException {
        ExceptionsHelper.requireNonNull(inferenceConfig, INFERENCE_CONFIG);

        if (inferenceConfig.keySet().size() != 1) {
Is that equivalent to inferenceConfig.size()? It should be, right?

Definitely, I can simplify.
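They are indeed equivalent: java.util.Map defines size() to equal the number of key-value mappings, which is exactly keySet().size(). A quick check of the simplified form:

```java
import java.util.Map;

public class MapSizeCheck {
    // Map.size() and keySet().size() always agree for java.util.Map,
    // so the factory's check can use the shorter form.
    static boolean hasExactlyOneConfig(Map<String, Object> inferenceConfig) {
        return inferenceConfig.size() == 1;
    }
}
```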
.../plugin/ml/src/main/java/org/elasticsearch/xpack/ml/inference/ingest/InferenceProcessor.java (resolved)
    private static ClusterState buildState(MetaData metaData) {
Rename to buildClusterState for consistency with the methods below?
    boolean isTargetTypeSupported(TargetType targetType);

    Version getMinimalSupportedVersion();
Could you add a comment explaining the need for this?
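A possible doc comment (the wording is a suggestion, not the merged text; the real method returns org.elasticsearch.Version, modeled here as an int to keep the sketch self-contained):

```java
// Illustrative interface with the suggested Javadoc; the return type is
// simplified from org.elasticsearch.Version to int.
public interface SupportsVersionCheck {
    /**
     * Minimal node version this configuration requires. Checked against
     * the cluster's oldest node when the pipeline is stored, so that an
     * unsupported processor fails at PUT pipeline time rather than at
     * ingest time on an old node.
     */
    int getMinimalSupportedVersion();
}
```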
Merged from branch …ture/ml-inference-processor
* [ML][Inference] adds lazy model loader and inference (#47410)

  This adds a couple of things:
  - A model loader service that is accessible via transport calls. This service will load in models and cache them. They will stay loaded until a processor no longer references them.
  - A Model class and its first sub-class, LocalModel, used to cache model information and run inference.
  - A transport action and handler for requests to infer against a local model.

Related feature PRs:

* [ML][Inference] Adjust inference configuration option API (#47812)
* [ML][Inference] adds logistic_regression output aggregator (#48075)
* [ML][Inference] Adding read/del trained models (#47882)
* [ML][Inference] Adding inference ingest processor (#47859)
* [ML][Inference] fixing classification inference for ensemble (#48463)
* [ML][Inference] Adding model memory estimations (#48323)
* [ML][Inference] adding more options to inference processor (#48545)
* [ML][Inference] handle string values better in feature extraction (#48584)
* [ML][Inference] Adding _stats endpoint for inference (#48492)
* [ML][Inference] add inference processors and trained models to usage (#47869)
* [ML][Inference] add new flag for optionally including model definition (#48718)
* [ML][Inference] adding license checks (#49056)
* [ML][Inference] Adding memory and compute estimates to inference (#48955)
* [ML] ML Model Inference Ingest Processor (#49052)
* fixing version of indexed docs for model inference
This adds a new ingest processor that infers against a previously stored trained model.