[ML] Add new include flag to GET inference/<model_id> API for model training metadata#61922
Conversation
ece8df8 to
eacc213
Compare
082aff6 to
19b0eb6
Compare
19b0eb6 to
bfb390e
Compare
docs/java-rest/high-level/ml/get-trained-models-metadata.asciidoc
Outdated
Show resolved
Hide resolved
docs/java-rest/high-level/ml/get-trained-models-metadata.asciidoc
Outdated
Show resolved
Hide resolved
docs/java-rest/high-level/ml/get-trained-models-metadata.asciidoc
Outdated
Show resolved
Hide resolved
docs/java-rest/high-level/ml/get-trained-models-metadata.asciidoc
Outdated
Show resolved
Hide resolved
docs/java-rest/high-level/ml/get-trained-models-metadata.asciidoc
Outdated
Show resolved
Hide resolved
docs/reference/ml/df-analytics/apis/get-inference-trained-model-metadata.asciidoc
Outdated
Show resolved
Hide resolved
docs/reference/ml/df-analytics/apis/get-inference-trained-model-metadata.asciidoc
Outdated
Show resolved
Hide resolved
docs/reference/ml/df-analytics/apis/get-inference-trained-model-metadata.asciidoc
Outdated
Show resolved
Hide resolved
docs/reference/ml/df-analytics/apis/get-inference-trained-model-metadata.asciidoc
Outdated
Show resolved
Hide resolved
docs/reference/ml/df-analytics/apis/get-inference-trained-model-metadata.asciidoc
Outdated
Show resolved
Hide resolved
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
|
@elasticmachine update branch |
lcawl
left a comment
There was a problem hiding this comment.
Added three more suggestions, otherwise LGTM
|
|
||
| `from`:: | ||
| (Optional, integer) | ||
| include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=from] |
There was a problem hiding this comment.
Sorry I didn't notice this sooner, but these descriptions have the same problem--they're job-specific. I've created model-specific descriptions in #62128
| include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=from] | |
| include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=from-models] |
|
|
||
| `size`:: | ||
| (Optional, integer) | ||
| include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=size] |
There was a problem hiding this comment.
Ditto re needing a model-specific description:
| include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=size] | |
| include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=size-models] |
| } | ||
| }, | ||
| { | ||
| "feature_name" : "FlightTimeMin", |
There was a problem hiding this comment.
This is a very long example, so it's worth considering putting "..." in a couple of places where the info is redundant.
18f2a16 to
4e4a198
Compare
|
Pinging @elastic/ml-core (:ml) |
lcawl
left a comment
There was a problem hiding this comment.
This PR has changed a lot since my last review, so I've added more suggestions.
| <4> Indicate if the total feature importance for the features used in training | ||
| should be included in the model `metadata` field. | ||
| <5> Should the definition be fully decompressed on GET | ||
| <6> Allow empty response if no Trained Models match the provided ID patterns. |
There was a problem hiding this comment.
| <6> Allow empty response if no Trained Models match the provided ID patterns. | |
| <6> Allow empty response if no trained models match the provided ID patterns. |
| If false, an error will be thrown if no Trained Models match the | ||
| ID patterns. | ||
| <6> An optional list of tags used to narrow the model search. A Trained Model | ||
| <7> An optional list of tags used to narrow the model search. A Trained Model |
There was a problem hiding this comment.
| <7> An optional list of tags used to narrow the model search. A Trained Model | |
| <7> An optional list of tags used to narrow the model search. A trained model |
| ===== | ||
| `total_feature_importance`::: | ||
| (array) | ||
| An array of the total feature importance for each training feature used from |
There was a problem hiding this comment.
An array of the total feature importance for each training feature used from
AFAIK we've never defined a "training feature". Can we just say feature? If not, this term needs to be explained.
| end::inference-config-results-field-processor[] | ||
|
|
||
| tag::inference-metadata-feature-importance-feature-name[] | ||
| The training feature name for which this importance was calculated. |
There was a problem hiding this comment.
training feature name
Ditto earlier comment about "training feature".
| The training feature name for which this importance was calculated. | |
| The feature for which this importance was calculated. |
| (Optional, string) | ||
| A comma delimited string of optional fields to include in the response body. | ||
| Valid options are: | ||
| - definition: to include the model definition |
There was a problem hiding this comment.
| - definition: to include the model definition | |
| - `definition`: Includes the model definition |
| A comma delimited string of optional fields to include in the response body. | ||
| Valid options are: | ||
| - definition: to include the model definition | ||
| - total_feature_importance: to include the total feature importance for the |
There was a problem hiding this comment.
| - total_feature_importance: to include the total feature importance for the | |
| - `total_feature_importance`: Includes the total feature importance for the |
| Valid options are: | ||
| - definition: to include the model definition | ||
| - total_feature_importance: to include the total feature importance for the | ||
| training feature sets. This field will be available in the `metadata` field. |
There was a problem hiding this comment.
| training feature sets. This field will be available in the `metadata` field. | |
| training data sets. This field is available in the `metadata` field in the response body. |
| can have many tags or none. The trained models in the response will | ||
| contain all the provided tags. | ||
| <7> Optional boolean value indicating if certain fields should be removed on | ||
| <8> Optional boolean value indicating if certain fields should be removed on |
There was a problem hiding this comment.
Swap these 2 sentences around, the fact that certain fields are removed is the detail
<8> Optional boolean value for requesting the trained model in a format that can then be put into another cluster. Certain fields that can only be set when the model is imported are removed.
| this.description = config.getDescription(); | ||
| this.tags = config.getTags(); | ||
| this.metadata = config.getMetadata(); | ||
| this.metadata = config.getMetadata() == null ? null : new HashMap<>(config.getMetadata()); |
There was a problem hiding this comment.
The package private ctor also wraps this map in line 151
this.metadata = metadata == null ? null : Collections.unmodifiableMap(metadata);
Make the ctor argument a UnmodifiableMap and getMetadata() return an UnmodifiableMap and I think that simplifies these maps being wrapped in a map in a map etc
| } | ||
| if (this.metadata == null) { | ||
| this.metadata = new HashMap<>(); | ||
| } |
There was a problem hiding this comment.
Make a copy of the unmodifiable map here
| builder = handleSearchItem(multiSearchResponse.getResponses()[0], modelId, this::parseInferenceDocLenientlyFromSource); | ||
| } catch (ResourceNotFoundException ex) { | ||
| listener.onFailure(new ResourceNotFoundException( | ||
| getTrainedModelListener.onFailure(new ResourceNotFoundException( |
There was a problem hiding this comment.
nit: It is clearer if finalListener is used for these onFailure calls.
| if (includeTotalFeatureImportance == false) { | ||
| finalListener.onResponse(modelBuilders.stream() | ||
| .map(TrainedModelConfig.Builder::build) | ||
| .sorted(Comparator.comparing(TrainedModelConfig::getModelId)) |
There was a problem hiding this comment.
Configs are still sorted by the search aren't they? Sorting here shouldn't be necessary
| .sorted(Comparator.comparing(TrainedModelConfig::getModelId)) | ||
| .collect(Collectors.toList())); | ||
| return; | ||
|
|
| .collect(Collectors.toList())), | ||
| failure -> { | ||
| // total feature importance is not necessary for a model to be valid | ||
| // we shouldn't fail if it is not found |
There was a problem hiding this comment.
👍 makes sense
Is it the case that newer models after version 7.10 will always have the total feature importance?
| 'ml/inference_crud/Test put ensemble with tree where tree has out of bounds feature_names index', | ||
| 'ml/inference_crud/Test put model with empty input.field_names', | ||
| 'ml/inference_crud/Test PUT model where target type and inference config mismatch', | ||
| 'ml/inference_metadata/Test get given missing trained model metadata', |
There was a problem hiding this comment.
I can't find a inference_metadata.yml did you forget to add it?
|
run elasticsearch-ci/2 |
| An object containing metadata about the trained model. For example, models | ||
| An object containing metadata about the trained model. For example, models | ||
| created by {dfanalytics} contain `analysis_config` and `input` objects. | ||
| .Properties of metadata |
There was a problem hiding this comment.
This isn't formatting properly, so I think a continuation character is required.
| .Properties of metadata | |
| + | |
| .Properties of metadata |
…raining metadata (elastic#61922) Adds new flag include to the get trained models API The flag initially has two valid values: definition, total_feature_importance. Consequently, the old include_model_definition flag is now deprecated. When total_feature_importance is included, the total_feature_importance field is included in the model metadata object. Including definition is the same as previously setting include_model_definition=true.
…odel training metadata (#61922) (#62620) * [ML] Add new include flag to GET inference/<model_id> API for model training metadata (#61922) Adds new flag include to the get trained models API The flag initially has two valid values: definition, total_feature_importance. Consequently, the old include_model_definition flag is now deprecated. When total_feature_importance is included, the total_feature_importance field is included in the model metadata object. Including definition is the same as previously setting include_model_definition=true. * fixing test * Update x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/ml/action/GetTrainedModelsRequestTests.java
Adds new flag
includeto the get trained models APIThe flag initially has two valid values: definition, total_feature_importance.
Consequently, the old
include_model_definitionflag is now deprecated.When
total_feature_importanceis included, thetotal_feature_importancefieldis included in the model
metadataobject.Including
definitionis the same as previously settinginclude_model_definition=true.