[ML] Load and evaluate 3rd Party Model #72218
davidkyle merged 11 commits into elastic:feature/pytorch-inference
Conversation
retest this please
Pinging @elastic/ml-core (Team:ML)
benwtrent left a comment
location should be a named object.
The initial one is called index, with a default location indicating the ML inference index. This way the logic for restoring models can be slightly more unified: we always restore a definition from a location object, and each location type knows how to query for its docs and provide them.
Example:
"location": {"index": {"name": ".ml-inference-*"}} // I am not sure if we should include model id
^ The default location for current models (if not shipped in resource files)
Also, within the index location type, we shouldn't supply the model_id. The model id should ALWAYS be the same as the one in the model config. If we allow them to be different, we will needlessly complicate matters.
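The "each location type knows how to query for its docs" idea can be sketched with a small interface. This is an illustrative sketch only; the names here are assumptions, not the actual Elasticsearch classes:

```java
// Hypothetical sketch: restore logic depends only on the location
// abstraction, and each concrete location type knows where its
// definition docs live.
interface TrainedModelLocation {
    // For an index location this is the index to query for definition docs
    String getResourceName();
}

class IndexLocation implements TrainedModelLocation {
    private final String indexName;

    IndexLocation(String indexName) {
        this.indexName = indexName;
    }

    @Override
    public String getResourceName() {
        return indexName;
    }
}

public class LocationSketch {
    public static void main(String[] args) {
        // The default location proposed above for current models
        TrainedModelLocation location = new IndexLocation(".ml-inference-*");
        System.out.println(location.getResourceName()); // prints .ml-inference-*
    }
}
```

A future non-index location type would only need to implement the same interface, leaving the restore path unchanged.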
Is the idea that we could then support a future location that is not an index?
Yes, the hope is that one day Elasticsearch will have a large binary blob storage facility (several people have requested this), and we'll use that for models added after that time.
I like the idea that all trained model configurations have a location, so conceptually there is little difference between the large models uploaded externally and the smaller DFA models. For existing models the location is
In this case the model id is the name of the PyTorch model and may be different for a few reasons.
I'm not sure 2. is a valid use case, as the main option to tweak is the input field and that can be achieved with a field map. Let's keep this open for review; I'd love to prune redundant configuration and will do so if we can't find a use for it once we have more experience.
++ I'm not convinced we will ever have another storage option, but the proposed config allows flexibility and is no more or less readable than the current implementation.
```java
if (hasModelDefinition) {
    trainedModelConfig.setEstimatedHeapMemory(config.getModelDefinition().ramBytesUsed())
        .setEstimatedOperations(config.getModelDefinition().getTrainedModel().estimatedNumOperations());
}
```
🤔 We will have to rethink this. It would be good to somehow indicate the work and size required for a PyTorch model.
++ we will need a measure of the model size in TrainedModelConfig to allocate the persistent task by memory
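Until a PyTorch model can report its own footprint, one rough measure is the total byte length of its stored definition chunks. This is a hypothetical sketch, not the branch's implementation; a real estimate would also account for decompression and runtime overhead:

```java
import java.util.List;

// Hypothetical sketch: approximate a model's heap requirement from the
// stored definition chunks, so the persistent task allocator has a
// number to work with.
public class ModelSizeEstimate {
    static long estimateHeapBytes(List<byte[]> definitionChunks) {
        // Sum the raw chunk sizes as a lower bound on required memory
        return definitionChunks.stream().mapToLong(chunk -> chunk.length).sum();
    }

    public static void main(String[] args) {
        List<byte[]> chunks = List.of(new byte[1024], new byte[2048]);
        System.out.println(estimateHeapBytes(chunks)); // prints 3072
    }
}
```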
```java
TrainedModelConfig trainedModelConfig = getModelResponse.getResources().results().get(0);
if (trainedModelConfig.getModelType() != TrainedModelType.PYTORCH) {
```
I am assuming that this is just for this development spike. Before merging we will allow deploying of all models (I would think)
I thought it was a sane check to perform right now, but it may not make sense in the future.
The feature branch contains changes to configure PyTorch models with a TrainedModelConfig and defines a format to store the binary models. The _start and _stop deployment actions control the model lifecycle, and the model can be evaluated directly with the _infer endpoint. Two types of NLP task are supported: Named Entity Recognition and Fill Mask. The feature branch consists of these PRs: #73523, #72218, #71679, #71323, #71035, #71177, #70713
This adds a `location` field to `TrainedModelConfig` for large models that cannot be PUT inline with the config. If `location` is set then the definition is not required; instead the model will be loaded from the name and index specified in the location object, following the convention used for restoring `TrainedModelDefinitionDoc`s. I've deleted the class `PyTorchModel`, which was the previous attempt to achieve the same by putting the model location in the definition, but the model definition is not as easy to read as it needs to be explicitly requested, and having the location in the config makes more sense both in code and API usage. When a model deployment starts, the model is loaded from the location in the config.

Example model configuration
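A minimal sketch of such a configuration, assuming the index location format discussed earlier in this thread (the `model_id` and `model_type` values are illustrative):

```json
{
  "model_id": "my_pytorch_model",
  "model_type": "pytorch",
  "location": {
    "index": {
      "name": ".ml-inference-*"
    }
  }
}
```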
The `_infer` action has been updated to accept a JSON document which is passed straight through to the inference request without any validation. This is a temporary measure; more thought needs to go into the API design.
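As a sketch of the pass-through behaviour, assuming a deployment-scoped `_infer` URL (the exact path and body fields here are illustrative, not the final API), the raw JSON body is forwarded unmodified to the model:

```
POST _ml/trained_models/my_pytorch_model/deployment/_infer
{
  "input": "some text to evaluate"
}
```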