[ML] Merge the pytorch-inference feature branch #73660
Initial start/stop trained model deployment actions.
Adds the model_type field to TrainedModelConfig for distinguishing between models that can be loaded via the model loading service and those that require a native process.
This adds a temporary API for doing inference against a trained model deployment.
Introduces code for reassembling the individual chunks a model is stored in and streaming those chunks to the inference process. Reuses the TrainedModelDefinitionDoc format already defined for boosted tree models.
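As a rough illustration of the reassembly idea (not the code in this PR), the sketch below orders chunk documents, decodes each one, and writes it to a sink. The field names `doc_num` and `definition` and the helper names are assumptions made for the example, and base64 chunk encoding is assumed as well.

```python
import base64
import io
from typing import Iterable, Iterator, Mapping


def iter_model_chunks(docs: Iterable[Mapping]) -> Iterator[bytes]:
    """Yield decoded chunks in doc_num order, failing on a missing chunk.

    `doc_num` and `definition` are hypothetical field names for this sketch.
    """
    expected = 0
    for doc in sorted(docs, key=lambda d: d["doc_num"]):
        if doc["doc_num"] != expected:
            raise ValueError(f"missing chunk {expected}")
        yield base64.b64decode(doc["definition"])
        expected += 1


def stream_model(docs: Iterable[Mapping], sink) -> int:
    """Write every chunk to `sink` (any object with write()) and return the byte count."""
    total = 0
    for chunk in iter_model_chunks(docs):
        sink.write(chunk)
        total += len(chunk)
    return total
```

Here `sink` stands in for the stream to the inference process; an `io.BytesIO()` is enough to try it out. Only one decoded chunk is held at a time, which is the point of streaming rather than concatenating first.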
Binary data is stored in Lucene base64 encoded. Held in a Java string, the same data uses 2 bytes (UTF-16) per base64 character, consuming twice the memory required. The compressed binary representation of the models can be stored more efficiently in BytesReference objects. For BWC, a new field mapping binary_definition is added to .ml-inference-* and the index version is incremented.
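The memory argument can be checked with a little arithmetic. This sketch follows the description's stated assumption of 2 bytes per character for a string in UTF-16 and ignores any JVM object overhead; `storage_cost` is a hypothetical helper written for this example.

```python
import base64


def storage_cost(raw: bytes) -> dict:
    """Compare the size of raw bytes, their base64 text, and that text in UTF-16."""
    b64 = base64.b64encode(raw)
    return {
        "raw_bytes": len(raw),
        # base64 emits 4 ASCII characters for every 3 input bytes (padded).
        "base64_chars": len(b64),
        # Each base64 character takes 2 bytes when encoded as UTF-16.
        "utf16_bytes": len(b64.decode("ascii").encode("utf-16-le")),
    }
```

For a 3000-byte input this gives 4000 base64 characters and 8000 bytes as UTF-16, so the string form costs twice the base64 byte form and roughly 2.7 times the raw binary.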
This adds a location field to TrainedModelConfig for large models that cannot be PUT inline with the config. Large models are reassembled from their location.
Adds tokenisation for BERT models via the WordPiece algorithm, using the vocabulary defined with the model, and introduces the concept of NLP tasks. Each task is configured with a BERT model supporting that task; pre-processing and post-processing are defined by the task. Named Entity Recognition and Fill Mask are the two task types supported by this PR.
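For readers unfamiliar with WordPiece, here is a minimal sketch of the usual greedy longest-match-first scheme. It is not the tokeniser from this PR: the `##` continuation prefix and `[UNK]` token follow common BERT conventions, the whitespace-only splitting is a simplification, and the function names are made up for the example.

```python
from typing import List, Set


def wordpiece(word: str, vocab: Set[str], unk: str = "[UNK]", prefix: str = "##") -> List[str]:
    """Split one word into the longest vocabulary pieces, left to right."""
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Shrink the candidate from the right until it is in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = prefix + candidate  # mark a non-initial piece
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no piece matched, so the whole word is unknown
        tokens.append(piece)
        start = end
    return tokens


def tokenize(text: str, vocab: Set[str]) -> List[str]:
    """Whitespace-split the text and apply WordPiece to each word."""
    return [tok for word in text.split() for tok in wordpiece(word, vocab)]
```

With a vocabulary containing `un`, `##aff` and `##able`, the word `unaffable` splits into those three pieces, while a word with no matching pieces maps to `[UNK]`.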
Pinging @elastic/ml-core (Team:ML)
Pinging @elastic/clients-team (Team:Clients)
sethmlarson left a comment
Early review comments, very excited for this functionality.
rest-api-spec/src/main/resources/rest-api-spec/api/ml.start_deployment.json
Outdated
rest-api-spec/src/main/resources/rest-api-spec/api/ml.start_deployment.json
Outdated
rest-api-spec/src/main/resources/rest-api-spec/api/ml.stop_deployment.json
Outdated
Thanks for jumping in with an early review @sethmlarson
👍 This makes sense to me, I'll raise it with the team.
I've missed out the spec of the
These APIs may be in flux for a short while as we work through all the use cases. Is that a problem for the clients team? Would you prefer us to tell you when we have settled on something we like?
@davidkyle It's no problem for us that these APIs may change, especially if they're experimental/on
sethmlarson left a comment
Looks good from an API spec perspective 🎉 One comment I was unsure about.
The feature branch contains changes to configure PyTorch models with a TrainedModelConfig and defines a format to store the binary models. The _start and _stop deployment actions control the model lifecycle, and the model can be directly evaluated with the _infer endpoint. Two types of NLP tasks are supported: Named Entity Recognition and Fill Mask.
The feature branch consists of these PRs: