The initial focus for adding ML components to integration packages will be on including anomaly detection job configurations. The second phase, however, will look at adding assets for pre-trained models, such as classification models that detect domains produced by domain generation algorithms (DGAs) in security data.
The assets would include an ingest pipeline and the pre-trained model, which typically ranges in size from tens of MB up to several GB. There might also be an associated data stream, for example one containing security data.
Because the pre-trained model can be very large, it should be downloaded and installed on demand, separately from the dependent package. The model could simply be downloaded via a link rather than being shipped as part of a package.
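As a rough illustration of the on-demand idea, the sketch below has a package carry only a small reference to the model (ID, version, link) and fetches the model the first time it is needed. The names `ModelReference`, `ModelStore`, and `ensure_installed` are hypothetical and not part of any existing package manager API; the downloader is injected so that the transport is left open.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass(frozen=True)
class ModelReference:
    """Hypothetical pointer a package could ship instead of the model itself."""
    model_id: str
    version: str
    url: str


@dataclass
class ModelStore:
    """Hypothetical store that installs a model only when it is first requested."""
    fetch: Callable[[str], bytes]  # injected downloader (assumption: returns raw model bytes)
    installed: Dict[str, bytes] = field(default_factory=dict)

    def ensure_installed(self, ref: ModelReference) -> bool:
        """Download and install the model if missing; return True if a download happened."""
        key = f"{ref.model_id}@{ref.version}"
        if key in self.installed:
            return False
        self.installed[key] = self.fetch(ref.url)
        return True
```

With this shape, installing the package stays cheap, and the potentially multi-GB download is only paid when the user actually chooses to deploy the model.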
The trained model will have a license type, but this is not contained in the model schema. Currently a platinum license is required to use the create trained model API, so the user should not be able to deploy the model if they don’t have this license type.
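A minimal sketch of such a license gate is shown below. Since the license type is not part of the model schema, the required level is passed in separately. The function name `can_deploy_model`, the ordering of license levels, and the treatment of a trial license as sufficient are all assumptions made for illustration.

```python
# Assumed ordering of license levels, lowest to highest (illustrative only).
LICENSE_RANK = {"basic": 0, "standard": 1, "gold": 2, "platinum": 3, "enterprise": 4}


def can_deploy_model(cluster_license: str, required_license: str = "platinum") -> bool:
    """Return True if the cluster license is at least the level the model requires."""
    current = cluster_license.strip().lower()
    required = required_license.strip().lower()
    if current == "trial":
        # Assumption: a trial license unlocks all features.
        return True
    if current not in LICENSE_RANK or required not in LICENSE_RANK:
        # Unknown license types are rejected rather than guessed at.
        return False
    return LICENSE_RANK[current] >= LICENSE_RANK[required]
```

Running a check like this before the download starts would avoid fetching a large model that the user cannot deploy.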
Models are likely to be updated over time, and the user should be able to upgrade them. Models will be versioned, and a model's version may differ from the version of the dependent package. If possible, we should fit in with the current package manager upgrade mechanism.
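To illustrate how a model version could be tracked independently of the package version, the sketch below compares versions as `major.minor.patch` triples. The version format and the helper names `parse_version` and `model_upgrade_available` are assumptions; the actual scheme would follow whatever the package manager already uses.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse a 'major.minor.patch' string into a tuple that compares numerically."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def model_upgrade_available(installed_model: str, latest_model: str) -> bool:
    """Return True if the latest published model is newer than the installed one."""
    return parse_version(latest_model) > parse_version(installed_model)


# The package and the model carry separate versions (hypothetical values).
package_version = "1.4.0"
installed_model_version = "2.0.1"
latest_model_version = "2.1.0"
upgrade = model_upgrade_available(installed_model_version, latest_model_version)
```

Comparing parsed tuples rather than raw strings avoids ordering mistakes such as treating `2.10.0` as older than `2.9.0`.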
We are targeting the 8.0 time frame for this second phase of the ML integration packages work.