AutoML is a lightweight library to create ML models in a data-centric AI way:
- Label on Kili
- Train a model with AutoML and evaluate its performance in one line of code
- Push predictions to Kili to accelerate the labeling in one line of code
- Prioritize labeling on Kili to label the data that will improve your model the most first
Iterate.
Once you are satisfied with the performance, serve the model in one line of code and monitor its performance while keeping a human in the loop with Kili.
git clone https://github.com/kili-technology/automl.git
cd automl
git submodule update --init

then

pip install -r requirements.txt -r utils/ultralytics/yolov5/requirements.txt

We made AutoML very simple to use. The main methods are:
python train.py \
--api-key $KILI_API_KEY \
--project-id $KILI_PROJECT_ID

Retrieve the annotated data from the project and specialize the best model among the following ones on each task:
- Hugging Face (NER, Text Classification)
- YOLOv5 (Object Detection)
- spaCy (coming soon)
- Simple Transformers (coming soon)
- Catalyst (coming soon)
- XGBoost & LightGBM (coming soon)
Compute model loss to infer when you can stop labeling.
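As a rough illustration of how a loss curve can signal when to stop labeling, the sketch below checks whether the evaluation loss has stopped improving across labeling iterations. This is not AutoML's own logic: the function name `loss_has_plateaued`, the `window` size, and the `min_rel_improvement` threshold are all hypothetical choices made for this example.

```python
from typing import Sequence


def loss_has_plateaued(
    losses: Sequence[float],
    window: int = 3,
    min_rel_improvement: float = 0.01,
) -> bool:
    """Return True when recent iterations no longer reduce the loss much.

    `losses` holds one evaluation loss per labeling iteration, oldest first.
    All names and default values here are illustrative assumptions.
    """
    # Not enough history to compare the recent window against a reference.
    if len(losses) < window + 1:
        return False

    # Loss just before the recent window, and the best loss inside it.
    reference = losses[-(window + 1)]
    best_recent = min(losses[-window:])

    # A non-positive reference loss leaves nothing to improve on.
    if reference <= 0:
        return True

    relative_improvement = (reference - best_recent) / reference
    return relative_improvement < min_rel_improvement


if __name__ == "__main__":
    history = [1.0, 0.5, 0.4, 0.399, 0.398, 0.3985]
    print("Stop labeling?", loss_has_plateaued(history))
```

A plateau only suggests that more labels of the same kind are yielding diminishing returns; whether the resulting performance is good enough remains a judgment call.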
python predict.py \
--api-key $KILI_API_KEY \
--project-id $KILI_PROJECT_ID

Use trained models to push pre-annotations onto unlabeled assets. Typically speeds up labeling by 10% with each iteration.
Where is the model confident or confused today?
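One common way to quantify this is least-confidence sampling, the method named in the `prioritize.py` command below. The following is a minimal sketch of the general technique, assuming each unlabeled asset comes with a list of predicted class probabilities. It does not reproduce AutoML's internal implementation, and the helper names `least_confidence` and `rank_by_uncertainty` are hypothetical.

```python
from typing import List, Sequence


def least_confidence(probs: Sequence[float]) -> float:
    """Normalized least-confidence score for one prediction.

    Returns 0.0 when the top class has probability 1 (fully confident)
    and 1.0 when the distribution is uniform (maximally confused).
    """
    n_classes = len(probs)
    if n_classes < 2:
        raise ValueError("need at least two class probabilities")
    return (1.0 - max(probs)) * n_classes / (n_classes - 1)


def rank_by_uncertainty(predictions: Sequence[Sequence[float]]) -> List[int]:
    """Return asset indices ordered from most to least uncertain."""
    scores = [least_confidence(p) for p in predictions]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    # Hypothetical predicted probabilities for three unlabeled assets.
    predictions = [[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]]
    print(rank_by_uncertainty(predictions))
```

Assets at the front of the ranking are the ones where the model's top prediction is weakest, so labeling them first is expected to be the most informative.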
python prioritize.py \
--api-key $KILI_API_KEY \
--project-id $KILI_PROJECT_ID \
--sampling uncertainty \
--method least-confidence-sampling

How can we sample the optimal unlabeled data points for human review?
python prioritize.py \
--api-key $KILI_API_KEY \
--project-id $KILI_PROJECT_ID \
--sampling diversity \
--method model-based-outlier

Note: for image classification projects only.
python label_errors.py \
--api-key $KILI_API_KEY \
--project-id $KILI_PROJECT_ID

python serve.py \
--api-key $KILI_API_KEY \
--project-id $KILI_PROJECT_ID

Serve trained models while pushing assets and predictions to Kili for continuous labeling. This lets you monitor model drift.
AutoML is a utility library that trains and serves models. It is your responsibility to determine whether the model performance is high enough or not.
Don't hesitate to contribute!


