This project demonstrates a complete MLOps pipeline for citrus leaf disease classification using Kubeflow on GCP. The system covers model training, evaluation, and deployment via a Flask-based inference service running in a Kubernetes cluster.
.
├── README.md # Project overview and documentation
├── inference/ # Flask app for inference (Dockerized)
│ ├── Dockerfile
│ ├── app.py
│ ├── requirements.txt
│ └── templates/
├── jupiter_notebooks
│ ├── CNN.ipynb # Jupyter notebook for CNN
│ └── MobileNetV2.ipynb # Jupyter notebook for MobileNetV2
├── manifests/ # Kubeflow manifests for installation
│ ├── LICENSE, README.md, etc.
│ └── apps/, common/, scripts/, etc.
├── pipeline_code/ # Kubeflow pipeline components and compiler code
│ ├── kubeflow-pipeline-cnn.py # Pipeline that trains a CNN model
│ ├── kubeflow-pipeline-save.py # Extended pipeline with model saving
│ ├── citrus_pipeline.yaml # Compiled pipeline yaml file
│ ├── citrus_pipeline-save.yaml # Compiled pipeline with model saving yaml file
│ └── requirements.txt
└── service_yaml_files/ # Kubernetes service configurations
  ├── lb.yaml # Load balancer for inference service
  ├── inference-deployment.yaml # Deployment for Flask inference pod
  ├── svc-disk.yaml # PVC service configuration
  ├── citrus-shell.yaml # Shell job/service
  └── default_pod_config.yaml # Pod config to mount PVC in Kubeflow

Use the files under manifests/ to install Kubeflow on a GCP Standard GKE cluster.
Navigate to pipeline_code/ and compile the pipeline:
python3 kubeflow-pipeline-cnn.py # For training-only version
python3 kubeflow-pipeline-save.py # For version with model saving

Upload the resulting YAML files to the Kubeflow Pipelines UI.
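
The compile step can also be scripted so that both pipeline variants are built in one pass and a missing output is caught before upload. The following is a minimal sketch, assuming each script writes the YAML file listed in the project tree into pipeline_code/; the script-to-YAML pairing is inferred from the file names, and the `compile_all` helper and `PIPELINES` mapping are illustrative names that are not part of the repository.

```python
import subprocess
import sys
from pathlib import Path

# Script -> expected compiled output. The pairing is an assumption
# inferred from the file names in the project tree.
PIPELINES = {
    "kubeflow-pipeline-cnn.py": "citrus_pipeline.yaml",
    "kubeflow-pipeline-save.py": "citrus_pipeline-save.yaml",
}


def compile_all(pipeline_dir, runner=subprocess.run):
    """Run each compile script and return the paths of the expected YAML files.

    `runner` is injectable so the helper can be exercised without kfp installed.
    """
    pipeline_dir = Path(pipeline_dir)
    outputs = []
    for script, expected_yaml in PIPELINES.items():
        # Use the current interpreter so the active environment's kfp is used.
        runner([sys.executable, script], cwd=pipeline_dir, check=True)
        yaml_path = pipeline_dir / expected_yaml
        # Confirm the expected YAML exists before reporting success.
        if not yaml_path.is_file():
            raise FileNotFoundError(f"{script} did not produce {expected_yaml}")
        outputs.append(yaml_path)
    return outputs
```

Calling `compile_all("pipeline_code")` from the repository root would run both scripts in order and return the two YAML paths that are then uploaded to the Kubeflow Pipelines UI.
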
Build and deploy the Flask-based inference service:
cd inference/
docker build -t image_name .

Apply the Kubernetes configs from service_yaml_files/:
kubectl apply -f inference-deployment.yaml
kubectl apply -f lb.yaml

The jupyter_notebooks/ directory contains exploratory and prototyping notebooks:
- CNN.ipynb: Implements a basic convolutional neural network for citrus leaf disease classification.
- MobileNetV2.ipynb: Uses a transfer learning approach leveraging the MobileNetV2 architecture for improved accuracy and training efficiency.

These notebooks were used to prototype models before integrating them into the Kubeflow pipeline.

- default_pod_config.yaml ensures all Kubeflow components have access to the mounted PVC.
- svc-disk.yaml defines the Persistent Volume Claim for saving model artifacts.
- lb.yaml exposes the Flask app externally using a LoadBalancer service.
- The trained model is stored in the PVC and read by the inference pod at runtime.
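
Because the model is read from the PVC at runtime, the inference pod has to locate a model file on the mounted volume when it starts. The following is a minimal sketch of that lookup, not the logic in app.py. It assumes the mount directory is supplied through a `MODEL_DIR` environment variable and that models are saved as `.keras` or `.h5` files; the variable name, the default path `/mnt/models`, and the `find_latest_model` helper are all hypothetical.

```python
import os
from pathlib import Path

# Hypothetical defaults; the actual mount path depends on the deployment manifest.
DEFAULT_MODEL_DIR = "/mnt/models"
MODEL_SUFFIXES = (".keras", ".h5")


def find_latest_model(model_dir=None):
    """Return the most recently modified model file under the PVC mount."""
    root = Path(model_dir or os.environ.get("MODEL_DIR", DEFAULT_MODEL_DIR))
    if not root.is_dir():
        raise FileNotFoundError(f"model directory not mounted: {root}")
    # Only consider files with a recognised model extension.
    candidates = [p for p in root.iterdir() if p.suffix in MODEL_SUFFIXES]
    if not candidates:
        raise FileNotFoundError(f"no model artifact found in {root}")
    # Newest file wins, so a later pipeline run is picked up
    # the next time the pod starts.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

In a Flask service, the returned path would typically be passed to `tf.keras.models.load_model` once at startup rather than on every request, and failing fast with a clear error makes an unmounted or empty PVC easy to spot in the pod logs.
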
- Python 3.12.0
- TensorFlow 2.11
- Kubeflow Pipelines SDK V2 (2.7.0)
- Kubernetes (GKE Standard Cluster)
- Flask