We take our custom-trained model to the cloud or on-premises to serve anywhere from tens to millions of requests at scale. We explore Flask, Google Cloud ML Engine, TensorFlow Serving, and Kubeflow, comparing the effort involved, the scenarios each one suits, and the cost-benefit trade-offs. Continue reading online.
Go through the code in the following order:
- hello.py: Get a Flask server up and running. We will need to install Flask using `pip install flask`.
- infer.py: Run a simple web application using Flask to serve image classification requests with a Keras model.
- h5_to_pb.ipynb: Convert a pretrained Keras model to a format that is compatible with Google Cloud ML Engine and TensorFlow Serving.
- image-to-json.py: This will produce `request.json`, which holds the image in the format accepted by Google Cloud ML Engine and TensorFlow Serving. A sample `request.json` for the provided sample dog image is also included.
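As a rough idea of what the first step involves, here is a minimal sketch of a Flask server. It is not the repository's hello.py; the route `/`, the response text, the `main()` helper, and the host and port are all assumptions chosen for illustration.

```python
from flask import Flask

# The application object; Flask uses the module name to locate resources.
app = Flask(__name__)


@app.route("/")
def hello():
    # Hypothetical response text, used only for this illustration.
    return "Hello World!"


def main():
    # Starts Flask's built-in development server (not meant for production).
    # Host and port are assumed values.
    app.run(host="127.0.0.1", port=5000)
```

Calling `main()` at the end of the file and running it with Python should start the development server, after which a browser request to http://127.0.0.1:5000/ returns the greeting.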
Please update the path of the h5 model in ADD_H5_MODEL_PATH, and the desired location and model name in ADD_PATH_OF_PB_MODEL.
Use the sample images and models from previous chapters.
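To give a feel for the request file that image-to-json.py produces, the sketch below shows one common way to wrap an image in JSON: base64-encode the raw bytes and place them under a `"b64"` key. This is an assumption, not the repository's script. The input key `image_bytes` and the helper names are hypothetical, and the exact layout depends on how the model was exported (some models expect arrays of pixel values instead).

```python
import base64
import json


def image_to_request(image_bytes: bytes, input_key: str = "image_bytes") -> str:
    """Return one JSON instance holding the base64-encoded image.

    `input_key` is a hypothetical name; it must match the exported
    model's input name.
    """
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return json.dumps({input_key: {"b64": encoded}})


def write_request(image_path: str, request_path: str = "request.json",
                  input_key: str = "image_bytes") -> None:
    """Read an image file and write a single JSON instance, one per line."""
    with open(image_path, "rb") as image_file:
        payload = image_to_request(image_file.read(), input_key)
    with open(request_path, "w") as request_file:
        request_file.write(payload + "\n")
```

For example, `write_request("dog.jpg")` (with a hypothetical file name) would write a one-line `request.json`. A serving endpoint that expects a different input name or an outer wrapper object would need the payload adjusted accordingly.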
If the Google Cloud SDK is not already installed on our machine, we can download and install it from https://cloud.google.com/sdk/install.