Using Autoencoders for Image Compression
This project mainly follows the TensorFlow Autoencoders tutorial (more info at: https://www.tensorflow.org/tutorials/generative/autoencoder)
The notebook contains the steps taken to compress a 28x28 image into a 7x7 array that occupies roughly 0.5x the space of the original image.
Potential applications:
- Data transfer: video calls or movie streaming in low-bandwidth conditions
- Data storage: storing data, especially data that is not retrieved daily and where resolution isn't an issue
The dataset I used for this project is the Flickr-Faces-HQ Dataset (FFHQ), and all the images produced by me as test results are published under the Creative Commons BY-NC-SA 4.0 license.
Flickr-Faces-HQ (FFHQ) is a high-quality image dataset of human faces, originally created as a benchmark for generative adversarial networks (GAN):
A Style-Based Generator Architecture for Generative Adversarial Networks
Tero Karras (NVIDIA), Samuli Laine (NVIDIA), Timo Aila (NVIDIA)
https://arxiv.org/abs/1812.04948
To use the pretrained autoencoder model, refer to the "Load Pretrained Model and Run Inference" section in the notebook file.
The notebook file AutoEncoderImageCompression.ipynb contains the steps from loading the dataset to training an autoencoder to reproduce the results below. It is run as a Google Colab notebook.
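As a rough sketch, a convolutional autoencoder in the style of the TensorFlow tutorial that maps a 28x28 grayscale input to a 7x7 embedding could look like the following (the layer sizes and channel counts here are assumptions for illustration, not taken from the notebook):

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

class Autoencoder(tf.keras.Model):
    """Hypothetical conv autoencoder: 28x28x1 input -> 7x7x8 latent -> 28x28x1 output."""

    def __init__(self):
        super().__init__()
        self.encoder = tf.keras.Sequential([
            layers.Input(shape=(28, 28, 1)),
            # stride-2 convolutions halve spatial size: 28 -> 14 -> 7
            layers.Conv2D(16, 3, strides=2, padding='same', activation='relu'),
            layers.Conv2D(8, 3, strides=2, padding='same', activation='relu'),
        ])
        self.decoder = tf.keras.Sequential([
            # transposed convolutions upsample back: 7 -> 14 -> 28
            layers.Conv2DTranspose(16, 3, strides=2, padding='same', activation='relu'),
            layers.Conv2DTranspose(1, 3, strides=2, padding='same', activation='sigmoid'),
        ])

    def call(self, x):
        return self.decoder(self.encoder(x))
```

Trained with an `adam` optimizer and a mean-squared-error loss on images scaled to [0, 1], as in the tutorial, the encoder output can be stored in place of the full image and decoded later.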
- To create the dataset, use the GitHub repository of the Flickr-Faces-HQ Dataset (FFHQ), which provides a script to download all the images; I used it to download only the thumbnail (128x128) images.
- All the image files are then collected into a single folder using the collect_dataset_from_sub_dirs.py file.
- The image files are then converted to grayscale and resized to 28x28 using the resize_and_save_gray.py file.
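The grayscale-and-resize step could be sketched as follows with Pillow (the function name and argument names are mine, not the actual contents of resize_and_save_gray.py):

```python
import os
from PIL import Image

def resize_and_gray(src_dir, dst_dir, size=(28, 28)):
    """Convert every image in src_dir to grayscale, resize it, and save to dst_dir."""
    os.makedirs(dst_dir, exist_ok=True)
    for name in os.listdir(src_dir):
        if not name.lower().endswith(('.png', '.jpg', '.jpeg')):
            continue
        img = Image.open(os.path.join(src_dir, name)).convert('L')  # 'L' = 8-bit grayscale
        img = img.resize(size, Image.LANCZOS)
        img.save(os.path.join(dst_dir, name))
```

A high-quality filter such as LANCZOS keeps the small thumbnails from aliasing badly at 28x28.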
The 28x28 input image is converted to an embedding, which is then passed through a decoder to obtain the reconstructed image.
The compression ratio is about 2.0x.
The 128x128 input image is converted to an embedding matrix, which is then passed through a decoder to obtain the reconstructed image.
The compression ratio is about 4.0x.
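The stated ratios can be sanity-checked by comparing element counts between the input and the stored embedding; the latent channel counts below are assumptions chosen only to make the arithmetic concrete, since the notebook's exact latent shapes are not given here:

```python
from math import prod

def compression_ratio(input_shape, latent_shape):
    """Ratio of element counts, assuming input and latent use the same dtype."""
    return prod(input_shape) / prod(latent_shape)

# A 7x7 latent with 8 channels for a 28x28 input reproduces the ~2.0x figure:
print(compression_ratio((28, 28), (7, 7, 8)))       # 784 / 392 = 2.0
# For 128x128 inputs, any latent with a quarter of the elements
# (e.g. a hypothetical 32x32x4) gives the ~4.0x figure:
print(compression_ratio((128, 128), (32, 32, 4)))   # 16384 / 4096 = 4.0
```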

