
Final Writeup: https://docs.google.com/document/d/13MCJCozhCvYtDuuXtMQoK_tZaCa2ggX9GeZdFvE_0Qk/edit?usp=sharing
Introduction: SqueezeNet was originally proposed as an optimization of AlexNet, achieving comparable accuracy with 50x fewer parameters. We plan to take it one step further by re-implementing SqueezeNet with multiple additional optimizations, including but not limited to network pruning, quantization, weight sharing, and Huffman coding. The goal is to allow even lower-powered machines to run image classification models.
Related Work: SqueezeNet made a big splash in the machine learning community, which has resulted in many tutorial-style articles on how exactly it works. One Medium article details the main features of SqueezeNet, including the Fire modules, the 1x1 filters, and the general architecture. (https://medium.com/@smallfishbigsea/notes-of-squeezenet-4137d51feef4)
Relevant code list:
- Official implementations of SqueezeNet (Under code section) [https://arxiv.org/abs/1602.07360]
- Unofficial implementations of SqueezeNet [https://paperswithcode.com/paper/squeezenet-alexnet-level-accuracy-with-50x#code]
- Unofficial implementations of Deep Compression [https://paperswithcode.com/paper/deep-compression-compressing-deep-neural#code]
Data: Intel Image Classification and Caltech-256 are two datasets we can train our model on (https://www.kaggle.com/jessicali9530/caltech256).
Methodology: The base of our compression work is SqueezeNet, itself a compressed counterpart of AlexNet. SqueezeNet is a convolutional neural network that achieves compression by replacing the majority of 3x3 convolution filters with 1x1 filters, decreasing the number of input channels to the remaining 3x3 filters, and delaying downsampling until later layers in the network. On top of this, we plan to apply further compression techniques, such as network pruning and quantization, intermittently throughout training. Our goal is to reduce network size without compromising benchmark accuracy, so we want to be able to apply these techniques in varying combinations to determine which combinations, or individual techniques, are most impactful. Because we will be implementing multiple compression techniques in tandem, our plans will depend on how each technique is progressively implemented. Attached below is the original architecture in terms of the layers used as well as their dimensions:
[Image1]
[Image2]
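To see where SqueezeNet's savings come from, here is a back-of-the-envelope parameter count (biases ignored) for one Fire module versus a plain 3x3 convolution, using the fire2 configuration from the SqueezeNet paper (96 input channels, 16 squeeze filters, 64 + 64 expand filters):

```python
# Parameter counts showing why the Fire module saves parameters: the
# 1x1 squeeze layer shrinks the channel count before the (partly 3x3)
# expand layer sees it.

def conv_params(in_ch, out_ch, k):
    """Weights in a k x k convolution, biases ignored."""
    return in_ch * out_ch * k * k

def fire_params(in_ch, squeeze, expand1x1, expand3x3):
    return (conv_params(in_ch, squeeze, 1)         # squeeze: all 1x1
            + conv_params(squeeze, expand1x1, 1)   # expand: 1x1 branch
            + conv_params(squeeze, expand3x3, 3))  # expand: 3x3 branch

fire = fire_params(96, 16, 64, 64)   # 11,776 parameters
plain = conv_params(96, 128, 3)      # 110,592 for a plain 3x3 conv, same output channels
```

Roughly a 9x reduction for this layer alone, before any pruning or quantization is applied.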
Metrics: We plan to first run accuracy and runtime experiments on more powerful machines, such as personal or department machines. Afterwards, we will continue applying optimizations to reduce the number of parameters and the file size of the model while keeping its accuracy within 10% of AlexNet. Our benchmark will be the accuracy of AlexNet on our chosen dataset.
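The file-size metric is what Huffman coding targets: after pruning and weight sharing, the stored cluster indices are highly repetitive, so frequent indices can get shorter codes. A minimal stdlib sketch (illustrative only, not our final implementation):

```python
import heapq
from collections import Counter

# Minimal Huffman coder over quantized weight indices: frequent indices
# get short bit strings, shrinking the stored model.

def huffman_codes(symbols):
    """Map each symbol to a prefix-free bit string based on its frequency."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # heap entries: (frequency, tiebreak index, {symbol: code-so-far})
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

indices = [3, 3, 3, 3, 0, 0, 1, 2]  # e.g. cluster indices after weight sharing
codes = huffman_codes(indices)
bits = sum(len(codes[s]) for s in indices)  # 14 bits vs 16 for fixed 2-bit codes
```

Even on this tiny example the skewed index distribution saves bits; on real layer weights the skew (especially toward the zero/pruned index) is much stronger.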
- Base Goal: SqueezeNet + new dataset
- Target Goal: Quantization + network pruning + Deep Compression
- Stretch Goal: Run SqueezeNet with TensorFlow Lite on a Raspberry Pi
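For the stretch goal, TensorFlow Lite's converter handles the Raspberry Pi deployment step. A sketch of the conversion, assuming a trained Keras `model` is already in scope (the `squeezenet.tflite` filename is a placeholder):

```python
import tensorflow as tf

# Convert a trained Keras SqueezeNet to a TFLite flatbuffer; the DEFAULT
# optimization flag enables TFLite's built-in post-training quantization,
# on top of whatever compression we applied during training.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("squeezenet.tflite", "wb") as f:
    f.write(tflite_model)
```

The resulting file can be loaded on the Pi with the lightweight `tflite_runtime` interpreter instead of full TensorFlow.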
Ethics: Q1: What broader societal issues are relevant to your chosen problem space? A1: By reducing runtime and decreasing file size, we make our image classification model accessible even to those who cannot afford high-end machines. Increasing accessibility to these powerful neural network models is important for fighting inequality in access to technological power. Furthermore, we can reduce the environmental impact of these models by reducing power consumption, a direct result of fewer parameters and faster runtimes.
However, one concern is that shrinking model sizes allows AI models to run on edge-computing devices, making widespread AI-powered surveillance (e.g. facial recognition) a reality once computing power is no longer a bottleneck.
Q2: Who are the major “stakeholders” in this problem, and what are the major consequences of mistakes made by your algorithm? A2: The major “stakeholders” in this problem are the innovators out in the wild who are currently prevented from utilizing deep learning due to high hardware requirements. If our algorithm mistakenly reduces accuracy or increases runtime by a significant amount, it can lead to poorer performance in the tools that future entrepreneurs use, and in turn less growth and interest in deep learning tools.
Division of Labor: The model, basic architecture, and preprocessing will likely be a group effort. We will then split responsibility for the optimizations among group members once we can evaluate how much work each optimization requires.
Built With
- python
- tensorflow

