
Final Writeup: https://docs.google.com/document/d/13MCJCozhCvYtDuuXtMQoK_tZaCa2ggX9GeZdFvE_0Qk/edit?usp=sharing
Introduction: SqueezeNet was originally proposed as an optimization of AlexNet, achieving comparable accuracy with 50x fewer parameters. We plan to take it one step further by re-implementing SqueezeNet with multiple additional optimizations, including but not limited to network pruning, quantization, weight sharing, and Huffman coding. The goal is to allow even lower-powered machines to run image classification models.
Related Work: SqueezeNet made a big splash in the machine learning community, which has resulted in many tutorial-style articles on how exactly it works. One Medium article details the main features of SqueezeNet, including the Fire modules, the 1x1 filters, and the general architecture. (https://medium.com/@smallfishbigsea/notes-of-squeezenet-4137d51feef4)
Relevant code list:
- Official implementations of SqueezeNet (Under code section) [https://arxiv.org/abs/1602.07360]
- Unofficial implementations of SqueezeNet [https://paperswithcode.com/paper/squeezenet-alexnet-level-accuracy-with-50x#code]
- Unofficial implementations of Deep Compression [https://paperswithcode.com/paper/deep-compression-compressing-deep-neural#code]
Data: Intel Image Classification and Caltech-256 are two datasets we can train our model on (https://www.kaggle.com/jessicali9530/caltech256).
Methodology: The base of our compression work is SqueezeNet, itself a compressed counterpart of AlexNet. SqueezeNet is a convolutional neural network that achieves compression by replacing the majority of 3x3 convolution filters with 1x1 filters, decreasing the number of input channels to the remaining 3x3 filters, and delaying downsampling until later layers in the network. On top of this, we plan to apply further compression techniques, such as network pruning and quantization, intermittently throughout training. Our goal is to reduce network size without compromising benchmark accuracy, so we want to be able to apply these techniques in varying combinations to determine which combinations, or individual techniques, are most impactful. Because we will be implementing multiple compression techniques in tandem, our plans will depend on how each technique is progressively implemented. Attached below is the original architecture in terms of the layers used as well as their dimensions:
[Image1]
[Image2]
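To see where SqueezeNet's savings come from, here is a back-of-the-envelope parameter count (biases ignored) for one Fire module versus a plain 3x3 convolution, using the fire2 configuration from the SqueezeNet paper (96 input channels, 16 squeeze filters, 64 + 64 expand filters):

```python
# Parameter counts showing why the Fire module saves parameters: the
# 1x1 squeeze layer shrinks the channel count before the (partly 3x3)
# expand layer sees it.

def conv_params(in_ch, out_ch, k):
    """Weights in a k x k convolution, biases ignored."""
    return in_ch * out_ch * k * k

def fire_params(in_ch, squeeze, expand1x1, expand3x3):
    return (conv_params(in_ch, squeeze, 1)         # squeeze: all 1x1
            + conv_params(squeeze, expand1x1, 1)   # expand: 1x1 branch
            + conv_params(squeeze, expand3x3, 3))  # expand: 3x3 branch

fire = fire_params(96, 16, 64, 64)   # 11,776 parameters
plain = conv_params(96, 128, 3)      # 110,592 for a plain 3x3 conv, same output channels
```

Roughly a 9x reduction for this layer alone, before any pruning or quantization is applied.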
Metrics: We plan to first run accuracy and runtime experiments on more powerful machines, such as personal or department machines. Afterwards, we will continue applying optimizations to reduce the number of parameters and the file size of the model while keeping its accuracy within 10% of AlexNet. Our benchmark will be the accuracy of AlexNet on our chosen dataset.
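The file-size metric is what Huffman coding targets: after pruning and weight sharing, the stored cluster indices are highly repetitive, so frequent indices can get shorter codes. A minimal stdlib sketch (illustrative only, not our final implementation):

```python
import heapq
from collections import Counter

# Minimal Huffman coder over quantized weight indices: frequent indices
# get short bit strings, shrinking the stored model.

def huffman_codes(symbols):
    """Map each symbol to a prefix-free bit string based on its frequency."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # heap entries: (frequency, tiebreak index, {symbol: code-so-far})
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

indices = [3, 3, 3, 3, 0, 0, 1, 2]  # e.g. cluster indices after weight sharing
codes = huffman_codes(indices)
bits = sum(len(codes[s]) for s in indices)  # 14 bits vs 16 for fixed 2-bit codes
```

Even on this tiny example the skewed index distribution saves bits; on real layer weights the skew (especially toward the zero/pruned index) is much stronger.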
- Base Goal: SqueezeNet + new dataset
- Target Goal: Quantization + network pruning + Deep Compression
- Stretch Goal: Run SqueezeNet with TensorFlow Lite on a Raspberry Pi
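For the stretch goal, TensorFlow Lite's converter handles the Raspberry Pi deployment step. A sketch of the conversion, assuming a trained Keras `model` is already in scope (the `squeezenet.tflite` filename is a placeholder):

```python
import tensorflow as tf

# Convert a trained Keras SqueezeNet to a TFLite flatbuffer; the DEFAULT
# optimization flag enables TFLite's built-in post-training quantization,
# on top of whatever compression we applied during training.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("squeezenet.tflite", "wb") as f:
    f.write(tflite_model)
```

The resulting file can be loaded on the Pi with the lightweight `tflite_runtime` interpreter instead of full TensorFlow.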
Ethics: Q1: What broader societal issues are relevant to your chosen problem space? A1: By reducing runtime and decreasing file size, we make our image classification model accessible even to those who cannot afford high-end machines. Increasing accessibility to these powerful neural network models is important for fighting inequality in access to technological power. Furthermore, we can reduce the environmental impact of these models by reducing power consumption, a direct result of fewer parameters and faster runtimes.
However, one concern is that shrinking model sizes allows AI models to run on edge-computing devices, making widespread AI-powered surveillance (e.g. facial recognition) a reality once computing power is no longer a bottleneck.
Q2: Who are the major “stakeholders” in this problem, and what are the major consequences of mistakes made by your algorithm? A2: The major “stakeholders” in this problem are the innovators out in the wild who are currently prevented from utilizing deep learning due to high hardware requirements. If our algorithm mistakenly reduces accuracy or increases runtime by a significant amount, it can lead to poorer performance in the tools that future entrepreneurs use, and in turn less growth and interest in deep learning tools.
Division of Labor: The model, basic architecture, and preprocessing will likely be a group effort. We will then split responsibility for the optimizations among group members once we can evaluate how much work each optimization requires.
Built With
- python
- tensorflow

