what is reducible❓
Reducible compresses the parameters of a neural network using a lossy compression technique called rank-k approximation.
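To make the idea concrete, here is a minimal NumPy sketch of rank-k approximation via truncated SVD (the Eckart–Young theorem says this is the best rank-k approximation in the Frobenius norm). The shapes and rank below are just illustrative, not the ones Reducible uses:

```python
import numpy as np

def rank_k_approx(W, k):
    """Return factors of the best rank-k approximation of W (Eckart-Young)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Keep only the k largest singular values and their vectors.
    return U[:, :k], s[:k], Vt[:k, :]

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 128))   # a hypothetical weight matrix
U, s, Vt = rank_k_approx(W, 32)
W_approx = (U * s) @ Vt               # reconstruct U @ diag(s) @ Vt
```

Storing the factors `U`, `s`, `Vt` takes k·(m + n + 1) numbers instead of m·n, which is where the compression comes from.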
our inspiration 🤩
Our idea came from thinking about how awesome JPEG is: it throws out a ton of an image's data, yet reconstructs the image in a way that looks nearly identical to the human eye. For this hackathon, we wondered how we could use this idea to compress not just images, but neural networks. The sheer size of many deep learning models makes them impractical for everyday use on devices with limited storage and memory, so nearly-as-accurate models at half the size could be game-changing for mobile and edge computing, IoT devices, and the design of future models in general.
how we built it 🛠️
We created a custom TensorFlow layer that acts similarly to a Dense layer but instead stores a rank-k approximation of the matrix.
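Our actual layer isn't reproduced here, but a minimal sketch of the approach, assuming a standard `tf.keras.layers.Layer` subclass that stores the two low-rank factors as its trainable weights, might look like this:

```python
import tensorflow as tf

class RankKDense(tf.keras.layers.Layer):
    """Dense-like layer storing W as low-rank factors U (m x k) and V (k x n)."""

    def __init__(self, units, rank, activation=None, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.rank = rank
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        m = int(input_shape[-1])
        # k*(m + n) weight parameters instead of m*n.
        self.u = self.add_weight(name="u", shape=(m, self.rank),
                                 initializer="glorot_uniform")
        self.v = self.add_weight(name="v", shape=(self.rank, self.units),
                                 initializer="glorot_uniform")
        self.b = self.add_weight(name="b", shape=(self.units,),
                                 initializer="zeros")

    def call(self, x):
        # Compute x @ (U V) as (x @ U) @ V so W is never materialized.
        return self.activation(x @ self.u @ self.v + self.b)
```

A layer like this can drop into a `tf.keras.Sequential` model wherever a `Dense` layer would go, trading a little expressiveness for a much smaller parameter count.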
challenges 😰
TensorFlow had issues running on our Macs, so we had to use online compute. We also had to design a custom file format for loading our compressed model, because TensorFlow would not serialize our objects correctly.
results 💪
Our network has ~66% fewer parameters while only suffering from a ~0.45% decrease in accuracy!
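The exact reduction depends on each layer's shape and chosen rank. As a hypothetical worked example (these dimensions are illustrative, not our actual architecture), a rank-k factorization of an m-by-n layer pays off whenever k < m·n / (m + n):

```python
def dense_params(m, n):
    # A full Dense layer: m*n weights plus n biases.
    return m * n + n

def rank_k_params(m, n, k):
    # Low-rank factors U (m x k) and V (k x n), plus n biases.
    return k * (m + n) + n

# Hypothetical 784 -> 512 layer compressed at rank 128:
full = dense_params(784, 512)        # 401,920 parameters
low = rank_k_params(784, 512, 128)   # 166,400 parameters
reduction = 1 - low / full           # roughly 59% fewer parameters
```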
what we learned 🤓
We learned a great deal about the math behind SVD (singular value decomposition), as well as how to work with the TensorFlow backend API and make custom Layers. We also learned about object serialization and common space optimizations.
what's next 🌎
Our current optimization algorithm assumes that a higher rank always yields a better approximation, which our experiments show is mostly, but not always, true. Choosing the best rank for every layer under a size budget is an integer programming problem, which is NP-hard, but we want to explore efficient approximate algorithms.
We also want to optimize our custom file type with lossless serialization. Currently, our model has a predicted parameter reduction of ~66% but an actual file-size reduction of only ~16%: although our compressed matrices have ~66% fewer parameters, our custom format lacks the serialization optimizations TensorFlow uses to bring its file sizes down so dramatically.