Inspiration

Deep generative models can learn to generate high-quality samples from a probability distribution after seeing only examples drawn from it. In this project, we implement a recently proposed modification of the generative adversarial network (GAN) that aims to capture the data distribution better than the vanilla GAN.

What it does

Given random Gaussian noise, our neural network transforms it into a meaningful output image! Our outputs are size-3 batches of handwritten digits; to represent a batch, we stack its images across the color channel. The original intent was to represent a wide range of celebrity faces and be able to "celebrity-ify" an arbitrary input face. Mathematically, we wanted to project an input face onto the range of a neural network representing celebrity faces.
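The channel-stacking trick can be sketched as follows. This is a minimal NumPy illustration, assuming 28×28 grayscale digits (the standard MNIST size); in the actual pipeline the three images come from the generator rather than random placeholders:

```python
import numpy as np

# Three single-channel 28x28 "digit" images (random placeholders here;
# in the real pipeline these are generated handwritten digits).
digits = [np.random.rand(28, 28) for _ in range(3)]

# Stack the size-3 batch across the color channel, giving a single
# (3, 28, 28) tensor that can be treated like one RGB image.
packed = np.stack(digits, axis=0)
assert packed.shape == (3, 28, 28)
```

Viewed this way, each "color channel" of the output is one digit from the batch.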

How we built it

Using the common terminology of the GAN literature, we implemented a Deep Convolutional Generative Adversarial Network (DCGAN) architecture for our generator and used a slightly modified discriminator. Our discriminator follows the modifications proposed in https://arxiv.org/abs/1712.04086. We used PyTorch 0.4.1 and Python 3 to implement our neural networks.
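The core discriminator modification from the linked paper is "packing": instead of judging samples one at a time, the discriminator sees several samples concatenated along the channel axis, which makes mode collapse easier to detect. A hypothetical sketch of that reshaping step (NumPy stand-in for the PyTorch tensors; the shapes and pack size of 3 are our assumptions):

```python
import numpy as np

def pack(batch, m):
    """Pack groups of m samples along the channel axis:
    (N, C, H, W) -> (N // m, m * C, H, W)."""
    n, c, h, w = batch.shape
    assert n % m == 0, "batch size must be divisible by the pack size"
    return batch.reshape(n // m, m * c, h, w)

# Six grayscale 28x28 samples packed in groups of three, so the
# discriminator's first conv layer takes 3 input channels instead of 1.
samples = np.random.rand(6, 1, 28, 28)
packed = pack(samples, m=3)
assert packed.shape == (2, 3, 28, 28)
```

Beyond this reshape, the discriminator itself only needs its input-channel count multiplied by the pack size; the rest of the DCGAN architecture is unchanged.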

Challenges we ran into

Debugging neural networks is challenging, as one small architectural error can produce unexpected results throughout training and evaluation. Hyperparameter search is also tricky, since small changes can cascade into large differences in the final results.

Training the networks is also very time-consuming, even with access to a GPU, so it is hard to iterate on and improve past results. Fortunately, we lucked out and obtained decent samples after relatively little training; this process could conceivably have taken far longer.

Accomplishments that we're proud of

We got the modification to work on a toy dataset! This will be a useful starting point for us to extend our model to represent and generate more interesting images.

What we learned

Besides the conclusions we drew from the challenges mentioned above, we learned how to take a proposal from a scientific research paper and implement it in practice. This is a useful skill, as research papers can inspire many interesting and viable hacks.

Our initial goal also turned out to be far too ambitious. We thus learned that it is generally better to set reasonable targets and stack goals as needed instead of setting one large (and likely unattainable) goal.

What's next for Stacked Handwritten Digit Generation

We will extend our model to more interesting datasets and try to model richer distributions. That way, we can hopefully generate appealing samples of more diverse subject matter, bringing us closer to our initial goal (mentioned in the "What it does" section).

Built With

python, pytorch
