Inspiration
NLP! GAN! ALL SORTS OF BUZZWORDS! [plus we are nerds from Swarthmore and think the intersection between CS and art is intriguing]
What it does
Robot-handwritten haikus. We trained a GAN to generate realistic human-looking letters from the EMNIST dataset, used a context-free grammar to approximate sentence structure, and used the NLTK Python library to randomly choose words with the correct syllable counts.
How we built it
The characters used for this project are created by a Generative Adversarial Network (GAN). GANs are a state-of-the-art model for generating convincing images from a dataset. The specific dataset we used here was the Extended MNIST dataset (EMNIST), a collection of handwritten digits and characters compiled for research purposes. GANs work by training two neural networks in tandem on the dataset: a Generator network and a Discriminator network. A common metaphor for the behavior of these networks comes from the world of art forgery. The Generator network is the forger, constantly trying to improve its ability to mimic the real images. At the same time, the Discriminator network is the investigator, training to distinguish real images from forgeries. Together both networks improve, playing a zero-sum game against one another. When this model was first introduced in 2014 it made a massive impact on the world of generative models. That said, it is still a very active and poorly understood research area, so perfect results are far from expected.

The haiku generator is based on a simplified context-free grammar for English. It randomly generates a sentence template consisting of part-of-speech (POS) markers. This template is then populated with words matching each part of speech, bounded by the number of syllables per line. The specific words are randomly chosen from POS-specific word banks, populated by cross-referencing the Princeton WordNet and CMU Pronouncing Dictionary corpora, which mark POS and stress, respectively. Stress markers are then used to calculate each word's syllable count, ensuring the 5-7-5 structure of traditional haiku.
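The zero-sum game between the two networks can be sketched in terms of their losses. This is a minimal NumPy illustration (not our actual TensorFlow training code, and `gan_losses` is a name invented for this example): the discriminator wants to score real images high and fakes low, while the generator wants its fakes scored high.

```python
import numpy as np

def bce_from_logits(logits, labels):
    """Numerically stable binary cross-entropy computed on raw logits."""
    return np.mean(np.maximum(logits, 0) - logits * labels
                   + np.log1p(np.exp(-np.abs(logits))))

def gan_losses(real_logits, fake_logits):
    # Discriminator: push real logits toward 1 ("real"), fake logits toward 0.
    d_loss = (bce_from_logits(real_logits, np.ones_like(real_logits))
              + bce_from_logits(fake_logits, np.zeros_like(fake_logits)))
    # Generator (non-saturating form): push fake logits toward 1,
    # i.e. try to fool the discriminator.
    g_loss = bce_from_logits(fake_logits, np.ones_like(fake_logits))
    return d_loss, g_loss
```

As the generator improves, the discriminator's loss rises and the generator's falls; training alternates updates to each network so both keep improving.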
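The template step works by randomly expanding grammar rules until only POS markers remain. The sketch below uses a toy grammar (`GRAMMAR` and its rules are illustrative stand-ins, much smaller than the grammar we actually used):

```python
import random

# Toy context-free grammar over POS tags; anything without a rule is a terminal.
GRAMMAR = {
    "LINE": [["NP", "VP"], ["NP"]],
    "NP": [["DET", "NOUN"], ["ADJ", "NOUN"]],
    "VP": [["VERB"], ["VERB", "ADV"]],
}

def expand(symbol, rng=random):
    """Recursively expand a grammar symbol into a flat list of POS markers."""
    if symbol not in GRAMMAR:  # terminal: a POS marker like "NOUN"
        return [symbol]
    production = rng.choice(GRAMMAR[symbol])  # pick one production at random
    return [tag for part in production for tag in expand(part, rng)]
```

A call like `expand("LINE")` yields a template such as `['DET', 'NOUN', 'VERB']`, which we then fill with words of the matching part of speech.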
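The syllable counting relies on a CMUdict convention: every vowel phoneme carries a stress digit (0, 1, or 2), so counting digit-suffixed phonemes counts syllables. A small sketch (the few dictionary entries are hardcoded here so the example runs without downloading NLTK's cmudict corpus, which supplies them in the real pipeline):

```python
# A handful of real CMUdict-style entries, hardcoded for illustration.
SAMPLE_CMUDICT = {
    "autumn": ["AO1", "T", "AH0", "M"],
    "moonlight": ["M", "UW1", "N", "L", "AY2", "T"],
    "whispers": ["W", "IH1", "S", "P", "ER0", "Z"],
}

def syllable_count(word):
    """Count syllables: one per stress-marked (vowel) phoneme."""
    phonemes = SAMPLE_CMUDICT[word.lower()]
    return sum(1 for p in phonemes if p[-1].isdigit())

def line_syllables(words):
    """Total syllables in a candidate haiku line, checked against 5 or 7."""
    return sum(syllable_count(w) for w in words)
```

Words are drawn until a line's total hits exactly 5 (or 7) syllables, enforcing the haiku structure.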
Challenges we ran into
TensorFlow can be, ahem, difficult to update! We built our program on Swarthmore's private GitHub and had much difficulty sharing it! We are actively trying our best to get you an up-to-date copy on a public page.
Accomplishments that we're proud of
We got each individual part working and then got them working together! What more could we ask for!
What we learned
GANs still aren't perfect, but they are pretty darn cool. Also, people will believe that almost any series of grammatically plausible words has meaning.
What's next for Written by GAN
a) Make a robotic arm draw the GAN-generated letters to make truly 'robot-handwritten' haikus
b) Look into GANs on art (who doesn't like some illustration with their poetry)
c) Enhance the semantic meaning of the haikus using more advanced algorithmic techniques
d) Attempt to generate more complicated types of poetry (like sonnets or free verse)
URL for demo site, app store listing, GitHub repo, etc.
(We are going to host locally and tunnel through ngrok, so we will give you a real URL when you get here!)
Built With
- cmu-pronouncing-dictionary
- css
- emnist
- flask
- gan
- html
- javascript
- machine-learning
- natural-language-processing
- neural-networks
- nltk
- pillow
- princeton-wordnet
- python
- tensorflow