Inspiration
We have all participated in some form of teaching, whether it be tutoring, volunteering, helping out during office hours and PSOs as a TA, or even simply explaining a problem to our friends. In our various experiences relating concepts in simpler terms, we've noticed that explaining and learning are often easier when done visually. However, turning a math problem into an image can be difficult at first, particularly for younger students who are just beginning to learn the entirely new language that is mathematics. We wanted to create a platform to help stimulate visual imagination (or, iMaiTHination ;) ) for math problems, not only to make the math at hand more entertaining, but also to cultivate the connection between symbols on paper and what we see through our own eyes. Additionally, our web application would be helpful for children who speak English as a second language and struggle to get past the language barrier when learning math.
What it does
Our application can be split into three parts: reading in data, processing, and generating the image. The user can either type in a word problem or upload a file containing it. Once the problem is submitted, our application parses it and then displays an image corresponding to the math word problem.
How we built it
We developed the web application with a Flask backend and HTML, CSS, and JavaScript for the frontend. We used spaCy to perform NLP on each word problem, identifying the numerical quantities, the subjects those quantities refer to, the operation involved in the problem, and the subtype of operation (for example, some addition problems ask for the "total" of two values, while others ask for a value that is "more than" another). Finally, we used PIL and the OpenAI API to create the final image that we display back to the user.
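To make the extracted fields concrete, here is a minimal sketch of the kind of record the parsing step produces: quantities paired with their subjects, plus an operation and subtype. It is not our spaCy pipeline; it swaps in plain regular expressions and keyword matching so that it runs with the standard library alone. The names `SUBTYPE_CUES`, `ParsedProblem`, and `parse_problem`, as well as the cue phrases, are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical cue phrases for illustration only; checked in this order.
SUBTYPE_CUES = {
    ("addition", "more than"): ["more than"],
    ("addition", "total"): ["in total", "altogether", "in all"],
    ("subtraction", "left"): ["left", "remain"],
}


@dataclass
class ParsedProblem:
    quantities: list  # (number, subject) pairs, e.g. (5, "apples")
    operation: str
    subtype: str


def parse_problem(text):
    """Pair each number with the word that follows it and guess the operation."""
    lowered = text.lower()
    # A number followed by the next word, e.g. "5 apples" -> (5, "apples").
    pairs = [(int(n), noun) for n, noun in re.findall(r"(\d+)\s+([a-z]+)", lowered)]
    for (operation, subtype), cues in SUBTYPE_CUES.items():
        if any(cue in lowered for cue in cues):
            return ParsedProblem(pairs, operation, subtype)
    return ParsedProblem(pairs, "unknown", "unknown")


problem = parse_problem(
    "Sam has 5 apples and buys 3 apples. How many apples does he have in total?"
)
print(problem)
```

A phrasing such as "2 more than Mia" would make this sketch pair 2 with the word "more", which shows why simple pattern matching is not enough and a real parser is needed to find the subject of each quantity.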
Challenges we ran into
We were originally going to use a Stable Diffusion model, but we ran into various issues installing it and integrating it into our code, which is why we went with the OpenAI API for image generation instead. Additionally, parsing the data was difficult because spaCy does not directly identify the subjects and objects of a word problem, so we created our own algorithm to identify the subject associated with each numerical quantity. spaCy was also often inaccurate when identifying certain chunks of words, which made generalizing our algorithm to work on all operational math word problems a lot more complex. When using the OpenAI API, we ran into issues with the image generation, as it was often inaccurate and would produce an image that was far from the prompt we fed it (for example, outputting 8 apples when we told it to display 5). To correct for this, we used the OpenAI image generator to generate only a single instance of each subject and then pasted that image as many times as the quantity requires.
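The "generate once, paste N times" workaround comes down to computing where each copy goes on a canvas. The sketch below shows one way to lay the copies out in a grid. The function name `grid_positions`, the default of 5 columns, and the 10-pixel padding are assumptions made for illustration, not the values used in our project.

```python
def grid_positions(count, tile_size, columns=5, padding=10):
    """Return (canvas_size, positions) for `count` copies of one tile.

    Each position is the top-left (x, y) corner of a copy on the canvas.
    """
    if count <= 0:
        return (0, 0), []
    tile_w, tile_h = tile_size
    columns = min(columns, count)  # do not leave empty columns
    rows = -(-count // columns)    # ceiling division
    positions = [
        (
            padding + (i % columns) * (tile_w + padding),
            padding + (i // columns) * (tile_h + padding),
        )
        for i in range(count)
    ]
    canvas_size = (
        padding + columns * (tile_w + padding),
        padding + rows * (tile_h + padding),
    )
    return canvas_size, positions


# Five 100x100 tiles, three per row.
canvas_size, positions = grid_positions(5, (100, 100), columns=3)
print(canvas_size, positions)
```

With PIL, the canvas could then be created with `Image.new("RGBA", canvas_size)` and each copy placed with `canvas.paste(tile, (x, y))`, so the number of objects in the final image always matches the quantity in the problem.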
Accomplishments that we're proud of
We are proud of how generalized our data processing is, as it can identify the variables necessary to create our image for any operational math word problem entered. We are also proud of our image generation and of a feature we included that makes the images slightly varied even if the problem involves adding the same subject (e.g. adding apples to more apples). Additionally, we are proud of our UI design and the animated screen.
What we learned
We learned a lot about natural language processing and generative AI, particularly for generating images. We also learned about full-stack development.
What's next for iMAiTHination
Currently, our application is designed to generate images for operational math word problems. We hope to expand it to generate images for more complex math word problems, such as algebraic or geometric problems.