Our accuracy rate is actually 89.5%, not 84.21% :D
Inspiration
We were inspired to do this project after seeing the cool computer vision opportunities it presented. In addition, we thought that matching receipt transactions was a novel data science project with real-world applications.
What it does
Our project takes in receipt images and OCR data and accurately matches them with user-provided data about those receipts. It accomplishes this by first thoroughly cleaning the user data. Unfortunately, the provided data has many imperfections, but by writing a clean_data script, we managed to solve this issue. Next, to match all our receipts to our user data, we used computer vision to isolate the text within our receipt images and transform it into usable OCR output alongside the provided OCR data. Using our cleaned data and complete OCR data, we then worked on transforming it into a machine-usable form. To do this, we used Levenshtein distances to measure the difference in text between our OCR data and our user data. By minimizing the distance for address and vendor name, we managed to find a batch of potential user IDs that could match each receipt. We then further narrowed the potential user IDs by filtering on the date and total price of the receipts. Finally, we used machine learning to choose which of the remaining user IDs to match with each receipt. Using gradient descent, we found the optimal weights for the Levenshtein distances of price, date, and address, helping us reach an accuracy rate of 84.21% on our test case.
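The matching step above can be sketched roughly as follows. This is an illustrative reconstruction, not our actual code: the field names, candidate format, and uniform weights are assumptions, and in the real project the weights came from gradient descent rather than being hand-set.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def match_score(ocr: dict, user: dict, weights: dict) -> float:
    """Weighted sum of per-field edit distances; lower means a better match."""
    return sum(weights[f] * levenshtein(ocr[f], user[f])
               for f in ("vendor", "address", "date", "total"))

def best_match(receipt: dict, candidates: list, weights: dict) -> dict:
    """Pick the candidate user record with the lowest weighted distance."""
    return min(candidates, key=lambda u: match_score(receipt, u, weights))
```

In the full pipeline, `best_match` would run only over the candidate batch that survived the address/vendor and date/price filters, with the learned weights plugged into `weights`.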
How we built it
We built the project mostly by splitting it into three separate subproblems. Ian was in charge of computer vision and tuning our hyperparameters, Ben was in charge of cleaning the data and creating our final project video, and I was in charge of data parsing and building the gradient descent model. Through this division of labor, we were able to finish this massive project within two days.
Challenges we ran into
The primary challenge we ran into was that the input data and OCR data were messy. This led to multiple issues; however, we were able to solve them through a variety of techniques such as data cleaning and Levenshtein distances.
Accomplishments that we're proud of
We're very proud of our computer vision results. Using OpenCV, we were able to read the text of a receipt with quality similar to that of the provided OCRs. We were also proud of our efficiency: our initial algorithm to parse through the receipts would have taken an estimated 48 hours, but through multiprocessing and by improving our algorithm, we managed to shorten it to just 1.5 minutes. However, we're most proud of our video. Using Python, we were able to create an animated, entertaining, yet informative video in the style of 3Blue1Brown.
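The multiprocessing speedup mentioned above follows a standard pattern: fan the per-receipt work out across a pool of worker processes. This is a minimal sketch with a stand-in function in place of the real matching work, so the names and the dummy computation are ours, not the project's.

```python
from multiprocessing import Pool

def match_receipt(receipt_id: int) -> tuple:
    # Stand-in for the expensive per-receipt OCR-vs-user-data comparison;
    # the real version would run the Levenshtein matching for one receipt.
    return receipt_id, receipt_id % 7

def match_all(receipt_ids, workers: int = 4) -> dict:
    # Distribute receipts across worker processes instead of looping serially.
    with Pool(processes=workers) as pool:
        return dict(pool.map(match_receipt, receipt_ids))
```

Because each receipt is matched independently, the problem is embarrassingly parallel, which is why the wall-clock time drops roughly with the number of workers (on top of the algorithmic improvements).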
What we learned
We learned a lot in this datathon, such as data processing techniques, computer vision, and animation. However, the most valuable thing we learned this weekend was teamwork. None of us alone could have completed a fraction of the work this project needed, but by working together we were able to create a project that we are very, very proud of.
What's next for Verifying Receipt Transactions
Hopefully we will be able to win an iPod (or iPad). Regardless, we ideally want to push the accuracy higher. We would like to use gradient descent to tune our Levenshtein thresholds and improve our accuracy. In addition, we would like to use more computer vision in our project; for example, we wanted to find where on a receipt the total most commonly appears. And finally, we will use the things we learned in this datathon in our future data science endeavors.