How we built it: We built ReadToMe with the Google Cloud Vision API, which we accessed through the Google Cloud Platform. We used the Python programming language to interface with the API and PyTesseract for optical character recognition (OCR), on top of which we built the text-to-speech functionality.
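To give a flavor of the OCR step, here is a minimal sketch of pulling the recognized text out of a Google Cloud Vision TEXT_DETECTION response. The `textAnnotations` / `description` field names come from the public REST API, but the sample response and the `extract_full_text` helper are illustrative, not our exact code.

```python
# Hypothetical example of a Vision API TEXT_DETECTION response:
# the first annotation carries the full detected text, followed by
# one entry per detected word.
sample_response = {
    "textAnnotations": [
        {"description": "Once upon a time\nthere was a book."},
        {"description": "Once"},
        {"description": "upon"},
    ]
}

def extract_full_text(response):
    """Return the full OCR text, or '' if nothing was detected."""
    annotations = response.get("textAnnotations", [])
    return annotations[0]["description"] if annotations else ""

print(extract_full_text(sample_response))
```

The extracted string is then handed to the text-to-speech stage.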

Challenges we ran into: One of the biggest challenges we ran into was distinguishing between scans of a single book page and scans of a two-page spread.
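One way to tell a single page from a two-page spread is to look for an empty vertical "gutter" down the middle of the scan, with text on both sides of it. The sketch below is a hypothetical heuristic built on word bounding boxes (as returned by an OCR pass), not the detection logic we actually shipped.

```python
def looks_like_two_pages(word_boxes, image_width, gutter_frac=0.06):
    """Guess whether a scan shows a two-page spread.

    Heuristic: a narrow vertical band around the horizontal centre
    contains no word boxes, while words appear on both sides of it.
    Each box is (x_min, y_min, x_max, y_max) in pixel coordinates.
    """
    centre = image_width / 2
    half_gutter = image_width * gutter_frac / 2
    left, right = centre - half_gutter, centre + half_gutter

    # A box "crosses" the gutter if it overlaps the central band.
    crosses_gutter = any(x0 < right and x1 > left
                         for x0, _, x1, _ in word_boxes)
    has_left = any(x1 <= left for _, _, x1, _ in word_boxes)
    has_right = any(x0 >= right for x0, _, _, _ in word_boxes)
    return (not crosses_gutter) and has_left and has_right

# Two text columns separated by a clean middle band -> two pages.
print(looks_like_two_pages([(10, 10, 200, 30), (300, 10, 480, 30)], 500))
# One line of text running across the centre -> one page.
print(looks_like_two_pages([(10, 10, 480, 30)], 500))
```

The gutter width is a tunable guess; real scans with tight bindings or skew would need a more robust approach.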

Accomplishments that we're proud of: We are proud of creating a tool that anyone can use, regardless of their level of vision. We are also proud of getting the Google Vision API working reliably.

What we learned: We learned how to use the Google Vision API and how to interface with it from Python. We also learned how to set up the text-to-speech functionality by installing the Windows SAPI5 voices on Linux-based Heroku.

What's next for ReadToMe? In the future, we would like to add more features to ReadToMe, such as the ability to identify colors and faces, and to improve the accuracy of the image-to-text-to-speech pipeline. We also look forward to making ReadToMe faster, so that you could walk into a restaurant wearing earbuds and scan the menu in a second, without anyone needing to know you are visually impaired.
