Inspiration
With this project, we wanted to help all readers by turning text in any language into audio in any language, making reading accessible to people who are visually impaired or have attention disorders.
What it does
The Anigma (anti-enigma) OCR Narrator takes a photo supplied by the user, uses machine learning to identify the text in it, and speaks that text aloud in any language the user specifies.
How we built it
The Anigma OCR Narrator, or AOCRN, is a computer vision application built in Python. It uses Optical Character Recognition to identify the text in an image of a page, then translates that text by sending a request to Google's translation service. Finally, the translated page is narrated with gTTS.
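The pipeline above can be sketched as three pluggable stages. The specific libraries shown in the comments (pytesseract for OCR, googletrans for the translation request, gTTS for narration) are assumptions based on the write-up, not confirmed choices of the team:

```python
def narrate_image(image_path, target_lang, ocr, translate, speak):
    """Run the AOCRN pipeline: OCR the page, translate it, then narrate it.

    Each stage is passed in as a callable so the pipeline itself stays
    library-agnostic.
    """
    text = ocr(image_path)                      # stage 1: Optical Character Recognition
    translated = translate(text, target_lang)   # stage 2: translation request
    return speak(translated, target_lang)       # stage 3: text-to-speech narration

# With real libraries the stages might be wired up like this
# (library choices here are illustrative assumptions):
#
#   import pytesseract
#   from PIL import Image
#   from googletrans import Translator
#   from gtts import gTTS
#
#   narrate_image(
#       "page.jpg", "es",
#       ocr=lambda p: pytesseract.image_to_string(Image.open(p)),
#       translate=lambda t, lang: Translator().translate(t, dest=lang).text,
#       speak=lambda t, lang: gTTS(t, lang=lang).save("page_es.mp3"),
#   )
```

Keeping the stages injectable also makes the pipeline easy to test without a camera, network access, or audio output.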
Challenges we ran into
Some of the challenges we ran into were translating the input text into another language before speaking it, and building an effective GUI for the user to interact with. To solve the translation problem, we sent the recognized text to Google's translation service, which supports English and many other languages, and then used the Python module gTTS to speak the result. To address the GUI issue, we used tkinter to create a simple yet effective interface for the user.
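A minimal tkinter layout in the spirit of the one described might look like the following. The widget arrangement and callback names are illustrative assumptions, not the team's actual GUI code:

```python
import tkinter as tk
from tkinter import filedialog


def build_gui(on_narrate):
    """Build a minimal AOCRN window.

    on_narrate(image_path, lang) is invoked when the user clicks Narrate;
    it would call into the OCR/translate/speak pipeline.
    """
    root = tk.Tk()
    root.title("Anigma OCR Narrator")

    state = {"image_path": None}

    def choose_image():
        # Let the user pick the page photo to narrate.
        state["image_path"] = filedialog.askopenfilename(
            filetypes=[("Images", "*.png *.jpg *.jpeg")])

    lang_var = tk.StringVar(value="en")  # target language code, e.g. "es"

    tk.Button(root, text="Choose image", command=choose_image).pack(fill="x")
    tk.Entry(root, textvariable=lang_var).pack(fill="x")
    tk.Button(
        root, text="Narrate",
        command=lambda: on_narrate(state["image_path"], lang_var.get()),
    ).pack(fill="x")
    return root

# Usage (requires a display):
#   build_gui(lambda path, lang: print(path, lang)).mainloop()
```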
Accomplishments that we're proud of
Accomplishments that we are proud of include creating, from scratch, a program that can interpret a picture and read it back in any language, and building the machine learning component that interprets the written text.
What we learned
During this hackathon, we learned how to design, build and test an app in a limited amount of time. We also learned how to write text recognition software to read the text from a book, as well as to translate and speak the text in other languages.
What's next for Anigma OCR Narrator
The AOCRN can be upgraded to run on any device that has a camera and a speaker.
Built With
- ai
- gtts
- gui
- machine-learning
- pygame
- python
- tkinter
- translation