Inspiration
Ollie was inspired by popular translation apps such as Google Translate and Apple Translate. Our idea expands on the basic premise of translating input text by adding an image scanner and a unique summary feature powered by Artificial Intelligence (AI).
What it does
This application takes large amounts of text in any language and produces a summary in the chosen language. Using Cohere as its Application Programming Interface (API), it can detect, translate, and summarize a significant quantity of text from an image. Users can also type a selection of text and summarize it in any of the supported languages. Summarization helps users identify and understand the main points without having to read through large amounts of text.
How we built it
Ollie was designed and conceptualized in Figma, then built with HTML/CSS, JavaScript, and the Cohere API. Our tech stack is React on the front end and Node.js on the back end. To achieve translation and summarization, we chain a pipeline of API requests from different providers into a seamless, unique user experience. First, to get the text, we use either Google Cloud's Vision API for image-to-text detection or a text field for direct text input. Next, we translate that text into the desired language using Google Cloud's Translation API. Finally, we use the Cohere JavaScript SDK to call their Generate API for text summarization and present the result to the user at various levels of verbosity.
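The three-stage pipeline above can be sketched as a chain of async steps. This is an illustrative sketch, not our actual code: the stage implementations (`ocr`, `translate`, `summarize`) are injected so that the real providers (Google Cloud Vision, Google Cloud Translation, Cohere) can be plugged in behind them.

```javascript
// Chain the stages: image (or raw text) -> detected text -> translation -> summary.
// Each stage is an injected async function, so provider SDK calls stay swappable.
async function summarizeImage(imageBase64, targetLang, { ocr, translate, summarize }) {
  const rawText = await ocr(imageBase64);                   // e.g. Vision API text detection
  const translated = await translate(rawText, targetLang);  // e.g. Translation API
  return summarize(translated);                             // e.g. Cohere summarization
}

// Direct text input simply skips the OCR stage.
async function summarizeText(text, targetLang, { translate, summarize }) {
  const translated = await translate(text, targetLang);
  return summarize(translated);
}
```

Keeping the stages as plain functions also made each provider easy to test in isolation before wiring the whole chain together.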
Challenges we ran into
Initially, conceptualization was a challenge that we ran into and overcame after lengthy discussions about possible project ideas. Once we settled on creating a translator application, the process became smoother in terms of design and ideation. Moreover, figuring out the user journey within the app was difficult because of the many features and potential routes that could be implemented.
Coding also proved challenging, specifically around image encoding for the image-to-text conversion. The images were too large and too high-resolution, so the encoded image URL contained too many characters to pass through a web request. We solved this by sending the image as binary using a multipart/form-data content type, reducing the image resolution, and converting the image to JPEG.
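The multipart fix boils down to sending raw bytes instead of an oversized encoded string. Below is a minimal sketch of how a multipart/form-data body is assembled around the image bytes; the boundary, field name, and filename are illustrative, not taken from our code.

```javascript
// Build a multipart/form-data body: a header part describing the file,
// the raw JPEG bytes, and a closing boundary. Sending bytes this way
// avoids inlining a huge base64 string into the request.
function buildMultipartBody(imageBytes, boundary, fieldName = "image") {
  const head =
    `--${boundary}\r\n` +
    `Content-Disposition: form-data; name="${fieldName}"; filename="scan.jpeg"\r\n` +
    `Content-Type: image/jpeg\r\n\r\n`;
  const tail = `\r\n--${boundary}--\r\n`;
  return Buffer.concat([Buffer.from(head), imageBytes, Buffer.from(tail)]);
}
```

The request itself then carries `Content-Type: multipart/form-data; boundary=...` so the server can split the parts back out.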
Accomplishments that we're proud of
We are proud of conceptualizing, designing, and building our application within the 36-hour time constraint. We are also proud of our collaboration skills and of drawing on our individual expertise to distribute the workload. Integrating multiple APIs we had never used before into one seamless product was an enormous technical challenge, so we are proud of what we were able to accomplish technically as well.
What we learned
We learned how to convert text-based images into text using Google Cloud's Vision API, translate that text to and from desired languages, and finally summarize it using Cohere's Summarize API. We had to read and understand the documentation for each of these APIs, as we had no prior experience with them, and then integrate them together seamlessly. Our team also learned a lot about collaborating under a tight deadline, which forced us to prioritize the Minimum Viable Product (MVP) and not waste time on unimportant tasks. We also quickly identified everyone's unique strengths and delegated tasks to those who would be most efficient in completing them.
What's next for Ollie
In the future, we would like to develop this idea into a mobile app and add more advanced features such as speech-to-text translation.
Built With
- cohere
- figma
- google-cloud
- node.js
- react
- translationapi
- visionapi

