Inspiration

With the rise of remote work, online learning, and social media, access to digital technology grows ever more important. However, many people with accessibility needs, such as those who cannot operate a mouse or type on a keyboard, risk being left behind in this digital age. Inspired by this problem, we set out to create a program that acts as an accessibility tool for these users.

What it does

DYLAN.AI (Dynamic Yielding Language-based Assistant and Navigator - Artificial Intelligence) is an AI-driven tool that performs actions that would otherwise be impossible for people with accessibility needs. It takes voice commands through the microphone and uses AI-powered technology to parse them into machine-understandable commands.

How we built it

DYLAN.AI uses a variety of different APIs and methods and had a lot of moving parts in the building process. To start, we used OpenAI's Whisper API to transcribe user voice commands into text. Next, we used Cohere's classification to classify that text into various types of commands, determine whether the command was valid, and identify its subject. If the command was a search query, we used Cohere's generate feature to produce a search term from the query, used Google Cloud search APIs to search the web for it, and ranked the results with Cohere rerank. If the command was a keyboard or mouse action, we performed it with Python keyboard and mouse macros. Finally, if the command was to open an application, the tool opened that application. We built each piece of functionality separately and combined them all at the end into a full program.
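As a rough illustration of the first step, transcription with the Whisper API can be sketched as follows; the audio file name and the use of the current OpenAI Python SDK are assumptions, not necessarily our exact setup.

```python
# Sketch of the transcription step, assuming the current OpenAI Python SDK
# and an already-recorded voice command saved at "command.wav" (both assumptions).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def transcribe_command(audio_path: str) -> str:
    """Send a recorded voice command to Whisper and return the transcript text."""
    with open(audio_path, "rb") as audio_file:
        transcript = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
        )
    return transcript.text


if __name__ == "__main__":
    print(transcribe_command("command.wav"))
```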
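Classifying the transcript into a command type with Cohere looks roughly like this minimal sketch; the labels and example phrases are hypothetical stand-ins for our actual training examples, and the code assumes a recent version of the cohere Python SDK.

```python
# Minimal sketch of classifying a transcript into a command type with Cohere.
# The labels and example phrases below are illustrative assumptions.
import cohere

co = cohere.Client("YOUR_COHERE_API_KEY")

EXAMPLES = [
    cohere.ClassifyExample(text="search for the weather in Toronto", label="search"),
    cohere.ClassifyExample(text="look up Python tutorials", label="search"),
    cohere.ClassifyExample(text="scroll down the page", label="mouse"),
    cohere.ClassifyExample(text="click the left mouse button", label="mouse"),
    cohere.ClassifyExample(text="type hello world", label="keyboard"),
    cohere.ClassifyExample(text="press enter", label="keyboard"),
    cohere.ClassifyExample(text="open the calculator", label="open_app"),
    cohere.ClassifyExample(text="launch my browser", label="open_app"),
]


def classify_command(text: str) -> str:
    """Return the predicted command type for a transcribed voice command."""
    response = co.classify(inputs=[text], examples=EXAMPLES)
    return response.classifications[0].prediction


if __name__ == "__main__":
    print(classify_command("please search for accessible keyboards"))
```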
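For search commands, the generate-search-rerank flow is roughly the sketch below; the prompt wording, the Google Custom Search engine ID, the rerank model name, and the key handling are all illustrative assumptions rather than our exact configuration.

```python
# Sketch of the search pipeline: generate a search term with Cohere,
# query Google's Custom Search JSON API, then rerank the results with Cohere.
# API keys, the search engine ID (cx), and the prompt wording are assumptions.
import cohere
from googleapiclient.discovery import build

co = cohere.Client("YOUR_COHERE_API_KEY")
GOOGLE_API_KEY = "YOUR_GOOGLE_API_KEY"
SEARCH_ENGINE_ID = "YOUR_CX_ID"


def make_search_term(command: str) -> str:
    """Turn a spoken request into a concise web search term."""
    response = co.generate(
        prompt=f"Rewrite this spoken request as a short web search query: {command}",
        max_tokens=20,
    )
    return response.generations[0].text.strip()


def google_search(term: str, num: int = 10) -> list[dict]:
    """Fetch search results from the Google Custom Search JSON API."""
    service = build("customsearch", "v1", developerKey=GOOGLE_API_KEY)
    result = service.cse().list(q=term, cx=SEARCH_ENGINE_ID, num=num).execute()
    return result.get("items", [])


def rerank_results(query: str, items: list[dict]) -> list[dict]:
    """Order the results by relevance to the query using Cohere rerank."""
    docs = [item.get("snippet", "") for item in items]
    reranked = co.rerank(model="rerank-english-v2.0", query=query, documents=docs)
    return [items[r.index] for r in reranked.results]


if __name__ == "__main__":
    command = "search for keyboard shortcuts in Windows"
    term = make_search_term(command)
    for item in rerank_results(term, google_search(term)):
        print(item["title"], "-", item["link"])
```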
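For keyboard, mouse, and app-launching commands, the action step can be sketched along these lines; here pyautogui and subprocess stand in for whichever macro libraries the final build used, and the application paths are placeholders.

```python
# Sketch of executing keyboard/mouse commands and opening applications.
# pyautogui stands in for whatever macro library was actually used, and the
# entries in APP_COMMANDS are illustrative placeholders.
import subprocess

import pyautogui

APP_COMMANDS = {
    "notepad": ["notepad.exe"],  # placeholder Windows example
    "browser": ["firefox"],      # placeholder cross-platform example
}


def do_keyboard_action(text: str | None = None, key: str | None = None) -> None:
    """Type free text and/or press a single named key."""
    if text:
        pyautogui.write(text, interval=0.02)
    if key:
        pyautogui.press(key)


def do_mouse_action(action: str) -> None:
    """Perform a simple mouse action by name."""
    if action == "click":
        pyautogui.click()
    elif action == "scroll down":
        pyautogui.scroll(-300)
    elif action == "scroll up":
        pyautogui.scroll(300)


def open_application(name: str) -> None:
    """Launch a known application in a new process."""
    command = APP_COMMANDS.get(name.lower())
    if command:
        subprocess.Popen(command)


if __name__ == "__main__":
    do_keyboard_action(text="hello world", key="enter")
    do_mouse_action("scroll down")
    open_application("notepad")
```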

Challenges we ran into

Throughout the development of this tool, we faced a variety of challenges across all aspects of the code. Because we had extremely limited experience with frontend and UI/UX design, we had to learn Python's Tkinter during the hackathon and apply that knowledge immediately. Furthermore, most of us were unfamiliar with AI tools and APIs and weren't sure how to approach the problem. Fortunately, Cohere's APIs provided many of the features we needed and allowed us to build our project.
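As a rough idea of what we had to pick up, a minimal Tkinter front end looks something like the sketch below; the window title, labels, and button callback here are illustrative, not our actual UI.

```python
# Minimal Tkinter sketch of a one-button "listen" window;
# the labels and the callback body are illustrative assumptions.
import tkinter as tk


def on_listen() -> None:
    """Placeholder for the callback that records and handles a voice command."""
    status_label.config(text="Listening...")


root = tk.Tk()
root.title("DYLAN.AI")

status_label = tk.Label(root, text="Press the button and speak a command")
status_label.pack(padx=20, pady=10)

listen_button = tk.Button(root, text="Listen", command=on_listen)
listen_button.pack(padx=20, pady=(0, 20))

root.mainloop()
```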

Accomplishments that we're proud of

A major accomplishment was successfully integrating and using the AI tools and APIs. Because these tools are so recent, there was limited documentation, especially for new features like Cohere chat. Figuring out how to use these AI tools properly gave us the foundation to build the rest of the project.

What we learned

We learned a variety of skills, but the most important, in our opinion, was the ability to join multiple completed parts of a project together, even when they were built independently. Almost every part of the project was created separately, from the voice transcription to the front-end UI. In the end, we learned how to join all these parts into a cohesive, complete project.

What's next for DYLAN.AI

In the future, we want DYLAN.AI to be a tool that people with accessibility needs can use conveniently and effectively. There are many features we would have liked to implement but could not within the allotted time of the hackathon. We would like to add more user customizability, allowing for custom user commands and a more connected system of user interaction. We also plan a more extensive search command that can search within web pages, rather than just across the web. Overall, the tool already includes a great amount of functionality, and we hope to expand its use and features in the very near future.

Built With

  • ai
  • cohere
  • openai
  • python
  • speech
  • tkinter
  • whisper