Inspiration

We wanted to create a tool that makes spaces more accessible and interactive. Combining real-time object detection with fun voice feedback made the experience engaging while addressing practical needs, especially for accessibility.

What it does

Eye Eye, Captain! detects objects in real time and announces them along with their distance from the user. The app incorporates a pirate theme, keeping navigation fun and clear.

How we built it

We developed the front end with React Native and the backend with Python. We used the Gemini API for object detection and ElevenLabs for pirate-themed text-to-speech. We also estimated the approximate distance to each detected object and set up a server with sockets to connect the front end to the backend.
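The write-up doesn't say how the approximate distance was computed; one common approach for a single camera is the pinhole model, which estimates distance from a detection's bounding-box height and an assumed real-world height for that object class. The focal length and the per-label heights below are illustrative assumptions, not values from the project:

```python
# Assumed real-world heights (meters) for a few labels — illustrative values only.
KNOWN_HEIGHTS_M = {"person": 1.7, "chair": 0.9, "door": 2.0}

def estimate_distance_m(label, bbox_height_px, focal_length_px=1000.0):
    """Pinhole-camera estimate: distance = focal_length * real_height / pixel_height.

    Returns None when the label's real-world height is unknown or the
    bounding box is degenerate.
    """
    real_height_m = KNOWN_HEIGHTS_M.get(label)
    if real_height_m is None or bbox_height_px <= 0:
        return None
    return focal_length_px * real_height_m / bbox_height_px
```

With these assumed constants, a door filling 500 pixels of the frame would be estimated at 4 meters away; the accuracy of any such estimate depends on calibrating the focal length for the actual phone camera.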

Challenges we ran into

We faced several challenges, including Gemini API key quota limits, connection issues with our React Native (Expo) app, and Git merge conflicts. After multiple rounds of trial and adjustment, we resolved these issues to the best of our abilities!

Accomplishments that we're proud of

We’re proud to have connected the server and app via a socket and integrated computer vision into our project!
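A minimal sketch of the app-to-backend socket link described above, using Python's standard `socket` module. The host, port, message format, and pirate reply are all hypothetical stand-ins, since the source doesn't specify the actual protocol:

```python
import json
import socket
import threading

HOST, PORT = "127.0.0.1", 5050  # hypothetical local address for the backend

def serve_once(server_sock):
    """Accept one connection, read a JSON detection, reply with an announcement."""
    conn, _ = server_sock.accept()
    with conn:
        detection = json.loads(conn.recv(4096).decode())
        reply = (f"Arr! {detection['label']} ahead, "
                 f"{detection['distance_m']} meters off the bow!")
        conn.sendall(reply.encode())

def announce(detection):
    """Round-trip one detection through the socket server and return the reply."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((HOST, PORT))
    server.listen(1)
    worker = threading.Thread(target=serve_once, args=(server,))
    worker.start()
    with socket.create_connection((HOST, PORT)) as client:
        client.sendall(json.dumps(detection).encode())
        reply = client.recv(4096).decode()
    worker.join()
    server.close()
    return reply
```

In the real app the reply string would be fed to the text-to-speech step rather than returned directly, and the connection would stay open across frames instead of being re-established per detection.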

What we learned

We learned how to use the Gemini API for computer vision, specifically object detection. We also learned how to add distance feedback and text-to-speech using ElevenLabs.

What's next for Eye Eye, Captain!

We aim to make the app more accessible by adding support for multiple languages to better serve marginalized communities, following WCAG accessibility guidelines, and applying ability-based design to personalize the experience for every user.
