Inspiration
We wanted to challenge ourselves by building with computer vision and various AI tools.
What it does
SenseWAI aims to warn blind pedestrians of oncoming objects using the AI-powered computer vision capabilities of their phone cameras. The app gives spoken feedback about the user's environment, allowing the user to be more cognizant of potential hazards outside.
How we built it
We use a custom-trained YOLOv11 model, a powerful neural network implemented by Ultralytics, so detecting crosswalks and relaying that information to the blind user happens within moments. Additionally, Google Gemini gives the user an overview of their visible surroundings, including warnings about up-close obstacles and changes in floor elevation.
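To give a sense of the Gemini side, here is a minimal sketch of how a browser client might package a camera frame for the model. The request shape follows Google's `generateContent` REST format, but the prompt text and the helper's name are illustrative assumptions, not our exact code:

```javascript
// Build a generateContent request body for the Gemini REST API.
// `base64Jpeg` is a raw base64 string (no data-URL prefix).
// The prompt text here is illustrative, not the app's exact prompt.
function buildGeminiRequest(base64Jpeg) {
  return {
    contents: [
      {
        parts: [
          {
            text:
              "Describe obstacles, elevation changes, and hazards " +
              "for a blind pedestrian in one short sentence.",
          },
          { inline_data: { mime_type: "image/jpeg", data: base64Jpeg } },
        ],
      },
    ],
  };
}
```

The body can then be POSTed to the `generateContent` endpoint with `fetch`, and the returned text handed to the speech layer.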
Challenges we ran into
Since SenseWAI is a web app, we decided to implement everything in JavaScript to avoid server complications. However, many of the APIs we rely on are documented primarily for server-side use. Additionally, the version of YOLO that we used is trained and primarily run in Python.
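Bridging a Python-trained YOLO model into the browser typically means exporting it (e.g. to ONNX) and post-processing the raw detections in JavaScript. The detection object shape below is an assumption for illustration; the sketch just filters by confidence and scales normalized coordinates to pixels:

```javascript
// Post-process hypothetical YOLO detections in the browser: keep boxes
// above a confidence threshold and scale normalized [0..1] coordinates
// to pixel coordinates for the current video frame.
// Detection shape assumed here: { box: [x, y, w, h], score, label }.
function filterAndScale(detections, width, height, threshold = 0.5) {
  return detections
    .filter((d) => d.score >= threshold)
    .map((d) => ({
      label: d.label,
      score: d.score,
      box: [
        d.box[0] * width,
        d.box[1] * height,
        d.box[2] * width,
        d.box[3] * height,
      ],
    }));
}
```

Keeping this step in plain JavaScript means the same post-processing runs unchanged on mobile and desktop browsers.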
Accomplishments that we're proud of
We are proud of training the YOLO model, achieving functionality on both mobile and desktop browsers, and surviving the last 24 hours.
What we learned
We learned how to train models, strategically engineer prompts, navigate varying levels of text-to-speech browser compatibility, encode and decode base64 images, and stay awake for a really, really long time.
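The base64 wrangling mostly comes down to converting between the data URLs that a `<canvas>` produces and the raw base64 payloads that vision APIs expect. A small sketch of that conversion (pure string manipulation, so it behaves the same in the browser and in Node):

```javascript
// Strip the "data:image/jpeg;base64," prefix from a canvas data URL,
// leaving only the raw base64 payload an API expects.
function dataUrlToBase64(dataUrl) {
  const comma = dataUrl.indexOf(",");
  if (comma === -1) throw new Error("not a data URL");
  return dataUrl.slice(comma + 1);
}

// Wrap a raw base64 payload back into a data URL, e.g. to preview
// the exact frame that was sent to the model.
function base64ToDataUrl(base64, mimeType = "image/jpeg") {
  return `data:${mimeType};base64,${base64}`;
}
```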
What's next for SenseWAI
We could extend the app for potential users who are also hearing impaired, using vibration patterns to warn of incoming danger immediately.
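In the browser, that extension would likely use the Vibration API, where `navigator.vibrate` accepts an array of alternating vibrate/pause durations in milliseconds. A sketch of mapping a hypothetical 1–3 hazard-severity scale to a pattern:

```javascript
// Map a hazard severity (hypothetical 1..3 scale) to a Vibration API
// pattern: alternating vibrate/pause durations in milliseconds.
// More urgent hazards get more pulses.
function vibrationPattern(severity) {
  const pulses = Math.max(1, Math.min(severity, 3));
  // Build [vibrate, pause, vibrate, ...] ending on a vibrate.
  return Array.from({ length: pulses * 2 - 1 }, (_, i) =>
    i % 2 === 0 ? 200 : 100
  );
}

// In the browser, guard on support before vibrating:
// if (navigator.vibrate) navigator.vibrate(vibrationPattern(3));
```

Support varies across mobile browsers, so the feature check above matters as much as the pattern itself.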
Built With
- gemini
- google-cloud
- javascript
- tts
- vite
- yolo