Inspiration
We wanted to challenge ourselves by building with computer vision and various AI tools.
What it does
SenseWAI aims to warn blind pedestrians of oncoming objects using the AI-powered computer vision capabilities of their phone cameras. The app gives spoken feedback about the user's environment, allowing the user to be more cognizant of potential hazards outside.
How we built it
We use a custom-trained YOLOv11 model, a powerful neural network implemented by Ultralytics, so detecting crosswalks and relaying that information to the blind user happens within moments. Additionally, Google Gemini gives the user an overview of their visible surroundings, including warnings about up-close obstacles and changes in floor elevation.
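To give a sense of the Gemini side, here is a minimal sketch of how a browser client might package a camera frame for the model. The request shape follows Google's `generateContent` REST format, but the prompt text and the helper's name are illustrative assumptions, not our exact code:

```javascript
// Build a generateContent request body for the Gemini REST API.
// `base64Jpeg` is a raw base64 string (no data-URL prefix).
// The prompt text here is illustrative, not the app's exact prompt.
function buildGeminiRequest(base64Jpeg) {
  return {
    contents: [
      {
        parts: [
          {
            text:
              "Describe obstacles, elevation changes, and hazards " +
              "for a blind pedestrian in one short sentence.",
          },
          { inline_data: { mime_type: "image/jpeg", data: base64Jpeg } },
        ],
      },
    ],
  };
}
```

The body can then be POSTed to the `generateContent` endpoint with `fetch`, and the returned text handed to the speech layer.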
Challenges we ran into
Since SenseWAI is a web app, we decided to implement everything in JavaScript to avoid server complications. However, many of the APIs we rely on are documented primarily for server-side use. Additionally, the version of YOLO that we used is trained and primarily run in Python.
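Bridging a Python-trained YOLO model into the browser typically means exporting it (e.g. to ONNX) and post-processing the raw detections in JavaScript. The detection object shape below is an assumption for illustration; the sketch just filters by confidence and scales normalized coordinates to pixels:

```javascript
// Post-process hypothetical YOLO detections in the browser: keep boxes
// above a confidence threshold and scale normalized [0..1] coordinates
// to pixel coordinates for the current video frame.
// Detection shape assumed here: { box: [x, y, w, h], score, label }.
function filterAndScale(detections, width, height, threshold = 0.5) {
  return detections
    .filter((d) => d.score >= threshold)
    .map((d) => ({
      label: d.label,
      score: d.score,
      box: [
        d.box[0] * width,
        d.box[1] * height,
        d.box[2] * width,
        d.box[3] * height,
      ],
    }));
}
```

Keeping this step in plain JavaScript means the same post-processing runs unchanged on mobile and desktop browsers.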
Accomplishments that we're proud of
We are proud of training the YOLO model, achieving functionality on both mobile and desktop browsers, and surviving the last 24 hours.
What we learned
We learned how to train models, strategically engineer prompts, navigate varying levels of text-to-speech browser compatibility, encode and decode base64 images, and stay awake for a really, really long time.
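The base64 wrangling mostly comes down to converting between the data URLs that a `<canvas>` produces and the raw base64 payloads that vision APIs expect. A small sketch of that conversion (pure string manipulation, so it behaves the same in the browser and in Node):

```javascript
// Strip the "data:image/jpeg;base64," prefix from a canvas data URL,
// leaving only the raw base64 payload an API expects.
function dataUrlToBase64(dataUrl) {
  const comma = dataUrl.indexOf(",");
  if (comma === -1) throw new Error("not a data URL");
  return dataUrl.slice(comma + 1);
}

// Wrap a raw base64 payload back into a data URL, e.g. to preview
// the exact frame that was sent to the model.
function base64ToDataUrl(base64, mimeType = "image/jpeg") {
  return `data:${mimeType};base64,${base64}`;
}
```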
What's next for SenseWAI
We could extend the app for potential users who are also hearing impaired, using vibration patterns to warn of incoming danger immediately.
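In the browser, that extension would likely use the Vibration API, where `navigator.vibrate` accepts an array of alternating vibrate/pause durations in milliseconds. A sketch of mapping a hypothetical 1–3 hazard-severity scale to a pattern:

```javascript
// Map a hazard severity (hypothetical 1..3 scale) to a Vibration API
// pattern: alternating vibrate/pause durations in milliseconds.
// More urgent hazards get more pulses.
function vibrationPattern(severity) {
  const pulses = Math.max(1, Math.min(severity, 3));
  // Build [vibrate, pause, vibrate, ...] ending on a vibrate.
  return Array.from({ length: pulses * 2 - 1 }, (_, i) =>
    i % 2 === 0 ? 200 : 100
  );
}

// In the browser, guard on support before vibrating:
// if (navigator.vibrate) navigator.vibrate(vibrationPattern(3));
```

Support varies across mobile browsers, so the feature check above matters as much as the pattern itself.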
Built With
- gemini
- google-cloud
- javascript
- tts
- vite
- yolo