Inspiration:
A conversation about Tesla's parking sensors eventually led to debates on ultrasonic waves and how these sensors can be used to detect objects in the way of vehicles. The idea expanded into the potential for a device to help visually impaired individuals detect obstructions in a similar fashion to parking sensors on vehicles.
What it does:
Built from roughly $30 worth of materials, the Echolocation Vest is a wearable assistive device designed specifically for visually impaired users. A chest-mounted camera streams live video to a YOLO AI model, which detects and identifies objects in real time, determines each obstruction's position relative to the user, and vibrates in the corresponding direction. The user can also issue voice commands to ask questions about their environment or any detected obstruction.
How we built it:
The Echolocation Vest is built around two ESP32 microcontrollers and a Python laptop backend that coordinates everything. The ESP32-CAM, mounted near the middle of the chest, streams live video over WiFi to a Python script running YOLOv8 nano, which detects and identifies objects and determines their direction relative to the user by dividing the frame into left, center, and right thirds. This direction data is sent over a persistent WebSocket connection to a second ESP32 that drives three vibration motors (one over each pec and one at the sternum), wired through three 2N3904 transistors and three flyback diodes for safe motor switching. Finally, a separate background thread handles voice input: the user says the keyphrase "Hey Echo...", then asks a question, which is answered by a GPT-4o-generated response.
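The frame-thirds direction step can be sketched in a few lines. This is a minimal illustration, not our exact backend code: the function name and the commented YOLO loop around it are our own placeholders, assuming the standard ultralytics YOLOv8 Python API.

```python
def direction_from_box(x1: float, x2: float, frame_width: int) -> str:
    """Map a detection's bounding box to 'left', 'center', or 'right'
    by locating its horizontal center within thirds of the frame."""
    cx = (x1 + x2) / 2
    if cx < frame_width / 3:
        return "left"
    if cx < 2 * frame_width / 3:
        return "center"
    return "right"

# In the detection loop this would be fed from YOLOv8 results,
# roughly (requires the `ultralytics` package; not run here):
#   model = YOLO("yolov8n.pt")
#   for x1, y1, x2, y2 in model(frame)[0].boxes.xyxy.tolist():
#       send_over_websocket(direction_from_box(x1, x2, frame.shape[1]))
```

The resulting string ("left", "center", or "right") is what gets pushed over the WebSocket to select which vibration motor to pulse.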
What is Echo?:
Echo is the intelligent core of the Echolocation Vest, our real-time AI awareness engine that transforms any raw visual data (from the ESP32 camera) into natural and helpful conversation. Powered by GPT-4o and supplied with a live feed of the mounted ESP camera, Echo works hard to understand the user's surroundings and the relative position of any obstructions or potentially dangerous items.
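Supplying GPT-4o with a live camera frame amounts to packaging the latest JPEG alongside the user's question as a vision-style chat message. The sketch below is an assumption-laden illustration (the helper name and system prompt are ours), following the OpenAI chat-completions image-input format:

```python
import base64

def build_echo_messages(jpeg_bytes: bytes, question: str) -> list:
    """Bundle the latest camera frame and a voice question into a
    GPT-4o message list using base64-encoded image content parts."""
    b64 = base64.b64encode(jpeg_bytes).decode("ascii")
    return [
        {"role": "system",
         "content": ("You are Echo, an assistive guide for a visually "
                     "impaired user. Describe obstacles briefly.")},
        {"role": "user",
         "content": [
             {"type": "text", "text": question},
             {"type": "image_url",
              "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
         ]},
    ]

# The actual request would then be something like (requires the
# `openai` package and an API key; not executed here):
#   from openai import OpenAI
#   reply = OpenAI().chat.completions.create(
#       model="gpt-4o", messages=build_echo_messages(frame, question))
```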
Challenges we ran into:
Diverging Git branches, getting ElevenLabs working with the ESP, switching from Gemini to OpenAI, hardware limitations (wires popping out of place), struggles connecting to the WebSocket, LLM response latency, and camera brightness, quality, and latency issues.
Accomplishments that we're proud of:
Eventually fixing our camera latency, abstracting the interactions between the user and the MVP (minimum viable prototype), neat wiring, tight overall integration of hardware, software, and firmware, and dramatically decreasing the latency of Echo's LLM response generation.
What we learned:
Half of our group are first-time hackathon attendees, so getting used to the format and project specifications was a challenge in itself. We also realized that the ESP32 motor controller could only be driven from one person's computer at a time (the target IP).
What's next for Echo Vest:
Potentially replacing the Python laptop backend with an easy-to-use mobile application, dramatically increasing the vest's applicability, accessibility, and overall efficiency for users.