Iris is a device that enables the blind and visually impaired to see through their other senses.
Designed to enhance spatial awareness and environmental understanding, Iris consists of a Raspberry Pi 4B, an Arduino Uno R3, an ultrasonic sensor, and a haptic vibration motor. The system captures real-time imagery with a Logitech USB camera and uses Google’s Gemini 2.5 Flash model to generate concise, easy-to-understand descriptions of the world around the user. These descriptions are then vocalized through LMNT’s cloud-based text-to-speech (TTS) service, producing a natural, human-sounding voice. The system goes a step further with haptic feedback: an Arduino-controlled motor vibrates at varying intensities based on the distance of nearby objects, replacing a white cane and better protecting the user from obstacles.
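The distance-to-vibration mapping can be sketched as follows. The 20–200 cm thresholds, the linear falloff, and the serial message format are illustrative assumptions, not Iris's actual calibration or protocol:

```python
# Hypothetical distance-to-vibration mapping (thresholds and PWM scale are
# illustrative assumptions, not Iris's actual calibration).
MIN_CM, MAX_CM = 20, 200   # full vibration at <=20 cm, motor off at >=200 cm

def distance_to_intensity(distance_cm: float) -> int:
    """Map an ultrasonic reading (cm) to an 8-bit PWM duty value (0-255)."""
    if distance_cm <= MIN_CM:
        return 255                       # obstacle very close: full strength
    if distance_cm >= MAX_CM:
        return 0                         # nothing nearby: motor off
    # Linear falloff between the two thresholds.
    return round(255 * (MAX_CM - distance_cm) / (MAX_CM - MIN_CM))

# The resulting byte could then be sent to the Arduino over USB serial (e.g.
# with pyserial: ser.write(bytes([distance_to_intensity(d)]))), where a sketch
# applies it to the motor pin with analogWrite().
```

On the Arduino side, a single-byte protocol like this keeps the fast haptic loop simple and avoids parsing overhead on the microcontroller.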
We developed Iris by breaking the project down into four parts: text-to-speech, serial-based haptic feedback, LLM-driven scene analysis, and image capture. Our major hurdles were coordinating asynchronous operations across components, reducing latency, and integrating real-time cloud APIs with resource-constrained hardware. The project taught us how to combine multimodal outputs into a cohesive system and how to optimize cloud AI services for edge devices. In the future, we hope to develop Iris into a wearable assistant with voice-activated commands, offline fallback models, and more sophisticated localization, making AI-driven awareness genuinely portable and personalized.
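The asynchronous-coordination problem above can be sketched with `asyncio`: the slow cloud round trip (camera frame → Gemini description → LMNT speech) runs concurrently with the fast ultrasonic/haptic loop, so a multi-second API call never blocks an obstacle alert. The stub coroutines below stand in for the real camera, API, and serial calls, with sleeps simulating their latencies; none of the names are from the actual codebase:

```python
import asyncio

async def describe_scene(frame: str) -> str:
    """Stub for the slow cloud round trip (Gemini description + LMNT TTS)."""
    await asyncio.sleep(0.05)            # simulated network latency
    return f"description of {frame}"

async def haptic_loop(readings, log):
    """Stub for the fast ultrasonic -> vibration loop; must never be blocked."""
    for distance in readings:
        log.append(f"haptic:{distance}")
        await asyncio.sleep(0.01)        # simulated sensor polling interval

async def vision_loop(frames, log):
    """Stub for the capture -> describe -> speak pipeline."""
    for frame in frames:
        log.append(await describe_scene(frame))

async def main():
    log = []
    # Run both loops concurrently: haptic ticks interleave with the cloud call
    # instead of waiting behind it.
    await asyncio.gather(
        haptic_loop([120, 80, 40], log),
        vision_loop(["frame0"], log),
    )
    return log
```

Running `asyncio.run(main())` shows all three haptic updates completing while the single scene description is still in flight, which is the behavior the real system needs.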
Built With
- 3dprinter
- arduino
- claude
- dell
- gemini
- hardware
- lmnt
- logitech
- python
- raspberry-pi
- software