Inspiration
Our friends Joe and Mike, who are visually impaired, inspired SightSense. Witnessing their daily struggles with navigating environments and accessing visual information motivated us to create a comprehensive AI-powered assistant to enhance their independence and quality of life.
What it does
SightSense is an AI vision assistant for the visually impaired. It offers object detection and location guidance, scene description, text reading, and real-time assistance. Users can vocally interact with the app to get information about their surroundings and locate specific objects.
How we built it
We developed SightSense with a FastAPI backend that integrates AI models for computer vision (YOLO, MediaPipe), language and scene understanding (SentenceTransformer, GPT-4 Vision), and text recognition (EasyOCR). We combined these technologies into a seamless experience triggered by voice commands; a simplified sketch of the detection endpoint follows.
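To make the architecture concrete, here is a minimal sketch of how a backend detection endpoint could look, assuming the Ultralytics YOLO package and a pretrained `yolov8n.pt` model. The endpoint name and response shape are illustrative, not our exact code.

```python
import io

from fastapi import FastAPI, UploadFile
from PIL import Image
from ultralytics import YOLO

app = FastAPI()
model = YOLO("yolov8n.pt")  # small pretrained model, loaded once at startup

@app.post("/detect")
async def detect(image: UploadFile):
    # Decode the frame uploaded by the mobile client.
    frame = Image.open(io.BytesIO(await image.read())).convert("RGB")
    result = model(frame)[0]  # YOLO inference on a single image
    return {
        "objects": [
            {
                "label": result.names[int(box.cls)],
                "confidence": float(box.conf),
                "box": [float(v) for v in box.xyxy[0]],  # x1, y1, x2, y2
            }
            for box in result.boxes
        ]
    }
```

The app can then speak the returned labels, or use the boxes to guide the user toward a specific object.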
Challenges
- Learning Swift for iOS app development
- Integrating multiple AI models efficiently
- Optimizing for real-time performance on mobile devices
- Designing an intuitive voice-based user interface (see the intent-matching sketch after this list)
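One way to keep a voice interface simple is to map a free-form command onto a small set of intents by embedding similarity. This is a hedged sketch using SentenceTransformer, which our stack already includes; the intent phrases, model name, and bare argmax are illustrative assumptions, not our exact implementation.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# Each intent is described by an example phrase; these are illustrative.
INTENTS = {
    "find_object": "find or locate a specific object for me",
    "describe_scene": "describe what is around me",
    "read_text": "read the text in front of me aloud",
}

intent_embeddings = model.encode(list(INTENTS.values()), convert_to_tensor=True)

def classify(command: str) -> str:
    """Return the intent whose description is most similar to the command."""
    query = model.encode(command, convert_to_tensor=True)
    scores = util.cos_sim(query, intent_embeddings)[0]
    return list(INTENTS)[int(scores.argmax())]

print(classify("where is my coffee mug"))  # -> "find_object"
```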
Accomplishments
- Successfully integrated complex AI models into a mobile app
- Created a user-friendly interface for visually impaired users
- Developed a system for guiding hand movements to objects (sketched below)
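The guidance idea can be illustrated in a few lines: compare a detected hand position (e.g. a MediaPipe wrist landmark) with the target object's bounding-box center and turn the offset into a spoken cue. The function name, coordinate convention, and tolerance below are illustrative, not our exact implementation.

```python
def guidance_cue(hand_xy, object_box, tolerance=0.05):
    """hand_xy: (x, y) in normalized [0, 1] image coordinates.
    object_box: (x1, y1, x2, y2), also normalized."""
    cx = (object_box[0] + object_box[2]) / 2
    cy = (object_box[1] + object_box[3]) / 2
    dx, dy = cx - hand_xy[0], cy - hand_xy[1]

    cues = []
    if abs(dx) > tolerance:
        cues.append("right" if dx > 0 else "left")
    if abs(dy) > tolerance:
        # Image y grows downward, so a positive dy means move the hand down.
        cues.append("down" if dy > 0 else "up")
    return "move " + " and ".join(cues) if cues else "your hand is on it"

# Example: hand at center, mug detected in the upper-right of the frame.
print(guidance_cue((0.5, 0.5), (0.7, 0.1, 0.9, 0.3)))  # -> "move right and up"
```

Speaking the cue after each frame gives the user a closed feedback loop until their hand reaches the object.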
What we learned
- Mobile app development with Swift
- AI model integration and optimization
- Accessibility design principles
- Collaborative problem-solving in a hackathon environment
What's next
- Improve accuracy and speed of object detection
- Expand language support for global accessibility
- Develop Android version of the app
- Incorporate user feedback for feature enhancements
- Explore partnerships with organizations for the visually impaired
For the detailed project report, visit the link.
