Inspiration

Our friends Joe and Mike, who are visually impaired, inspired SightSense. Witnessing their daily struggles with navigating environments and accessing visual information motivated us to create a comprehensive AI-powered assistant to enhance their independence and quality of life.

What it does

SightSense is an AI vision assistant for the visually impaired. It offers object detection and location guidance, scene description, text reading, and real-time assistance. Users interact with the app by voice to get information about their surroundings and to locate specific objects.

How we built it

We developed SightSense using FastAPI for the backend, integrating various AI models for computer vision (YOLO, MediaPipe), natural language processing (SentenceTransformer, GPT-4 Vision), and OCR (EasyOCR). We combined these technologies to create a seamless experience triggered by voice commands.
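As a rough illustration of the scene-description flow, the sketch below turns raw detector output (label/confidence pairs, such as one YOLO pass over a camera frame would yield) into a short sentence suitable for text-to-speech. The function name, input shape, and phrasing are illustrative, not the actual SightSense implementation.

```python
from collections import Counter

def describe_detections(detections, min_confidence=0.5):
    """Turn raw detector output into a short spoken sentence.

    `detections` is a list of (label, confidence) pairs, e.g. the
    labels from one object-detection pass over a camera frame.
    """
    labels = [label for label, conf in detections if conf >= min_confidence]
    if not labels:
        return "I don't see anything I recognize."

    counts = Counter(labels)  # preserves first-seen order (Python 3.7+)
    # Naive pluralization ("2 persons") keeps the sketch short.
    parts = [
        f"a {label}" if n == 1 else f"{n} {label}s"
        for label, n in counts.items()
    ]
    if len(parts) == 1:
        listing = parts[0]
    else:
        listing = ", ".join(parts[:-1]) + " and " + parts[-1]
    return f"I can see {listing}."
```

For example, `describe_detections([("person", 0.9), ("person", 0.8), ("cup", 0.7)])` yields "I can see 2 persons and a cup." In the real app a string like this would be handed to the speech-synthesis layer.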

Challenges

  • Learning Swift for iOS app development
  • Integrating multiple AI models efficiently
  • Optimizing for real-time performance on mobile devices
  • Designing an intuitive voice-based user interface

Accomplishments

  • Successfully integrated complex AI models into a mobile app
  • Created a user-friendly interface for visually impaired users
  • Developed a system for guiding hand movements to objects
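The hand-guidance idea above can be sketched as a small geometric check: compare a fingertip position (as MediaPipe hand landmarks provide) with a detected object's centre (as a YOLO bounding box provides) and speak a directional cue. This is a minimal sketch under the assumption of normalized [0, 1] image coordinates with the origin at the top-left; the function and cue wording are illustrative, not the shipped code.

```python
def guidance_direction(hand_xy, target_xy, tolerance=0.05):
    """Return a short spoken cue guiding the hand toward the target.

    Both points are (x, y) in normalized image coordinates: x grows
    rightward, y grows downward. Within `tolerance` on both axes,
    the hand is considered to be on the object.
    """
    dx = target_xy[0] - hand_xy[0]
    dy = target_xy[1] - hand_xy[1]
    horizontal = "right" if dx > tolerance else "left" if dx < -tolerance else ""
    vertical = "down" if dy > tolerance else "up" if dy < -tolerance else ""
    if not horizontal and not vertical:
        return "Your hand is on the object."
    cues = [c for c in (vertical, horizontal) if c]
    return "Move your hand " + " and ".join(cues) + "."
```

For instance, `guidance_direction((0.2, 0.8), (0.6, 0.3))` yields "Move your hand up and right." Re-running this check on every frame gives the user continuously updated guidance as the hand moves.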

What we learned

  • Mobile app development with Swift
  • AI model integration and optimization
  • Accessibility design principles
  • Collaborative problem-solving in a hackathon environment

What's next

  • Improve accuracy and speed of object detection
  • Expand language support for global accessibility
  • Develop Android version of the app
  • Incorporate user feedback for feature enhancements
  • Explore partnerships with organizations for the visually impaired

For the detailed project report, visit the link.
