Inspiration

-People with disabilities face many challenges navigating the world in their daily lives. These were amplified by the COVID-19 pandemic, which led to unique lockdown-related disparities for them.

-The challenges faced by the visually impaired in navigating the world, especially one in the midst of a global health crisis, were particularly under-addressed.

-When alone, blind people essentially have to memorise their environment and its features, including object locations, directions, etc., which makes navigating daily life an effortful task. It also exposes them to various hazards and can put them in precarious situations.

What it does

General Description

FIX6DSENSE.AI is an AI-powered, real-time virtual assistant that enables the visually impaired to see the world by hearing.

Features

  • Detection of 80 different object categories, each with a confidence level
  • Priority-based detection
  • Distance prediction & scaling algorithm
  • Live video feed from a mobile-phone camera
  • Speech-activated mode selection
  • Multi-threading for asynchronous voice warnings and feedback
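The priority-based detection and distance-scaling features above could be combined along these lines. This is a minimal sketch: the priority table, reference object heights, and focal length are illustrative assumptions, not the values used in the project, and the distance comes from a simple pinhole-camera approximation on the bounding-box height.

```python
# Illustrative sketch: rank detections by class priority, then by estimated
# distance, so the closest high-priority objects are announced first.
# All constants below are made-up examples, not the project's real values.

PRIORITY = {"person": 3, "car": 3, "chair": 2, "bottle": 1}  # higher = more urgent
REAL_HEIGHT_M = {"person": 1.7, "car": 1.5, "chair": 0.9, "bottle": 0.25}
FOCAL_PX = 800  # assumed camera focal length, in pixels

def estimate_distance(label, box_height_px):
    """Pinhole approximation: distance = real_height * focal / pixel_height."""
    return REAL_HEIGHT_M[label] * FOCAL_PX / box_height_px

def top_items(detections, k=3):
    """Keep the k detections with highest priority, nearest first within a tier."""
    ranked = sorted(
        detections,
        key=lambda d: (-PRIORITY[d["label"]], estimate_distance(d["label"], d["h"])),
    )
    return ranked[:k]

detections = [
    {"label": "bottle", "h": 100},
    {"label": "person", "h": 400},
    {"label": "chair", "h": 300},
    {"label": "car", "h": 200},
]
for d in top_items(detections):
    print(d["label"], round(estimate_distance(d["label"], d["h"]), 2))
```

With these example numbers, the person and car outrank the nearer bottle because of their higher priority tier.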

Modes

  • Aware mode: continuously speaks out loud the 3 highest-priority items
  • Warn mode: only speaks a warning when an item is too close, based on its priority
  • Search mode: focuses on searching for a particular item until it is found
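The speech-activated mode selection could dispatch between these modes roughly as follows. In the real app the transcript would come from the speech-recognition library; here it is just a string, and the command keywords are assumed examples.

```python
# Illustrative sketch: map a recognised speech transcript to one of the three
# modes. The keyword scheme here is an assumption for illustration only.

MODES = ("aware", "warn", "search")

def parse_command(transcript):
    """Return (mode, target); target is only meaningful for search mode."""
    words = transcript.lower().split()
    for mode in MODES:
        if mode in words:
            if mode == "search":
                # Everything after the keyword is treated as the item to find.
                idx = words.index("search")
                target = " ".join(words[idx + 1:]) or None
                return mode, target
            return mode, None
    return None, None  # unrecognised command: caller keeps the current mode

print(parse_command("switch to warn mode"))  # ('warn', None)
print(parse_command("search bottle"))        # ('search', 'bottle')
```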

How we built it

Day 1 (Friday, 30/09/2022)

-Analysed the social issues related to the topic (AI and Smart Nation) and came up with an appropriate problem statement.
-Brainstormed ideas for possible features and AI solutions, and various methods to implement them.

Day 2 (Saturday, 01/10/2022)

-Tested various models and chose the most appropriate one for our project's particular requirements.
-Integrated the audio output using synchronised multi-threading to reduce latency.
-Implemented the live mobile-phone camera feed with our ML model and the audio output.
-Added features such as the various modes for different purposes.

Day 3 (Sunday, 02/10/2022)

-Worked on integrating speech recognition for input commands, to control the various modes and object selection.
-Put everything together, tidied up the code, and rigorously tested the model.

Challenges we ran into

  • Installing the different packages with distinct compatibility versions and integrating all the packages together on different devices
  • Choosing the appropriate lightweight model that can both accurately and efficiently identify the objects
  • Implementing text-to-speech asynchronously, by creating multiple threads, without disrupting the main program flow
  • Implementing distance detection simultaneously for different object categories
  • Integrating different features implemented by different team members so that they can work well together
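The asynchronous text-to-speech challenge above is commonly solved with a dedicated worker thread draining a message queue, so the detection loop never blocks on audio. A minimal sketch: the real project would call a TTS engine such as pyttsx3 inside `speak()`; here `speak()` just records the message so the example stays self-contained.

```python
import queue
import threading

spoken = []

def speak(message):
    # Stand-in for a real TTS call, e.g. engine.say(message); engine.runAndWait()
    spoken.append(message)

def tts_worker(q):
    """Drain the queue on a background thread so the main loop never blocks."""
    while True:
        message = q.get()
        if message is None:  # sentinel: shut the worker down
            break
        speak(message)
        q.task_done()

q = queue.Queue()
worker = threading.Thread(target=tts_worker, args=(q,), daemon=True)
worker.start()

# The main detection loop enqueues warnings without waiting for audio to finish.
for msg in ["person ahead", "chair on the left"]:
    q.put(msg)

q.put(None)    # ask the worker to exit once the queue is drained
worker.join()  # wait for all queued speech to complete
print(spoken)
```

Keeping a single worker thread also guarantees warnings are spoken in order rather than talking over each other.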

Accomplishments that we're proud of

  • Successfully sped up the object detection process, which was initially rather slow and memory-hungry
  • Multiple modes for different use cases
  • Phone camera integration with real-time feedback
  • Integrating various features such as speech recognition, distance algorithm scaling, priority-based detection, etc. into a fully-fledged project
  • Overall, the project was more polished and worked much better than we initially expected

What we learned

We learned a lot of things throughout the entire event, both technical knowledge and soft skills.

Some of the interesting technical knowledge we gained is as follows:

  • Choosing the most appropriate model, then loading the best-trained checkpoint into our project
  • Loading different file formats such as .proto files, tarfiles, and .pbtxt files
  • Implementation of speech recognition and text-to-speech in Python
  • Multi-threading usage and implementation in Python
  • Connecting our mobile device and main program using a server
  • Research methods for depth sensing using single-camera, multi-camera, and laser-pattern approaches
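On the file-format point above: pre-trained TensorFlow detection models are often distributed as .tar.gz archives, which is where Python's stdlib tarfile module comes in. A small self-contained sketch (it builds a tiny in-memory archive with a hypothetical placeholder checkpoint file, so nothing needs to be downloaded):

```python
import io
import tarfile

def make_dummy_archive():
    """Build a tiny .tar.gz in memory with one placeholder 'checkpoint' file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        data = b"placeholder checkpoint bytes"
        info = tarfile.TarInfo(name="model/checkpoint")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf

def list_members(archive):
    """Open a .tar.gz stream and return the member names inside it."""
    with tarfile.open(fileobj=archive, mode="r:gz") as tar:
        return [m.name for m in tar.getmembers()]

print(list_members(make_dummy_archive()))  # ['model/checkpoint']
```

A real model archive would be opened the same way from disk with `tarfile.open(path, "r:gz")` before extracting the checkpoint and pipeline files.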

We also learned many skills around working and communicating in a team with different backgrounds and expertise: being open to new ideas, time management, persistence, and managing a workflow that integrates the features and parts contributed by different members.

What's next for FIX6DSENSE.AI

  • Implementation in wearables such as smart eyewear
  • Applications in SLAM (simultaneous localization and mapping) robotics, especially on edge devices
  • Integration with better depth-sensing software and hardware
  • Integration into IoT devices
  • Improved model speed and accuracy with a low-power AI SoC, with access to real-time feedback
  • Visual aid for the elderly, making them more self-reliant in a society with an increasing number of people living alone

Built With

  • ipwebcam
  • numpy
  • python
  • pyttsx3
  • speechrecognition
  • tarfile
  • tensorflow
  • threading
  • urllib