Inspiration

-People with disabilities face many challenges navigating the world in their daily lives. These were amplified by the COVID-19 pandemic, which led to unique lockdown-related disparities for them.

-The challenges faced by the visually impaired in navigating the world, especially one in the midst of a global health crisis, were particularly under-addressed.

-When alone, blind people essentially have to memorise their environment and its features, including object locations, directions, etc., which makes navigating daily life an effortful task. It also exposes them to various hazards and can put them in precarious situations.

What it does

General Description

FIX6DSENSE.AI is an AI-powered, real-time virtual assistant that enables the visually impaired to see the world by hearing.

Features

  • Detection of 80 different object categories, each with a confidence level
  • Priority-based detection
  • Distance prediction & scaling algorithm
  • Live video feed from a mobile-phone camera
  • Speech-activated mode selection
  • Multi-threading for asynchronous voice warnings and feedback
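The priority-based detection and distance-scaling features above could be combined along these lines. This is a minimal sketch: the priority table, reference object heights, and focal length are illustrative assumptions, not the values used in the project, and the distance comes from a simple pinhole-camera approximation on the bounding-box height.

```python
# Illustrative sketch: rank detections by class priority, then by estimated
# distance, so the closest high-priority objects are announced first.
# All constants below are made-up examples, not the project's real values.

PRIORITY = {"person": 3, "car": 3, "chair": 2, "bottle": 1}  # higher = more urgent
REAL_HEIGHT_M = {"person": 1.7, "car": 1.5, "chair": 0.9, "bottle": 0.25}
FOCAL_PX = 800  # assumed camera focal length, in pixels

def estimate_distance(label, box_height_px):
    """Pinhole approximation: distance = real_height * focal / pixel_height."""
    return REAL_HEIGHT_M[label] * FOCAL_PX / box_height_px

def top_items(detections, k=3):
    """Keep the k detections with highest priority, nearest first within a tier."""
    ranked = sorted(
        detections,
        key=lambda d: (-PRIORITY[d["label"]], estimate_distance(d["label"], d["h"])),
    )
    return ranked[:k]

detections = [
    {"label": "bottle", "h": 100},
    {"label": "person", "h": 400},
    {"label": "chair", "h": 300},
    {"label": "car", "h": 200},
]
for d in top_items(detections):
    print(d["label"], round(estimate_distance(d["label"], d["h"]), 2))
```

With these example numbers, the person and car outrank the nearer bottle because of their higher priority tier.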

Modes

  • Aware mode: continuously speaks out loud the 3 highest-priority items
  • Warn mode: only speaks a warning when an item is too close, based on its priority
  • Search mode: focuses on searching for a particular item until it is found
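The speech-activated mode selection could dispatch between these modes roughly as follows. In the real app the transcript would come from the speech-recognition library; here it is just a string, and the command keywords are assumed examples.

```python
# Illustrative sketch: map a recognised speech transcript to one of the three
# modes. The keyword scheme here is an assumption for illustration only.

MODES = ("aware", "warn", "search")

def parse_command(transcript):
    """Return (mode, target); target is only meaningful for search mode."""
    words = transcript.lower().split()
    for mode in MODES:
        if mode in words:
            if mode == "search":
                # Everything after the keyword is treated as the item to find.
                idx = words.index("search")
                target = " ".join(words[idx + 1:]) or None
                return mode, target
            return mode, None
    return None, None  # unrecognised command: caller keeps the current mode

print(parse_command("switch to warn mode"))  # ('warn', None)
print(parse_command("search bottle"))        # ('search', 'bottle')
```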

How we built it

Day 1 (Friday, 30/09/2022)

-Analysed the social issues related to the topic (AI and Smart Nation) and came up with an appropriate problem statement.
-Brainstormed ideas for possible features and AI solutions, and various methods to implement them.

Day 2 (Saturday, 01/10/2022)

-Tested various models and chose the most appropriate one for our project's particular requirements.
-Integrated the audio output using synchronised multi-threading to reduce latency.
-Implemented the live mobile-phone camera feed with our ML model and the audio output.
-Added features such as the various modes for different purposes.

Day 3 (Sunday, 02/10/2022)

-Worked on integrating speech recognition for input commands, to control the various modes and object selection.
-Put everything together, tidied up the code, and rigorously tested the model.

Challenges we ran into

  • Installing the different packages with distinct compatibility versions and integrating all the packages together on different devices
  • Choosing the appropriate lightweight model that can both accurately and efficiently identify the objects
  • Implementing text-to-speech asynchronously, by creating multiple threads, without disrupting the main program flow
  • Implementing distance detection simultaneously for different object categories
  • Integrating different features implemented by different team members so that they can work well together
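The asynchronous text-to-speech challenge above is commonly solved with a dedicated worker thread draining a message queue, so the detection loop never blocks on audio. A minimal sketch: the real project would call a TTS engine such as pyttsx3 inside `speak()`; here `speak()` just records the message so the example stays self-contained.

```python
import queue
import threading

spoken = []

def speak(message):
    # Stand-in for a real TTS call, e.g. engine.say(message); engine.runAndWait()
    spoken.append(message)

def tts_worker(q):
    """Drain the queue on a background thread so the main loop never blocks."""
    while True:
        message = q.get()
        if message is None:  # sentinel: shut the worker down
            break
        speak(message)
        q.task_done()

q = queue.Queue()
worker = threading.Thread(target=tts_worker, args=(q,), daemon=True)
worker.start()

# The main detection loop enqueues warnings without waiting for audio to finish.
for msg in ["person ahead", "chair on the left"]:
    q.put(msg)

q.put(None)    # ask the worker to exit once the queue is drained
worker.join()  # wait for all queued speech to complete
print(spoken)
```

Keeping a single worker thread also guarantees warnings are spoken in order rather than talking over each other.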

Accomplishments that we're proud of

  • Successfully sped up the object detection process, which was initially rather slow and memory-hungry
  • Multiple modes for different use cases
  • Phone camera integration with real-time feedback
  • Integrating various features such as speech recognition, distance algorithm scaling, priority-based detection, etc. into a fully-fledged project
  • Overall, the project was more polished and worked much better than we initially expected

What we learned

We learned a lot of things throughout the entire event, both technical knowledge and soft skills.

Some of the interesting technical knowledge we gained is as follows:

  • Choosing the most appropriate model, then loading the best-trained checkpoint into our project
  • Loading different file formats such as .proto files, tarfiles, and .pbtxt files
  • Implementation of speech recognition and text-to-speech in Python
  • Multi-threading usage and implementation in Python
  • Connecting our mobile device and main program using a server
  • Research methods for depth sensing using single-camera, multi-camera, and laser-pattern approaches
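On the file-format point above: pre-trained TensorFlow detection models are often distributed as .tar.gz archives, which is where Python's stdlib tarfile module comes in. A small self-contained sketch (it builds a tiny in-memory archive with a hypothetical placeholder checkpoint file, so nothing needs to be downloaded):

```python
import io
import tarfile

def make_dummy_archive():
    """Build a tiny .tar.gz in memory with one placeholder 'checkpoint' file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        data = b"placeholder checkpoint bytes"
        info = tarfile.TarInfo(name="model/checkpoint")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf

def list_members(archive):
    """Open a .tar.gz stream and return the member names inside it."""
    with tarfile.open(fileobj=archive, mode="r:gz") as tar:
        return [m.name for m in tar.getmembers()]

print(list_members(make_dummy_archive()))  # ['model/checkpoint']
```

A real model archive would be opened the same way from disk with `tarfile.open(path, "r:gz")` before extracting the checkpoint and pipeline files.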

We also learned many skills around working and communicating in a team with different backgrounds and expertise: being open to new ideas, time management, persistence, and managing a workflow that integrates the features and parts contributed by different members.

What's next for FIX6DSENSE.AI

  • Implementation in wearables such as smart eyewear
  • Applications in SLAM (simultaneous localization and mapping) robotics, especially on edge devices
  • Integration with better depth-sensing software and hardware
  • Integration into IoT devices
  • Improved model speed and accuracy with a low-power AI SoC, with access to real-time feedback
  • Visual aid for the elderly, making them more self-reliant in a society with an increasing number of people living alone

Built With

  • ipwebcam
  • numpy
  • python
  • pyttsx3
  • speechrecognition
  • tarfile
  • tensorflow
  • threading
  • urllib