Inspiration
Billions of dollars have been invested in the Metaverse for the entertainment and development of people who can see, but what about the people who cannot? Someone who was not visually impaired from birth and is passionate about arts and crafts can no longer learn about the artifacts when visiting parks, museums, art galleries, and so on. Moreover, the solutions currently available for visually impaired people, like the smart blind stick, can fail in conditions such as detecting potholes or descending stairs. Keeping that in mind, we thought about the problem and developed our solution.
What it does
Our solution, Vision, will be a pair of easy-to-wear ordinary sunglasses with a camera, speaker, and processing unit integrated into the frame. Through image and object recognition, a visually impaired person will be able to detect objects, people, and even obstacles, helping them visit art galleries, museums, and similar places with ease.
How we built it
For our prototype, we attached a computer webcam to a pair of sunglasses. In technical terms, the camera sends raw frames to an OpenCV parser, which passes them to our model. After the NanoDet Plus model processes the data, it draws the detected objects and prioritizes the best object according to the user's perspective. It then recites the detected objects to the user through the integrated speaker.
Challenges we ran into
We ran into a few problems making the hardware, software, and AI model work together. With very little time to train our own models, we used pre-trained ones. But since we had to deploy on a Raspberry Pi, we struggled with:
- low FPS with heavier models,
- low accuracy with lighter models.
- After trying MobileNet, EfficientNet, YOLO, and NanoDet Plus, we settled on the latter, which gave us about 5 FPS of single-shot detection.
Accomplishments that we're proud of
We are glad that we completed all the features we had planned for the prototype. On top of that, we added a feature that prioritizes the detected objects based on our algorithm. We also divided the field of view into five regions [Left-Top, Right-Top, Center, Left-Bottom, Right-Bottom] to better aid navigation for visually impaired users. We are proud that we identified a crucial problem and presented a solution to it.
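The five-region split of the field of view can be sketched like this. A minimal Python illustration, assuming normalized box-center coordinates; the size of the central zone (`center_frac`) is an assumed value, since the actual boundaries used in the prototype are not specified.

```python
def fov_region(cx: float, cy: float, center_frac: float = 0.4) -> str:
    """Map a normalized box center (cx, cy) in [0, 1] to one of the five
    FOV regions used for navigation hints.

    A central window of width/height `center_frac` (an assumed value)
    maps to "Center"; everything else falls into one of four quadrants.
    """
    half = center_frac / 2
    if abs(cx - 0.5) <= half and abs(cy - 0.5) <= half:
        return "Center"
    horiz = "Left" if cx < 0.5 else "Right"
    vert = "Top" if cy < 0.5 else "Bottom"
    return f"{horiz}-{vert}"
```

For instance, `fov_region(0.5, 0.5)` yields `"Center"`, while `fov_region(0.1, 0.2)` yields `"Left-Top"`, so the spoken output can tell the wearer roughly where each detected object lies.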
What we learned
We learned about different detection models, about debugging the solution as a whole [not only the code but also the idea], and about technical tools like React and Firebase.
What's next for Vision
Our model detects objects and converts them to audio, which is fed to an earpiece worn by the visually impaired user. Our next steps include describing a historical statue detected at a museum or a legendary artwork detected at an art gallery. As for the prototype, we will integrate all the hardware into the device itself, so the MVP will be wearable like ordinary smart glasses but with smart features. We have attached our prototype design of the MVP in the media.
After successfully implementing the features of the MVP, we will:
- Test the product in the real market,
- Collect user feedback and make upgrades to the project,
- When the product is market-ready, approach investors for mass production and marketing.
Built With
- c++
- earbuds
- firebase
- nanodetplus
- opencv
- raspberry-pi
- react
- usb-webcam