Inspiration
Nearly 1.2 billion people worldwide live with some form of visual impairment. COVID-19 has made their lives even more challenging, requiring social distancing while navigating surroundings with little to no help. Recent advances in computer vision and deep learning have opened pathways to address this problem. For our ShellHacks 2021 project, we wanted to build a web app that helps the visually impaired maintain social distance while guiding them to their destination. CloudVision uses computer vision to detect objects and people in the surroundings and provide information about each detection, e.g., the number of objects, each identified object, and its distance and direction relative to the user. It also sends a sound or haptic alert when a person is detected within a six-foot vicinity and gives directions toward where it is safe to move.
How we built it
Front End

The front end was developed by Frank using React. It presents multiple cards, each listing one of the app's functionalities, and each card is read out loud for accessibility. Once a card is selected, a REST call is made to the back end with the captured video and audio command attached (see the endpoint sketch below).

Object Detection

Laura, Paulina, and Usman worked on the object detection machine learning back end. Object detection and tracking are performed using the EfficientDet API and Google Cloud Video Intelligence. The API detects up to around 100 objects in each frame and returns those with confidence above 60%. Distance and direction are estimated from the orientation and coordinates of each detected object, which also drives the six-foot proximity alert (see the sketches below).
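A minimal sketch of what the back end's REST endpoint could look like in Flask. The route and form-field names ("frame", "command") are illustrative assumptions, not the project's actual API:

```python
import io

from flask import Flask, jsonify, request
from PIL import Image

app = Flask(__name__)

@app.route("/analyze", methods=["POST"])
def analyze():
    # The selected card sends a captured video frame plus the voice command.
    frame_file = request.files["frame"]        # e.g. a JPEG snapshot of the feed
    command = request.form.get("command", "")  # transcribed audio command

    image = Image.open(io.BytesIO(frame_file.read())).convert("RGB")

    # Here the frame would be handed to the detection pipeline (next sketch)
    # and the detections returned to the front end as JSON.
    return jsonify({"command": command, "frame_size": image.size})

if __name__ == "__main__":
    app.run(debug=True)
```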
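For the detection step, a sketch assuming the publicly available EfficientDet D0 model from TensorFlow Hub (the write-up only says "the EfficientDet API", so this exact model handle is an assumption). Detections below the 60% confidence cutoff are discarded:

```python
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub

# Assumed model: EfficientDet D0 from TensorFlow Hub.
detector = hub.load("https://tfhub.dev/tensorflow/efficientdet/d0/1")

def detect_objects(frame: np.ndarray, min_score: float = 0.60):
    """frame: HxWx3 uint8 RGB image; returns boxes/scores/classes above min_score."""
    batch = tf.expand_dims(tf.convert_to_tensor(frame, dtype=tf.uint8), 0)
    result = detector(batch)

    boxes = result["detection_boxes"][0].numpy()   # [ymin, xmin, ymax, xmax], normalized
    scores = result["detection_scores"][0].numpy()
    classes = result["detection_classes"][0].numpy().astype(int)

    keep = scores >= min_score                     # the 60% confidence cutoff
    return boxes[keep], scores[keep], classes[keep]
```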
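And a simplified take on the distance/direction estimate and the six-foot alert. The write-up says distance comes from the orientation and coordinates of the objects; the pinhole-camera approximation and the constants below are illustrative assumptions, not the team's exact method:

```python
import numpy as np

FOCAL_LENGTH_PX = 1000.0   # assumed camera focal length in pixels
PERSON_HEIGHT_FT = 5.5     # assumed average person height
ALERT_DISTANCE_FT = 6.0    # social-distancing threshold

def estimate_distance_ft(box: np.ndarray, frame_height: int) -> float:
    """Pinhole approximation: distance = real_height * focal / pixel_height."""
    ymin, _, ymax, _ = box                          # normalized coordinates
    pixel_height = max((ymax - ymin) * frame_height, 1.0)
    return PERSON_HEIGHT_FT * FOCAL_LENGTH_PX / pixel_height

def direction(box: np.ndarray) -> str:
    """Left/center/right based on where the box center falls in the frame."""
    _, xmin, _, xmax = box
    center_x = (xmin + xmax) / 2.0
    if center_x < 1 / 3:
        return "left"
    if center_x > 2 / 3:
        return "right"
    return "center"

def proximity_alert(box: np.ndarray, frame_height: int) -> bool:
    """True when a detected person is within the six-foot vicinity."""
    return estimate_distance_ft(box, frame_height) < ALERT_DISTANCE_FT
```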
What's next?
- Road recognition for navigation using the Google Maps API.
- Host the project on a Firebase instance.
- Improve the UI experience based on user feedback.
- Decrease the latency of frame recognition, which is currently slow.
Built With
- figma
- flask
- googleassistant
- javascript
- keras
- postman
- python
- react
- tensorflow
- voicerecognition