dist.ai

Gallery images: Minimum Spanning Tree With Mask Detection - Line Visualization; Mean-Shift Clustering - Group Visualization; Home Page; Log In Page; Sign Up Page; Diagram Side View; Diagram Top View

Inspiration

Covid-19 hit the world hard when it was least expected, and in early 2020 we were all plunged into the confusing and frightening times of the pandemic. With great innovations in mapping the Covid-19 genome and in establishing proper health and safety protocols, we are now at a stage where the gradual reopening of society is already underway. Our team has friends and family members working in convenience stores, grocery stores, and restaurants: we’ve heard their stories of frustration and worry about running small businesses, ensuring the safety of all customers and employees, and dealing with noncompliant customers. We’d love for society to return to a safer place, and with dist.ai, we can help companies enforce public health and safety protocols with the aid of artificial intelligence!

What it does

Our project does several things. First, it takes a video and detects all the people in it using a neural net with the YOLO architecture. Each detected person is marked by a dot, and two algorithms help visualize how these people are physically situated: a minimum spanning tree algorithm visualizes people standing in a line together, while a mean-shift clustering algorithm visualizes people standing in a group or “cluster” together. In addition, we wanted to detect individuals who are not wearing a mask, so we used TensorFlow.js to identify and count the people in the video frame who are not wearing one.

How we built it

We built a Flask web application that allows users to sign up for either corporate or personal use. We used a Google Cloud Firebase database to hold user and company information and to give access to uploaded and stored videos and images. This lets users easily access security footage from their organization and store the resulting visualizations in their databases. On the front end, we used OpenCV.js to draw circles and lines on our canvas.

We also used JavaScript to implement the two algorithms that visualize how people are physically organized in groups: a minimum spanning tree and mean-shift clustering. We also added a slider to adjust the radius for mean-shift clustering. Though a language like Python would have made these algorithms easier to implement, we chose JavaScript so that they would integrate more easily with TensorFlow.js.
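As an illustration of the minimum spanning tree step, here is a minimal sketch in plain JavaScript using Prim's algorithm. It is not the project's actual code: the function name `minimumSpanningTree`, the `{x, y}` point format, and the returned edge format are all assumptions made for this example.

```javascript
// Build a minimum spanning tree over 2D points with Prim's algorithm.
// points: array of {x, y} (assumed format, one entry per detected person).
// Returns n - 1 edges as {from, to, dist}, where from/to index into points.
function minimumSpanningTree(points) {
  const n = points.length;
  if (n < 2) return [];
  const inTree = new Array(n).fill(false);
  const best = new Array(n).fill(Infinity); // cheapest known link into the tree
  const parent = new Array(n).fill(-1);     // tree node that provides that link
  const edges = [];
  best[0] = 0; // start growing the tree from the first point

  for (let step = 0; step < n; step++) {
    // Pick the point outside the tree that is closest to it.
    let u = -1;
    for (let i = 0; i < n; i++) {
      if (!inTree[i] && (u === -1 || best[i] < best[u])) u = i;
    }
    inTree[u] = true;
    if (parent[u] !== -1) {
      edges.push({ from: parent[u], to: u, dist: best[u] });
    }
    // The new tree member may offer a shorter link to the remaining points.
    for (let v = 0; v < n; v++) {
      if (inTree[v]) continue;
      const d = Math.hypot(points[u].x - points[v].x, points[u].y - points[v].y);
      if (d < best[v]) {
        best[v] = d;
        parent[v] = u;
      }
    }
  }
  return edges;
}
```

Each returned edge can then be drawn as a line between two dots. This version runs in O(n²) time, which should be adequate for the number of people visible in a single frame.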

We were able to track each person’s body with a pre-built YOLO architecture (powered by TensorFlow), which we trained on human body datasets. Through OpenCV, it gave us the coordinates of each person on the screen.

We also added functionality for our TensorFlow.js neural network to detect non-masked people. We first used the Google Cloud Vision API to get the coordinates of all the faces in a video frame. We then passed these face images, in binary representation, through TensorFlow.js. Since the model was being loaded onto the web, we had to train it with a light dataset. Nevertheless, the loss decreased, and we ended up with a satisfactory model that did not overfit our data.
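The crop-then-classify flow described above can be sketched as follows. This is only an illustration: the frame is assumed to be a 2D array of grayscale values, the face boxes are assumed to be `{x, y, width, height}` objects, and `classify` is a hypothetical stand-in for the trained model that returns the probability that a face is masked. No real Cloud Vision or TensorFlow.js calls are shown.

```javascript
// Copy the pixels inside a bounding box out of a frame.
// frame: array of rows, each row an array of grayscale values (assumed format).
// box: {x, y, width, height} in pixels (assumed format).
function cropFace(frame, box) {
  return frame
    .slice(box.y, box.y + box.height)
    .map(row => row.slice(box.x, box.x + box.width));
}

// Count the faces that the classifier considers unmasked.
// classify: hypothetical callback, takes a crop and returns P(mask) in [0, 1].
function countUnmasked(frame, faceBoxes, classify, threshold = 0.5) {
  let unmasked = 0;
  for (const box of faceBoxes) {
    if (classify(cropFace(frame, box)) < threshold) unmasked++;
  }
  return unmasked;
}
```

Passing the classifier in as a callback keeps the counting logic independent of how the model is loaded or hosted.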

We used a function based on the position of the camera to determine the distance in 3D space between points in the image. We did this using the bottom of each detected person (their feet), since we know the feet all lie in the same plane. We then used the formula derived in the attached document to determine their relative positions in 3D space.

Challenges we ran into

Our first problem was tracking each person as they move across the screen. There were fancier methods, such as the Monte Carlo method, but their runtime was too high. We ended up tracking each person across the screen by hashing each person’s coordinates, calculating the Euclidean distance, calculating the dot product of the pixel values, and combining these into a closeness score.
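One way such a closeness score could work is sketched below. The write-up does not say how the distance and the dot product were combined, so the scoring formula here (normalized dot product minus a distance penalty), the greedy matching, the `maxDist` cutoff, and all function names are assumptions. The coordinate-hashing step is left out.

```javascript
// Normalized dot product of two equal-length pixel vectors (cosine similarity).
function cosineSimilarity(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Hypothetical score: appearance similarity minus a penalty for distance moved.
// Candidates farther away than maxDist are ruled out entirely.
function closeness(prev, cur, maxDist) {
  const dist = Math.hypot(prev.x - cur.x, prev.y - cur.y);
  if (dist > maxDist) return -Infinity;
  return cosineSimilarity(prev.pixels, cur.pixels) - dist / maxDist;
}

// Greedily give each new detection the ID of the best-scoring unused track,
// or a fresh ID (starting at nextId) when no track is close enough.
// tracks: [{id, x, y, pixels}], detections: [{x, y, pixels}] (assumed formats).
function matchDetections(tracks, detections, maxDist, nextId) {
  const used = new Set();
  return detections.map(det => {
    let bestTrack = null;
    let bestScore = -Infinity;
    for (const t of tracks) {
      if (used.has(t.id)) continue;
      const s = closeness(t, det, maxDist);
      if (s > bestScore) {
        bestScore = s;
        bestTrack = t;
      }
    }
    if (bestTrack === null) return { ...det, id: nextId++ };
    used.add(bestTrack.id);
    return { ...det, id: bestTrack.id };
  });
}
```

Greedy matching is much cheaper than sampling-based trackers, at the cost of occasional ID swaps when two people cross paths.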

The other challenge was getting the minimum spanning tree and mean-shift clustering algorithms to run efficiently and work well with our test videos. For instance, with mean-shift clustering, we had to make sure our algorithm did not produce duplicate clusters.
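To show where duplicate clusters come from and one way to avoid them, here is a minimal mean-shift sketch with a flat kernel. Every point is shifted to the mean of its neighbors until it settles, so points in the same group end up at nearly the same center; merging centers that lie within half a radius of each other removes the duplicates. The merge distance, iteration limits, and function name are assumptions for this example, not the project's implementation.

```javascript
// Flat-kernel mean-shift clustering of 2D points.
// points: array of {x, y}; radius: neighborhood size (the slider value).
// Returns {modes, labels}: distinct cluster centers and each point's cluster index.
function meanShift(points, radius, maxIter = 50, tol = 1e-3) {
  const modes = [];
  const labels = [];
  for (const p of points) {
    let cx = p.x;
    let cy = p.y;
    for (let iter = 0; iter < maxIter; iter++) {
      // Average all points within the radius of the current center.
      let sx = 0, sy = 0, count = 0;
      for (const q of points) {
        if (Math.hypot(q.x - cx, q.y - cy) <= radius) {
          sx += q.x;
          sy += q.y;
          count++;
        }
      }
      if (count === 0) break; // defensive guard against rounding edge cases
      const nx = sx / count;
      const ny = sy / count;
      const moved = Math.hypot(nx - cx, ny - cy);
      cx = nx;
      cy = ny;
      if (moved < tol) break; // the center has stopped moving
    }
    // Reuse an existing center if this one landed close to it (avoids duplicates).
    let label = modes.findIndex(m => Math.hypot(m.x - cx, m.y - cy) < radius / 2);
    if (label === -1) {
      modes.push({ x: cx, y: cy });
      label = modes.length - 1;
    }
    labels.push(label);
  }
  return { modes, labels };
}
```

A larger radius merges nearby groups into one cluster, while a smaller radius splits them apart, which is why exposing the radius as a slider is useful.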

Another issue is that the Euclidean distance between points in the image doesn’t represent their distance in 3D space. To find the distance in 3D space, we needed to derive a formula using the height, field of view, and initial angle of the camera. The derivation can be found in this document.
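The team's own derivation lives in the linked document and is not reproduced here. As a rough illustration of the idea, the sketch below uses a standard pinhole-camera projection onto a flat floor. It assumes square pixels, no lens distortion, a vertical field of view, and that the "initial angle" is the camera's downward tilt from horizontal. The function names and the `cam` fields are hypothetical.

```javascript
// Project an image pixel onto the floor with a pinhole-camera model.
// cam fields (all assumptions for this sketch):
//   camHeight: camera height above the floor, in metres
//   pitch:     downward tilt of the optical axis from horizontal, in radians
//   vFov:      vertical field of view, in radians
//   imgW/imgH: image size in pixels
// Returns {x, y} in metres on the floor (x to the right, y forward from the
// point directly below the camera), or null if the ray never hits the floor.
function pixelToGround(u, v, cam) {
  const f = (cam.imgH / 2) / Math.tan(cam.vFov / 2); // focal length in pixels
  const dx = u - cam.imgW / 2; // pixels right of the image center
  const dy = v - cam.imgH / 2; // pixels below the image center
  const sin = Math.sin(cam.pitch);
  const cos = Math.cos(cam.pitch);
  const down = dy * cos + f * sin; // downward component of the viewing ray
  if (down <= 0) return null;      // ray points at or above the horizon
  const t = cam.camHeight / down;  // scale at which the ray reaches the floor
  return { x: t * dx, y: t * (f * cos - dy * sin) };
}

// Floor distance in metres between two people, given the pixel position of
// their feet as {u, v}. Returns null if either point is not on the floor.
function groundDistance(footA, footB, cam) {
  const a = pixelToGround(footA.u, footA.v, cam);
  const b = pixelToGround(footB.u, footB.v, cam);
  if (a === null || b === null) return null;
  return Math.hypot(a.x - b.x, a.y - b.y);
}
```

For example, with a camera 3 m above the floor tilted 45° downward, the image center maps to a floor point about 3 m in front of the camera, since the distance along the floor is the height divided by the tangent of the tilt.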

Accomplishments that we're proud of

We’re really proud that we could create a platform that could assist companies of all sizes in tackling pandemic challenges regarding social distancing and enforcing masks. From large corporations who will have to ensure safety for hundreds of employees in large buildings to smaller companies that have less bandwidth to enforce social distancing among their customers and employees, our technology gives all companies what they need to safely open up their businesses during a pandemic.

Beyond the pandemic, we’re also proud that our technology can be customized to fit a company’s needs, from altering the target distances so that our algorithms can help track and prevent theft, to fine-tuning the distance calculations to include movement so that security officials can be alerted to assault and rape in a timely manner. Machine learning is truly a path to a safer future, and dist.ai is proud to have a diverse set of problems we can tackle.

What we learned

It was our first time working with OpenCV, and we learned how to make masks, use the Canny edge detector, and use Haar cascades. We also learned a lot about TensorFlow.js: how to create effective models, how to train a model, and how to host it on the website.

What's next for dist.ai

Currently, dist.ai handles distance tracking from uploaded videos. To deepen our impact in next-generation security, we need to be able to run our algorithms on real-time video footage from security cameras.

Built With

css
firebase
flask
google-cloud
google-cloud-vision-api
html
javascript
opencv
python
tensorflow.js

Updates

Brian Kim posted an update — Aug 02, 2023 08:06 AM EDT

It has been a long time... I've recently been discharged from the military. Online Version: https://distai.herokuapp.com/ (Using a lightweight neural network, not as good as YOLO)

Mo Kyn started this project — Jan 17, 2021 10:19 PM EST

Leave feedback in the comments!
