Inspiration

We want to protect the president! Protect the National Bank! Protect you! Our product aims to protect homes, offices, and any other place that needs to be kept secure. Just like in the movies, we use a camera to detect any unknown person who enters the protected area and a laser to point at that person. Let's get more accurate protection with laser aiming!

What it does

Our product, Zap-Zap, is a home security system that uses machine learning to raise an alert against people considered intruders, i.e., those not saved in the trusted-people database connected to the homeowner's account. It rings an alarm as soon as it detects an intruder. It also aims a laser at the intruder's eyes to temporarily stun them and give the people in the home time to react.

How we built it

We built a system that uses facial recognition to protect people’s homes. This project consisted of three main challenges: Coding/Machine Learning, mathematical modeling, and mechanical engineering.

Coding/Machine Learning

We used Django as the backend for this project. Users are able to create and edit profiles, as well as store pictures of trusted people in the database. The backend also supports password recovery by sending a reset email to the address associated with the user. We used SQLite to store the user profiles and trusted people.

We also handled the facial recognition of trusted people in the backend. To implement this, we used OpenCV and trained a machine learning model to recognize people's faces. Unfortunately, due to hardware limitations, such as the cameras we had available, we were unable to run this facial recognition live, as the program ran too slowly. However, we did successfully train and implement the model; the code can be found under backend/zap_zap/video_processing in detection_recognition.py and face_recognition_training.py. We also trained our model to identify people's eyes, which let us implement shooting a laser into the intruder's eyes. We conveyed the eye location to the Raspberry Pi using socket messages. Without the hardware limitations, we would have used the function at the bottom of detection_recognition.py, find_person(), to shoot the laser only at intruders and let trusted people bypass the system, by pairing the eye-locator function with the facial recognition function.

Finally, we started building a frontend for the website using React. We communicated from the backend to the frontend using endpoints, but did not fully wire this up on the frontend side.
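Since the eye location had to travel from the vision code to the Raspberry Pi as socket messages, a tiny, fixed-size wire format keeps the receiving end simple. Here is a minimal sketch of how such messages might be packed and unpacked; the exact format (two little-endian floats) is our own assumption for illustration, not the project's actual protocol:

```python
import struct

# Hypothetical wire format: two little-endian 32-bit floats per message,
# the pan and tilt angles (in degrees) for the two laser motors.
MSG_FORMAT = "<ff"

def pack_angles(pan_deg: float, tilt_deg: float) -> bytes:
    """Serialize the two motor angles for sending over a socket."""
    return struct.pack(MSG_FORMAT, pan_deg, tilt_deg)

def unpack_angles(payload: bytes) -> tuple:
    """Deserialize a message on the Raspberry Pi side."""
    pan_deg, tilt_deg = struct.unpack(MSG_FORMAT, payload)
    return pan_deg, tilt_deg
```

A fixed-size format like this also means the Pi can read exactly `struct.calcsize(MSG_FORMAT)` bytes per message and never has to worry about framing.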

Mathematical Modeling

We applied extensive linear algebra to transform 2-D images into 3-D space. To aim at a desired point in 3-D space, we need that point's coordinates with respect to the laser. However, a single camera cannot give an exact location: a 2-D picture only gives us the equation of a line, expressed with respect to the camera, that passes through the target point. Therefore, to get the precise location of a point (i.e., the intruder's eyes), we used two cameras to observe the point from two positions. This gives us two line equations that pass through the same target point. We solve these equations for their intersection to get the point's coordinates with respect to the main camera. Next, we transformed the point into the laser's coordinate system and used our derived formula to get the two angles, one for each motor, that control the laser's position.
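The intersection step above can be sketched in code. Because real measurements are noisy, the two rays rarely cross exactly, so a standard trick is to take the midpoint of the shortest segment between them; the angle convention in `laser_angles` (pan in the x-z plane, tilt up from it) is our own illustrative assumption, not the project's exact derivation:

```python
import numpy as np

def line_intersection(p1, d1, p2, d2):
    """Least-squares closest point between two 3-D lines p_i + t*d_i.

    With noisy camera data the lines rarely intersect exactly, so we
    return the midpoint of the shortest segment joining them."""
    d1 = np.asarray(d1, float) / np.linalg.norm(d1)
    d2 = np.asarray(d2, float) / np.linalg.norm(d2)
    p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
    # Minimize |(p1 + t1*d1) - (p2 + t2*d2)| over t1, t2.
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    w = p1 - p2
    denom = a * c - b * b  # zero only if the lines are parallel
    t1 = (b * (d2 @ w) - c * (d1 @ w)) / denom
    t2 = (a * (d2 @ w) - b * (d1 @ w)) / denom
    return ((p1 + t1 * d1) + (p2 + t2 * d2)) / 2

def laser_angles(target):
    """Convert a 3-D point (in the laser's frame) to two motor angles.

    Assumed convention: pan measured in the x-z plane, tilt up from it."""
    x, y, z = target
    pan = np.degrees(np.arctan2(x, z))
    tilt = np.degrees(np.arctan2(y, np.hypot(x, z)))
    return pan, tilt
```

For example, rays from two camera positions that both pass through the same target recover that target's coordinates, which then feed straight into the angle formula.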

Mechanical Engineering

To control the laser in 3-D space, we set up two motors: one spinning in the x-y plane, the other in the y-z plane. We used tape and glue to stabilize the laser setup and mount it on our camera box. For ease of calculation, we made the center of one of the motors the origin, which reduced the number of variables in our formulas. We also made sure the laser can sweep freely through 3-D space without interfering with other equipment.
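Once each motor's angle is known, driving it from the Raspberry Pi typically comes down to a PWM duty-cycle conversion. A minimal sketch, assuming a generic 50 Hz hobby servo whose 0-180 degree range maps to 0.5-2.5 ms pulses; the actual motors in our rig may need different calibration:

```python
def angle_to_duty_cycle(angle_deg: float) -> float:
    """Map a servo angle (0-180 degrees) to a PWM duty-cycle percentage.

    Assumes a typical 50 Hz hobby servo where 0-180 degrees corresponds
    to 0.5-2.5 ms pulses, i.e. 2.5%-12.5% duty. Real motors vary, so
    this linear map would need tuning against the hardware.
    """
    if not 0.0 <= angle_deg <= 180.0:
        raise ValueError("angle must be within 0-180 degrees")
    return 2.5 + (angle_deg / 180.0) * 10.0
```

On the Pi, the returned value would be handed to the GPIO library's PWM channel (e.g. RPi.GPIO's `ChangeDutyCycle`).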

Challenges we ran into

Code

Regarding machine learning, being exposed to OpenCV and endpoints/nested serializers in one day was definitely difficult. There was a huge learning curve in figuring out how to make OpenCV work, especially once we added the second camera. One major challenge was the coordinate system. Our program works by sending the Raspberry Pi the angles the laser must move through to reach the intruder's eyes. To do this, we had to transform the coordinates we got from OpenCV into those angles, and we did not realize that OpenCV reported the eye coordinates relative to the detected face's bounding box, not relative to the whole frame. As a result, we had a hard time getting the laser to move far enough. Further, computing facial recognition, especially while streaming video from multiple cameras, was so computationally expensive that after spending a while trying to optimize it, we eventually had to let it go. Hopefully, with better equipment in the future, this will be less of a problem. Minimizing lag in the video processing was also an issue. Finally, we ran out of time to fully connect the frontend and backend through endpoints, so we were not able to deploy the site.
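That coordinate-system bug boils down to one missing offset: when an eye detector is run on the cropped face region, its boxes are relative to that region's top-left corner, so the face origin has to be added back in. A minimal sketch of the fix, assuming OpenCV's usual `(x, y, w, h)` box layout:

```python
def eye_to_frame_coords(face_box, eye_box):
    """Convert an eye box reported relative to a detected face ROI
    into full-frame coordinates.

    Boxes are (x, y, w, h) tuples, as OpenCV cascade detectors return.
    The bug we hit was forgetting to add the face ROI's origin (fx, fy)
    back to the eye coordinates before computing laser angles.
    """
    fx, fy, _, _ = face_box
    ex, ey, ew, eh = eye_box
    return (fx + ex, fy + ey, ew, eh)
```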

Math

Putting theoretical math into practice was extremely daunting at first. To start, there are numerous real-world uncertainties. For example, when calculating the spinning angle, the motor's 180-degree rotation is not perfectly accurate due to equipment error, so we had to account for this small error in our calculations. Secondly, there are extra factors to consider in real life, like the offset between the motor's pivot and the laser, the distance error between the cameras, etc. Thirdly, some of the equations from the theoretical derivation are very complicated, since we kept everything symbolic rather than numeric so the formulas could be coded as functions. Therefore, deriving, evaluating, and coding these equations was a challenge. Even after deriving the theoretically "correct" formulas, we still had to do a lot of fine-tuning to make things actually work.

Mech

One of the hardest parts of the mechanical work was assembling all the required materials while maintaining accuracy and performance. For example, after setting up the two motors with the laser and trying to install them on the camera box, we found that part of the mount obstructed the laser's movement. Therefore, we cut off a portion of the box to let the laser travel through its entire range of motion.

Accomplishments that we're proud of

Code

We are definitely proud of our feature that detects people and their eyes. We were also able to recognize the eyes and faces of trusted people. While this feature is yet to be wired into the actual app, we are proud to have trained the model, which can be seen in backend/zap_zap/video_processing/trainer/trainer.yml. Further, we successfully moved the laser, which involved a lot of coordinate-transformation math. We were also happy to code the alarm that rings when a person is detected. Finally, setting up a robust backend was gratifying.

What we learned

This project had many new components for all of us. For starters, none of us are mechanical engineering majors, so building the physical system was challenging. We also had to learn a lot while wiring the cameras, lasers, and alarms to the Raspberry Pi. The project also let us develop our skills in Django, React, JavaScript, and Python as we learned new concepts within them (e.g., endpoints). Finally, the machine learning model was completely foreign, but we enjoyed implementing it :)

What's next for Zap-Zap

  • Optimizing the technology: There is definitely an efficiency problem in our project. Computing facial recognition for every frame is very hard for our machines to handle and sometimes takes a long time, so parallel computing seems to be one of the best options for improving performance. Further, investing in better hardware would significantly reduce video lag.
  • Implementing more features: There were many more features we wanted to implement. For example, we would love to quickly contact emergency services and loved ones, maybe even with a personalized message for each contact. Along with this, we want to fully implement the trusted-people feature.
  • Improving the interface: Our website is still lacking in design and in features that we would like to add in the future.
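Short of full parallel computing, one cheap way to cut the lag from per-frame recognition is to let the recognizer always consume the newest frame and silently drop stale ones. A minimal sketch of a size-one buffer for a single capture thread feeding a single recognizer thread (the class is our own illustration, not project code):

```python
import queue

class LatestFrame:
    """A size-one frame buffer: the capture thread overwrites, the
    recognizer thread always gets the newest frame, and any frame the
    recognizer was too slow to consume is silently dropped."""

    def __init__(self):
        self._q = queue.Queue(maxsize=1)

    def put(self, frame):
        # Discard the previous frame if it hasn't been consumed yet.
        try:
            self._q.get_nowait()
        except queue.Empty:
            pass
        self._q.put(frame)

    def get(self):
        # Blocks until a frame is available.
        return self._q.get()
```

With this pattern the recognizer's slowness no longer backs up the capture pipeline; the video simply skips frames instead of lagging further and further behind.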