Inspiration
During COVID, some of our team became interested in yoga and meditation. With hundreds, if not thousands, of yoga poses out there, we wanted to develop an application that can detect different poses. We also wanted to make exercising fun: the app "gamifies" working out so it doesn't feel like a chore. This app is a proof of concept and can easily be adapted to other forms of exercise. For instance, it could count pushups or situps, and with a few modifications to the model it could help players with their form in sports such as baseball (e.g. throwing a pitch) or basketball (shooting a three-pointer or a free throw).
What it does
This app captures pictures from the user's webcam as they perform various yoga poses and lets them submit each picture to an API, which returns the name of the pose performed.
How we built it
To build this application, we used Google's MediaPipe library to take images captured from the user's webcam and find landmarks on the joints of the user's body. This is a common Computer Vision task known as "keypoint detection". We first initialized our model using methods from the MediaPipe pose detection library. Because OpenCV captures frames in BGR, we converted each image from BGR to RGB so it could be used by the model. The image is then processed, and we wrote a function that checks whether the processed image contains the landmarks (i.e. the different joints and body parts). If it does, we use the library to draw the pose landmarks onto the image. To perform the pose classification, we created a function that takes the landmarks stored from the image and checks the angles made by the joints, using angle heuristics to classify each pose. We then implemented a small API using FastAPI, exposing an HTTP POST endpoint that takes an image and returns a JSON object with a single field, "Pose". Finally, we connected the API to a React.js frontend that lets the user take photos from their webcam and upload them. On the front end, we used the "react-webcam" library to access the user's webcam and capture each image as a base64-encoded string.
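The joint-angle heuristic at the core of the classifier can be sketched as below. This is a simplified illustration rather than our exact implementation: `joint_angle` and `classify_pose` are hypothetical names, landmarks are assumed to be `(x, y)` tuples keyed by joint name (as MediaPipe's normalized coordinates can be stored), and only a single "T pose" rule is shown.

```python
import math

def joint_angle(a, b, c):
    """Angle in degrees at point b, formed by the segments b->a and b->c.
    Each point is an (x, y) tuple, e.g. a normalized landmark coordinate."""
    ang = math.degrees(
        math.atan2(c[1] - b[1], c[0] - b[0])
        - math.atan2(a[1] - b[1], a[0] - b[0])
    )
    ang = abs(ang)
    return 360 - ang if ang > 180 else ang

def classify_pose(lm):
    """Toy heuristic over a dict of named landmarks (hypothetical keys)."""
    left_elbow = joint_angle(lm["left_shoulder"], lm["left_elbow"], lm["left_wrist"])
    right_elbow = joint_angle(lm["right_shoulder"], lm["right_elbow"], lm["right_wrist"])
    # T pose: both arms roughly straight, elbow angle near 180 degrees.
    if 160 <= left_elbow <= 200 and 160 <= right_elbow <= 200:
        return "T Pose"
    return "Unknown Pose"
```

A real classifier would combine several such angle checks (elbows, shoulders, knees, hips) per pose, with tolerance bands tuned per joint.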
Challenges we ran into
The biggest challenge we ran into was with FastAPI. It was difficult for us to convert the base64-encoded image string back into a normal .jpeg or .png image. Near the end, however, we developed a method that decodes the string into the correct format so the image can be passed in the body of the POST request.
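As an illustration (not our exact code), the decoding step can be sketched with Python's standard `base64` module; `data_url_to_bytes` is a hypothetical helper name, assuming the frontend sends either a bare base64 string or a full `data:image/...;base64,` data URL:

```python
import base64

def data_url_to_bytes(data_url: str) -> bytes:
    """Strip an optional 'data:image/...;base64,' prefix and decode the payload."""
    _, sep, payload = data_url.rpartition(",")
    return base64.b64decode(payload if sep else data_url)

# Inside a FastAPI handler, the raw bytes could then be turned into an image
# array, e.g. cv2.imdecode(np.frombuffer(raw, np.uint8), cv2.IMREAD_COLOR).
```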
Accomplishments that we're proud of
We are proud of using a branch of AI (Computer Vision) for the first time and still delivering a working application on time.
What we learned
We learned a lot about keypoint detection, Computer Vision, and building an API. We also learned about libraries such as MediaPipe, OpenCV, and react-webcam, which made our project possible.
What's next for FormPerfect AI
In the future, we would like to run pose detection on a live video feed so it works in real time and the user does not have to click "Capture" to take a photo. We would also like to add a leaderboard system and an accuracy percentage to guide the user on how close their yoga pose is to the ideal form.
Built With
- css
- fastapi
- html
- javascript
- mediapipe
- python
- react.js