Inspiration

I had never used anything from Google Cloud, and I knew that plenty of exciting projects are built on it, so I decided to try it out. I did not want to write a machine learning program myself, so I settled on image processing with the Google Cloud Vision API. I also have very little experience with image processing, which was one more reason to try it and learn.

What it does

It takes a folder of pictures (memes) and a single image (not a meme), and turns the original image into a meme-image amalgamation.

How I built it

First, the program sends a request to the API to pull out multiple objects and labels from each image. It then takes the returned areas (rectangles) and attempts to separate the foreground from the background. The area of each rectangle is compared with the areas of the other rectangles in the same image to estimate which objects are most important. For example, when an image contains several people, the program tries to replace only the forward-most people.

I wanted to limit myself and not use external machine learning libraries for the segmentation, so I did it by sampling points on the edges and checking the surrounding points. This is definitely not the most effective way to separate the background from the foreground, but it was sufficient for this project. To help line up the memes, the program takes the center of the foreground of each object. Taking the center of each rectangle might have been sufficient, but I had already attempted another way of handling segmentation and had that code available.

Next, the application attempts to remove overlapping objects. The requests return a rectangle for each object in the image, but these rectangles may overlap. For example, the API may detect "person" as well as "pants". This is undesirable because I want only a single meme for a single large object. It is supposed to be an amalgamation, but not THAT amalgamat-y.
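The size comparison and overlap removal described above can be sketched in a few lines of plain Python. This is only an illustration, not the project's actual code: the `Box` class, the function names, and the two thresholds (`min_ratio`, `max_covered`) are assumptions made for the example, and the rectangles are assumed to have already been converted to pixel coordinates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle for one detected object (hypothetical helper)."""
    label: str
    left: float
    top: float
    right: float
    bottom: float

    @property
    def area(self):
        return max(0.0, self.right - self.left) * max(0.0, self.bottom - self.top)

    @property
    def center(self):
        return ((self.left + self.right) / 2, (self.top + self.bottom) / 2)


def overlap_area(a, b):
    """Area of the intersection of two boxes (0 if they do not touch)."""
    width = min(a.right, b.right) - max(a.left, b.left)
    height = min(a.bottom, b.bottom) - max(a.top, b.top)
    return max(0.0, width) * max(0.0, height)


def drop_overlapping(boxes, max_covered=0.5):
    """Visit boxes from largest to smallest and skip any box that is
    mostly covered by a box that was already kept (e.g. "pants" inside "person")."""
    kept = []
    for box in sorted(boxes, key=lambda b: b.area, reverse=True):
        covered = any(
            box.area > 0 and overlap_area(box, k) / box.area > max_covered
            for k in kept
        )
        if not covered:
            kept.append(box)
    return kept


def keep_prominent(boxes, min_ratio=0.5):
    """Keep boxes whose area is at least min_ratio of the largest box's area,
    as a rough stand-in for "closest to the camera"."""
    if not boxes:
        return []
    largest = max(b.area for b in boxes)
    return [b for b in boxes if b.area >= min_ratio * largest]


detections = [
    Box("Person", 100, 50, 300, 450),   # large person in front
    Box("Pants", 120, 250, 280, 440),   # lies inside the first person
    Box("Person", 400, 200, 450, 300),  # small person in the back
]
targets = keep_prominent(drop_overlapping(detections))
for box in targets:
    print(box.label, box.center)
```

With these made-up detections, "Pants" is dropped because it lies entirely inside the large "Person" rectangle, and the small person in the back is dropped by the area comparison. The rectangle center is used here for simplicity; the write-up uses the center of the segmented foreground instead.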

Finally, the plan was to take the objects detected in the meme images and place them into the foreground-separated regions of the original image.
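One way this planned placement step could look is sketched below. It is a hypothetical illustration, not code from the project: images are assumed to be nested lists of pixel values indexed as `image[y][x]`, the foreground mask is assumed to be a same-sized grid of booleans, and `paste_centered` is a name invented for the example.

```python
def paste_centered(base, mask, patch, center):
    """Return a copy of base with patch pasted around center,
    writing only where mask is True (the foreground).

    base:   rows of pixels, indexed base[y][x]
    mask:   rows of booleans with the same shape as base
    patch:  rows of pixels cut from a meme image
    center: (x, y) point in base where the middle of the patch should land
    """
    height, width = len(base), len(base[0])
    patch_h, patch_w = len(patch), len(patch[0])
    cx, cy = center

    # Top-left corner of the patch in base coordinates.
    x0 = int(round(cx - patch_w / 2))
    y0 = int(round(cy - patch_h / 2))

    result = [list(row) for row in base]  # leave the input untouched
    for py in range(patch_h):
        for px in range(patch_w):
            x, y = x0 + px, y0 + py
            inside = 0 <= x < width and 0 <= y < height
            if inside and mask[y][x]:
                result[y][x] = patch[py][px]
    return result


# Tiny example: a 4x4 "image" of zeros and a 2x2 meme patch.
base = [[0] * 4 for _ in range(4)]
mask = [
    [False, False, False, False],
    [False, True,  True,  False],
    [False, True,  False, False],
    [False, False, False, False],
]
patch = [[1, 2], [3, 4]]
combined = paste_centered(base, mask, patch, center=(2, 2))
```

Pixels outside the mask keep their original value, so the background of the original image stays visible around the pasted meme.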

Challenges I ran into

Getting used to the Google Cloud API took a while, and I also ran into many difficulties setting it up to the point where I could actually send requests to their servers. Unfortunately, due to time constraints, I was unable to finish the project: the last step, combining the images, was never completed.

Accomplishments that I'm proud of

Learning how to use Google Cloud
Manipulating image files

What I learned

How to use Google Cloud
Basic image processing

What's next for Meme Imagery

To make the result more precise, the segmentation could use machine learning to separate the foreground from the background more accurately.
