Goggles

The faces in both of these images have 4x the resolution of the surrounding image!
Low-res
Hi-res

Inspiration

Super-resolution for general images (increasing the resolution of images through ML) is difficult and expensive. A single model that handles everything from cartoons to animal photography, from landscapes to Zoom calls, is still years away. Further, as images get larger, inference time grows at least quadratically, which makes current methods impractical for large frames.

This is where Goggles comes in. Because of the pandemic, people have been relying more and more on video communication for 'face-to-face' interaction. Goggles aims to improve the quality of these valuable channels, in both enjoyment and resolution.

What it does

Goggles takes advantage of two fairly well-established computer vision problems: face detection and facial super-resolution.

Instead of pushing a full image into a super-resolution model, we first detect where the faces are. We then crop these individual faces and apply an efficient facial super-resolution model to the faces alone. Finally, we paste the upscaled faces back into the low-resolution background.

This gives all the benefit of a higher-resolution image without the difficulty of creating a general, full-image super-resolution model.
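The crop, upscale, and paste steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the `(x, y, w, h)` box format, and the nearest-neighbour upscaling of the background are all assumptions, and `face_sr` is a placeholder for whatever facial super-resolution model is used.

```python
import numpy as np

SCALE = 4  # the write-up reports 4x resolution on the faces


def upscale_nearest(img, scale=SCALE):
    """Cheap nearest-neighbour upscaling (assumed stand-in for any simple resize)."""
    return np.repeat(np.repeat(img, scale, axis=0), scale, axis=1)


def enhance_faces(frame, boxes, face_sr, scale=SCALE):
    """Upscale the whole frame cheaply, then overwrite each face region
    with the output of the expensive face super-resolution callable.

    frame:   H x W x C uint8 array (low resolution)
    boxes:   iterable of (x, y, w, h) in low-res pixel coordinates
    face_sr: hypothetical callable mapping an h x w x C crop to an
             (h*scale) x (w*scale) x C crop
    """
    out = upscale_nearest(frame, scale)
    for x, y, w, h in boxes:
        crop = frame[y:y + h, x:x + w]      # low-res face
        hi_res = face_sr(crop)              # expensive model runs on the crop only
        out[y * scale:(y + h) * scale,
            x * scale:(x + w) * scale] = hi_res
    return out
```

Only the pixels inside the face boxes go through the expensive model; everything else gets the cheap resize, which is where the savings come from.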

How we built it

Most of the work was building the infrastructure around two models: the Haar Cascades face detection algorithm built into OpenCV, and the Face-Super-Resolution model based on the ESRGAN super-resolution architecture, with

From there, we connected the pipeline to the webcam and ported it to Google Colab to take advantage of its GPUs.

Challenges we ran into

The main difficulty is making it fast enough for real-time use. I was able to cut a lot of the fluff from the original model's pre- and post-processing, but inference time is still a big hurdle.
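One simple way to see how far a pipeline is from real time is to compare its average per-frame latency against a frame budget. The helper names and the 30 fps target below are assumptions for illustration; `process_frame` stands in for the full detect-and-upscale step.

```python
import time


def mean_latency_ms(process_frame, frame, n_runs=20, warmup=3):
    """Average wall-clock time per call of process_frame(frame), in milliseconds."""
    for _ in range(warmup):          # let caches and lazy initialisation settle
        process_frame(frame)
    start = time.perf_counter()
    for _ in range(n_runs):
        process_frame(frame)
    return (time.perf_counter() - start) * 1000.0 / n_runs


def meets_budget(latency_ms, target_fps=30):
    """True if one frame fits inside the per-frame budget for target_fps."""
    return latency_ms <= 1000.0 / target_fps
```

At 30 fps the budget is about 33 ms per frame. When the model runs on a GPU, execution may be asynchronous, so the device usually needs to be synchronised before reading the clock for the numbers to be meaningful.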

Accomplishments that we're proud of

It works!

What we learned

You can do a lot of cool ML work without ever needing to train a model!

What's next for Goggles

The next step would be to try to distill the ESRGAN into something smaller with similar accuracy. The ultimate goal is for it to be fast enough to work in real time with one's webcam.

Updates

Alexander Kristoffersen started this project — Feb 21, 2021 03:30 AM EST

Leave feedback in the comments!
