Final project as part of Technion's IEM 097215 "Deep Learning for NLP" & EE 046211 "Deep Learning" 🌠.
Implemented in PyTorch 🔥.
Animation by @rizkiarm.
In this project we combine the BlazeFace face-detection algorithm with the Transformer architecture, and achieve near-SOTA performance on the GRID dataset with very fast training and inference.
Here is a short explanation of the structure of this repository:
- `videos/[speaker_id]` and `alignments/[speaker_id]` contain the raw data from the GRID dataset: videos and word alignments, respectively.
- `npy_landmarks` and `npy_alignments` contain the processed videos and alignments. The pre-processing is done automatically by running `preprocess.py`. The pre-processing mechanism itself is split between `Video.py`, which pre-processes the videos, and `Annotation.py`, which pre-processes the alignments.
- `dataloader.py` contains data loaders for both training and testing, as well as a tokenizer which prepares the data for the transformer. Tokenization is done using `vocab.txt`, which contains all the possible tokens, plus the special `<pad>`, `<sos>` and `<eos>` tokens.
- `model.py` contains our architecture, divided into the Transformer and an additional Landmarks Neural Net module.
- `run.py` is the main file of our project. It trains the architecture and then generates predictions on unseen test samples.
- `config.py` contains all the constants and hyper-parameters used in the project.
- Finally, `inference.py` is used to make predictions using the pre-trained models.
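As a rough sketch of how such a word-level tokenizer might work (this is an illustrative example, not the actual code in `dataloader.py`; the class name, the vocabulary handling, and the sample GRID-style sentence are all assumptions based on the description above):

```python
# Illustrative word-level tokenizer sketch (hypothetical; the real
# implementation lives in dataloader.py and reads vocab.txt).
class Tokenizer:
    def __init__(self, words):
        # Special tokens first, then the regular vocabulary.
        self.vocab = ["<pad>", "<sos>", "<eos>"] + list(words)
        self.word2idx = {w: i for i, w in enumerate(self.vocab)}

    def encode(self, sentence):
        # Wrap the sentence with <sos>/<eos> markers for the transformer.
        tokens = ["<sos>"] + sentence.split() + ["<eos>"]
        return [self.word2idx[w] for w in tokens]

    def decode(self, ids):
        # Drop the special tokens when mapping indices back to words.
        words = [self.vocab[i] for i in ids]
        return " ".join(
            w for w in words if not (w.startswith("<") and w.endswith(">"))
        )


tok = Tokenizer(["place", "blue", "at", "f", "two", "now"])
ids = tok.encode("place blue at f two now")
print(tok.decode(ids))  # round-trips back to the original sentence
```

The `<pad>` token lets batches of variable-length transcripts be padded to a common length, while `<sos>`/`<eos>` mark sequence boundaries for the transformer decoder.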
To predict the transcripts of GRID corpus videos, put them in the `examples/videos` path, then simply run `inference.py`.
You can change the path, or run inference on a single video, by editing the last line of `inference.py`.
Important: remember to download our pre-trained models here, or create them by running `run.py`.
To train the models from scratch:
- Download the desired training videos from the GRID corpus, which can be found here. Make sure that you download the high-quality videos and the corresponding word alignments.
- Put the videos in the project directory according to the following path format: `videos/[speaker_id]/[video.mpg]`. Put the alignments according to the following path format: `alignments/[speaker_id]/[alignment.align]`.
- Set the `SPEAKERS` attribute in the `config.py` file to a list containing all the speaker IDs to train on.
- Run `preprocess.py`. This might take a while.
- Run `run.py`.
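For example, the `SPEAKERS` step could look like this in `config.py` (the IDs below are placeholders in GRID's `s1`, `s2`, … naming convention, and the exact format expected by the repo may differ):

```python
# config.py (excerpt) — placeholder speaker IDs; list the speakers
# whose videos and alignments you actually downloaded.
SPEAKERS = ["s1", "s2"]
```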
Before trying to run anything, please make sure to install all the packages listed below.
| Library | Command to Run | Minimal Version |
|---|---|---|
| NumPy | `pip install numpy` | 1.19.5 |
| matplotlib | `pip install matplotlib` | 3.3.4 |
| PyTorch | `pip install torch` | 1.1.10 |
| OpenCV | `pip install opencv-python` | 4.5.4 |
| dlib | `pip install dlib` | 19.22.1 |
| scikit-learn | `pip install scikit-learn` | 0.24.2 |
| tqdm | `pip install tqdm` | 4.62.3 |
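The table above can also be captured in a single requirements file (a hypothetical `requirements.txt`, not shipped with the repo; the version pins are taken directly from the table):

```text
numpy>=1.19.5
matplotlib>=3.3.4
torch>=1.1.10
opencv-python>=4.5.4
dlib>=19.22.1
scikit-learn>=0.24.2
tqdm>=4.62.3
```

With such a file in place, everything installs in one step via `pip install -r requirements.txt`.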


