A project in image segmentation and inpainting, extending the LaMa paper and the LaMa GitHub repository.
In this project we aim to augment LaMa. In the original paper, the input consists of a pair of a high-resolution image and a binary mask. We propose to auto-generate the input masks with a segmentation neural network, making the task fully automated.
We use the DeepLabV3 model to segment the images. The pre-trained model was trained on a subset of COCO train2017, on the 20 categories that are present in the Pascal VOC dataset.
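At its core, auto-generating a LaMa mask from the segmentation output reduces to taking the per-pixel argmax over the class logits and keeping the pixels of the chosen class. A minimal sketch (NumPy for illustration; the actual project consumes the PyTorch DeepLabV3 output, and `class_to_mask` is a hypothetical helper name):

```python
import numpy as np

def class_to_mask(logits: np.ndarray, class_idx: int) -> np.ndarray:
    """Turn (C, H, W) segmentation logits into a binary mask for one class.

    Pixels whose argmax over the class axis equals `class_idx` become 1,
    everything else 0 -- the mask format LaMa expects alongside the image.
    """
    labels = logits.argmax(axis=0)        # (H, W) per-pixel class ids
    return (labels == class_idx).astype(np.uint8)

# Toy example: 3 classes on a 2x2 image; only the top-left pixel is class 1
logits = np.zeros((3, 2, 2))
logits[1, 0, 0] = 5.0
mask = class_to_mask(logits, class_idx=1)
```

The same mask can then be fed to the inpainting stage in place of a hand-drawn one.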
Written by George Pisha and Men Yevgeniy.
Clone the repo:

```shell
git clone https://github.com/yevgm/Combining-segmentation-and-Inpainting
cd Combining-segmentation-and-Inpainting
```

The following links will download the data folders:
- Test dataset - contains test images for three classes (dog, bus, person), their manual segmentation masks, automatic segmentation masks, and the outputs
- LaMa fourier model - Pretrained LaMa model (same as the original)
Set up conda:

```shell
conda env create -f env.yml
conda activate seg_inpaint
pip install pyyaml==5.4.1
```

This will create a working environment named 'seg_inpaint'.
Test images and pretrained models can be downloaded via the links above
- To run the inpainting pipeline, run command (2)
- `-c` selects the class as an integer; the valid values for CHOSEN_CLASS can be listed with command (1)
- `-i` provides the full path to the input images; `./test_images` is the model input path
- `--lama-model-path` is the path to the pretrained lama-fourier model
- `--lama-model-name` is the filename of the model checkpoint
- If you wish to run the video temporal-consistency pipeline:
- You must use the config `./src/video_seg/video_config.yaml`
- The model will be trained on the video frames, which must be present in the 'output' directory in the repository root
- Then just run command (3)
```shell
# (1) print the available classes
python ./main.py -a print_cls
# (2) run the inpainting pipeline
python ./main.py -a inpaint -c CHOSEN_CLASS -i $(pwd)/test_images --lama-model-path $(pwd)/lama-fourier --lama-model-name best.ckpt
# (3) train the video temporal-consistency model
export PYTHONPATH=.
python ./src/video_seg/train.py
```
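Conceptually, command (2) is a two-stage loop over the input images: segment, then inpaint the segmented region. A schematic sketch with placeholder `segment`/`inpaint` stubs (the real pipeline calls DeepLabV3 and LaMa; every function body here is illustrative, not the project's code):

```python
import numpy as np

def segment(image: np.ndarray, cls: int) -> np.ndarray:
    """Placeholder for the DeepLabV3 stage: return a binary (H, W) mask."""
    mask = np.zeros(image.shape[:2], dtype=np.uint8)
    mask[4:12, 4:12] = 1                  # pretend the object sits here
    return mask

def inpaint(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Placeholder for the LaMa stage: fill masked pixels (here: with the
    mean color of the unmasked pixels)."""
    out = image.copy()
    out[mask == 1] = image[mask == 0].mean(axis=0)
    return out

def run_pipeline(images, cls):
    """Segment each image, then inpaint the masked region."""
    results = []
    for img in images:
        mask = segment(img, cls)
        results.append((mask, inpaint(img, mask)))
    return results

frames = [np.random.rand(16, 16, 3) for _ in range(2)]
outputs = run_pipeline(frames, cls=12)    # 12 is 'dog' in Pascal VOC
```

The real run additionally writes the masks to `input` and the inpainted images to `output`, as described below.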
After running the inpainting command (2), two directories will be created:
- input - which will include the original images alongside their semantic segmentation mask
- output - which will include the inpainted images
After running the temporal-consistency pipeline (3):
- results - will contain the output video
- logs - will include training logs
- a model checkpoint will be saved
To calculate the numerical results on the whole dataset:
- Download the test images from the link above
- Run the following command:

```shell
export PYTHONPATH=.
python src/segmentation_comparison.py -t ../../test_data_comparison
```

This will calculate the LPIPS distance between each original image and its inpainted counterpart for every class, for both semantic-segmentation (auto) and manual mask generation.
| Class | Manual (LPIPS) | Segmentation (LPIPS) |
|---|---|---|
| dog | 0.1261 | 0.1314 |
| bus | 0.1013 | 0.1018 |
| person | 0.1626 | 0.1584 |
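Each table entry is a per-class average of per-image LPIPS distances. Assuming the per-image distances have already been computed (e.g. with the `lpips` package), the aggregation can be sketched as follows (`mean_lpips_per_class` is a hypothetical helper, not the project's API):

```python
from collections import defaultdict

def mean_lpips_per_class(records):
    """records: iterable of (class_name, lpips_distance) pairs.
    Returns {class_name: mean distance}, the quantity reported above."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cls, dist in records:
        sums[cls] += dist
        counts[cls] += 1
    return {cls: sums[cls] / counts[cls] for cls in sums}

# Toy per-image scores
scores = [("dog", 0.12), ("dog", 0.14), ("bus", 0.10)]
averages = mean_lpips_per_class(scores)
```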
We also extend LaMa to video by building a pipeline that feeds it video frames. Additionally, to improve temporal consistency, we add an optional training step that uses internal learning and the Deep Image Prior concept to enforce consistency across frames.
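The internal-learning idea behind this step is that a small network is fit only to the frames of the video at hand, as in Deep Image Prior, so its reconstructions are naturally smoother across frames. A toy PyTorch sketch (the architecture, input, and loss are illustrative, not the project's actual model):

```python
import torch
import torch.nn as nn

# Tiny conv net fit to the frames of a single video (internal learning)
net = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 3, 3, padding=1),
)
frames = torch.rand(4, 3, 16, 16)        # 4 toy video frames
noise = torch.rand_like(frames)          # fixed random input, as in DIP
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

for step in range(100):                  # fit the net to this video only
    opt.zero_grad()
    out = net(noise)
    loss = ((out - frames) ** 2).mean()  # per-frame reconstruction loss
    loss.backward()
    opt.step()

restored = net(noise).detach()           # reconstructed frames
```

Because one network produces every frame, artifacts that flicker frame-to-frame in the raw inpainted output tend to be suppressed in its reconstructions.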
```shell
python ./main.py -a inpaint -c CHOSEN_CLASS -i $(pwd)/test_images --lama-model-path $(pwd)/lama-fourier --lama-model-name best.ckpt
```

Then run:

```shell
python ./src/video_seg/train.py -i ./video_imgs
```

This project is licensed under the MIT License - see the LICENSE.md file for details.



