Clone this repo.
```bash
git clone https://github.com/jose13579/variable-hyperparameter-image-impainting.git
cd variable-hyperparameter-image-impainting/
```
Our project is built on PyTorch and standard Python libraries. To train and test it, we suggest creating a Conda environment from the provided YAML, e.g.
```bash
conda env create -f environment.yml
conda activate vhii
```
Alternatively, use the Dockerfile to install Conda and the prerequisites, e.g.
```bash
docker build -t vhii-mage .
```
If you have issues installing the environment, we suggest the following steps:
- Remove cupy from environment.yml.
- Create the Conda environment.
- Manually install cupy with the correct version:
```bash
conda install -c conda-forge cupy==7.7.0
```
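Once the environment is ready, a quick import check confirms the installation. This is a minimal sketch of ours (not part of the repo), assuming the environment provides torch and cupy:

```python
# check_env.py -- minimal sanity check for the vhii environment (our sketch,
# not part of the repo; assumes the YAML installs torch and cupy).
import torch
import cupy

print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("CuPy:", cupy.__version__)  # should print 7.7.0 after the manual install
```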
Next, we prepare the image and mask datasets.
Preparing the Places365 Dataset. The dataset can be downloaded from here. The training set has approximately 1.8 million images from 365 scene categories, with at most 5,000 images per category. We adopt the Places365-Standard dataset (small-image 256 × 256 version) to train and validate our proposed method. The dataset should be arranged in the following directory structure:
```
datasets
 |- places365
     |- data_256
         |- a
             |- airport_terminal
                 |- <image_id>.jpg
                 |- <image_id>.jpg
             |- airplane_cabin
                 |- <image_id>.jpg
                 |- <image_id>.jpg
             |- ...
         |- ...
     |- val_256
         |- <image_id>.jpg
         |- <image_id>.jpg
```
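As a quick sanity check that the layout matches the tree above, a small script of ours (hypothetical, not part of the repo) can count the images per split:

```python
# check_places365.py -- our sketch to verify the Places365 layout shown above.
from pathlib import Path

root = Path("datasets/places365")
n_train = sum(1 for _ in (root / "data_256").rglob("*.jpg"))  # walks letter/category dirs
n_val = sum(1 for _ in (root / "val_256").glob("*.jpg"))
print(f"train images: {n_train}")  # ~1.8 million expected
print(f"val images:   {n_val}")
```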
Preparing the CelebA Dataset. The dataset can be downloaded from here. This dataset contains more than 200,000 celebrity face images. We adopt 162,770 images for training and 19,961 for testing. The dataset should be arranged in the following directory structure:
```
datasets
 |- celeba_dataset
     |- train
         |- <image_id>.jpg
         |- <image_id>.jpg
     |- val
         |- <image_id>.jpg
         |- <image_id>.jpg
     |- test
         |- <image_id>.jpg
         |- <image_id>.jpg
```
Preparing the Paris Street View (PSV) Dataset. The dataset can be downloaded from here. The training and test sets include 14,900 and 100 images, respectively. This dataset was collected from street views of Paris and captures a large number of buildings and structural elements, such as windows and doors. The dataset should be arranged in the following directory structure:
```
datasets
 |- psv_dataset
     |- train
         |- <image_id>.jpg
         |- <image_id>.jpg
     |- test
         |- <image_id>.jpg
         |- <image_id>.jpg
```
Preparing the Mask Dataset. The dataset can be downloaded from here. This mask dataset contains 12,000 irregular masks grouped into six intervals according to the ratio of mask area to total image size, with 2,000 masks per interval. We employ three intervals (20-30%, 30-40%, and 40-50%) for testing. The dataset should be arranged in the following directory structure:
```
datasets
 |- test_mask
     |- mask
         |- testing_mask_dataset
             |- 10-20
                 |- <mask_id>.png
                 |- <mask_id>.png
             |- 20-30
                 |- <mask_id>.png
                 |- <mask_id>.png
             |- ...
```
Alternatively, you can use the dataset already grouped into one directory per interval here.
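The interval names refer to the fraction of masked pixels in each mask. The following sketch (ours, assuming the masks are binary PNGs whose nonzero pixels mark the hole) computes that ratio for one mask, which should fall inside its directory's interval:

```python
# mask_ratio.py -- our sketch: compute the masked-area ratio that defines
# the 20-30%, 30-40%, ... intervals (assumes nonzero pixels mark the hole).
import numpy as np
from PIL import Image

mask_path = "datasets/test_mask/mask/testing_mask_dataset/20-30/..."  # replace with an actual mask file
mask = np.array(Image.open(mask_path).convert("L"))
ratio = (mask > 0).mean()           # fraction of masked pixels
print(f"masked area: {ratio:.1%}")  # should lie within 20-30% here
```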
The model trained on Places365 can be downloaded here: places.
| Model | Seed | Mask set | FID ↓ | LPIPS ↓ | PSNR ↑ | SSIM ↑ | Model Size (MB) | FLOPs (G) | # Params (M) | Config |
|---|---|---|---|---|---|---|---|---|---|---|
| VHII efficient | 0 | 20-30 | 1.1783 | 0.0649 | 26.4769 | 0.8922 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 0 | 30-40 | 2.3969 | 0.0995 | 24.1554 | 0.8368 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 0 | 40-50 | 4.6187 | 0.1404 | 22.3163 | 0.7751 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 42 | 20-30 | 1.1806 | 0.0650 | 26.4727 | 0.8922 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 42 | 30-40 | 2.4146 | 0.0996 | 24.1472 | 0.8366 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 42 | 40-50 | 4.6141 | 0.1405 | 22.3215 | 0.7750 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 123 | 20-30 | 1.1767 | 0.0650 | 26.4536 | 0.8920 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 123 | 30-40 | 2.4125 | 0.0998 | 24.1337 | 0.8366 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 123 | 40-50 | 4.6877 | 0.1407 | 22.3123 | 0.7749 | 71 | 150.272 | 17.552 | config |
The model trained on CelebA can be downloaded here: celeba.
| Model | Seed | Mask set | FID ↓ | LPIPS ↓ | PSNR ↑ | SSIM ↑ | Model Size (MB) | FLOPs (G) | # Params (M) | Config |
|---|---|---|---|---|---|---|---|---|---|---|
| VHII efficient | 0 | 20-30 | 0.7854 | 0.0330 | 31.3488 | 0.9415 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 0 | 30-40 | 1.3521 | 0.0490 | 28.7055 | 0.9096 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 0 | 40-50 | 2.2800 | 0.0686 | 26.5571 | 0.8727 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 42 | 20-30 | 0.7714 | 0.0329 | 31.3497 | 0.9416 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 42 | 30-40 | 1.3552 | 0.0491 | 28.6867 | 0.9096 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 42 | 40-50 | 2.2400 | 0.0684 | 26.5921 | 0.8729 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 123 | 20-30 | 0.7822 | 0.0330 | 31.3313 | 0.9415 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 123 | 30-40 | 1.3489 | 0.0491 | 28.7198 | 0.9097 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 123 | 40-50 | 2.2413 | 0.0685 | 26.5672 | 0.8728 | 71 | 150.272 | 17.552 | config |
The model trained on PSV can be downloaded here: psv.
| Model | Seed | Mask set | FID ↓ | LPIPS ↓ | PSNR ↑ | SSIM ↑ | Model Size (MB) | FLOPs (G) | # Params (M) | Config |
|---|---|---|---|---|---|---|---|---|---|---|
| VHII efficient | 0 | 20-30 | 24.9343 | 0.0535 | 29.9719 | 0.9146 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 0 | 30-40 | 35.9012 | 0.0787 | 27.6982 | 0.8719 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 0 | 40-50 | 46.5952 | 0.1118 | 25.7796 | 0.8209 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 42 | 20-30 | 26.6362 | 0.0542 | 29.8541 | 0.9137 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 42 | 30-40 | 35.4199 | 0.0802 | 27.6240 | 0.8699 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 42 | 40-50 | 47.6322 | 0.1138 | 25.7706 | 0.8187 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 123 | 20-30 | 26.0129 | 0.0568 | 29.4361 | 0.9110 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 123 | 30-40 | 35.1132 | 0.0794 | 27.7138 | 0.8741 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 123 | 40-50 | 47.5585 | 0.1111 | 25.9231 | 0.8231 | 71 | 150.272 | 17.552 | config |
The trained models with different channel configurations can be downloaded:
| Model | Seed | Mask set | FID ↓ | LPIPS ↓ | PSNR ↑ | SSIM ↑ | Model Size (MB) | FLOPs (G) | # Params (M) | Config |
|---|---|---|---|---|---|---|---|---|---|---|
| VHII efficient 256-128-64-32 | 0 | 20-30 | 0.9393 | 0.0359 | 31.0091 | 0.9391 | 69 | 145.54 | 16.975 | config |
| VHII efficient 256-128-64-32 | 0 | 30-40 | 1.6504 | 0.0532 | 28.4032 | 0.9066 | 69 | 145.54 | 16.975 | config |
| VHII efficient 256-128-64-32 | 0 | 40-50 | 2.7938 | 0.0742 | 26.2939 | 0.8694 | 69 | 145.54 | 16.975 | config |
| VHII efficient 128-64-32-16 | 0 | 20-30 | 1.3365 | 0.0418 | 30.2689 | 0.9323 | 21 | 46.412 | 4.868 | config |
| VHII efficient 128-64-32-16 | 0 | 30-40 | 2.0768 | 0.0620 | 27.7036 | 0.8971 | 21 | 46.412 | 4.868 | config |
| VHII efficient 128-64-32-16 | 0 | 40-50 | 4.1076 | 0.0861 | 25.6279 | 0.8572 | 21 | 46.412 | 4.868 | config |
| VHII efficient 64-32-16-8 | 0 | 20-30 | 2.0768 | 0.0509 | 29.4559 | 0.9208 | 7.4 | 20.12 | 1.655 | config |
| VHII efficient 64-32-16-8 | 0 | 30-40 | 3.837 | 0.0751 | 26.9152 | 0.8860 | 7.4 | 20.12 | 1.655 | config |
| VHII efficient 64-32-16-8 | 0 | 40-50 | 6.7535 | 0.1033 | 24.8644 | 0.8432 | 7.4 | 20.12 | 1.655 | config |
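The Model Size (MB) and # Params (M) columns can be reproduced from any downloaded checkpoint. Below is a generic PyTorch sketch of ours; it assumes the .pth file is a plain state_dict of tensors, so adjust the indexing if the checkpoint nests one under a key:

```python
# count_params.py -- our generic sketch to reproduce the "# Params (M)" and
# "Model Size (MB)" columns from a checkpoint file.
import os
import torch

ckpt_path = "trained_models/celeba/celeba_VHII_efficient/gen_00050.pth"
ckpt = torch.load(ckpt_path, map_location="cpu")
# Assumption: ckpt is a flat state_dict; if it wraps one under a key,
# index into that entry first.
n_params = sum(t.numel() for t in ckpt.values() if torch.is_tensor(t))
print(f"# params:   {n_params / 1e6:.3f} M")                       # table reports 17.552 M
print(f"model size: {os.path.getsize(ckpt_path) / 2**20:.0f} MB")  # table reports 71 MB
```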
Once the dataset is prepared, new models can be trained with the following command:
```bash
bash run_train.sh --train_config_file
```
For example:
```bash
bash run_train.sh configs/psv_proposal_efficient_128_64_32_16_channels.json
```
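The training hyperparameters (e.g., the channel configuration named in the file) live in these JSON configs; a quick way of ours to inspect one before launching:

```python
# inspect_config.py -- our sketch: pretty-print a training config, since the
# configs are plain JSON files.
import json

with open("configs/psv_proposal_efficient_128_64_32_16_channels.json") as f:
    print(json.dumps(json.load(f), indent=2))
```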
To test the models:
- Download the trained models and save them in trained_models/.
- Run the test bash script to evaluate the trained model:
```bash
bash run_test_dataset.sh --model_name --model_path --seed --gt_dataset_path --mask_dataset_path --output_dataset_path
```
For example:
```bash
bash run_test_dataset.sh "VHII_efficient" "trained_models/celeba/celeba_VHII_efficient/gen_00050.pth" 0 "/data/celeba/celeba_dataset/test/" "/data/pconv/test_mask/20-30/" "test_output_datasets/trained_celeba_VHII_efficient_seed_0/output_images"
```
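To reproduce the full result tables above, the same command can be looped over the three seeds and mask intervals. A convenience sketch of ours (paths follow the example above; adjust them to your setup):

```python
# run_all_tests.py -- our sketch: loop run_test_dataset.sh over the seeds and
# mask intervals used in the result tables (paths follow the example above).
import subprocess

model = "trained_models/celeba/celeba_VHII_efficient/gen_00050.pth"
gt = "/data/celeba/celeba_dataset/test/"
for seed in (0, 42, 123):
    for interval in ("20-30", "30-40", "40-50"):
        out = f"test_output_datasets/trained_celeba_VHII_efficient_seed_{seed}/{interval}"
        subprocess.run(
            ["bash", "run_test_dataset.sh", "VHII_efficient", model, str(seed),
             gt, f"/data/pconv/test_mask/{interval}/", out],
            check=True,
        )
```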
The output inpainted images are saved in test_output_datasets/.
To measure the quantitative results:
```bash
cd metrics
bash run_metrics.sh --gt_dataset_path --output_dataset_path
```
For example:
```bash
bash run_metrics.sh "/data/celeba/celeba_dataset/test/" "/config/variable-hyperparameter-image-impainting/test_output_datasets/trained_celeba_VHII_efficient_seed_0"
```
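run_metrics.sh reports FID, LPIPS, PSNR, and SSIM. As an independent cross-check of the last two, here is a minimal sketch of ours using scikit-image (it assumes ground-truth and output files share names; adjust the paths to your data):

```python
# quick_metrics.py -- our minimal PSNR/SSIM cross-check using scikit-image.
from pathlib import Path

import numpy as np
from PIL import Image
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

gt_dir = Path("/data/celeba/celeba_dataset/test/")        # adjust
out_dir = Path("test_output_datasets/.../output_images")  # adjust
psnrs, ssims = [], []
for gt_path in sorted(gt_dir.iterdir()):
    gt = np.array(Image.open(gt_path).convert("RGB"))
    out = np.array(Image.open(out_dir / gt_path.name).convert("RGB"))  # matching names assumed
    psnrs.append(peak_signal_noise_ratio(gt, out, data_range=255))
    ssims.append(structural_similarity(gt, out, channel_axis=-1, data_range=255))
print(f"PSNR: {np.mean(psnrs):.4f}   SSIM: {np.mean(ssims):.4f}")
```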
To run inference on a single image:
```bash
bash run_test_image.sh --model_name --model_path --input_path --mask_path --output_path --output_name
```
For example:
```bash
bash run_test_image.sh "VHII_efficient" trained_models/celeba/celeba_VHII_efficient/gen_00050.pth "examples/img/100_000100_gt.png" "examples/mask/100_000100_mask.png" "examples/output" "100_000100_output"
```
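For intuition about what the model receives, the input image can be composited with the mask before inference. A sketch of ours using the bundled example files (assuming nonzero mask pixels mark the region to inpaint):

```python
# make_masked_input.py -- our sketch: visualize the masked input the model
# must fill, using the bundled example files (assumes nonzero = hole).
import numpy as np
from PIL import Image

img = np.array(Image.open("examples/img/100_000100_gt.png").convert("RGB"))
hole = np.array(Image.open("examples/mask/100_000100_mask.png").convert("L")) > 0
masked = img.copy()
masked[hole] = 255  # blank out the region the model will inpaint
Image.fromarray(masked).save("examples/output/100_000100_masked.png")
```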
If you have problems testing our model (e.g., on CelebA), we suggest the following steps:
- Use the images from the examples/ directory, which contains samples you can use to test the proposed model.
- Download the CelebA model here and put it inside the directory "trained_models/celeba/celeba_VHII_efficient".
- Create the Docker image:
```bash
docker build -t vhii-mage .
```
- Create a container:
```bash
nvidia-docker run --userns=host -it --rm --name vhii-repository -v /work/data/:/data -v /work/code/:/code vhii-mage bash
```
- Inside the Docker container, run the test command:
```bash
bash run_test_image.sh "VHII_efficient" trained_models/celeba/celeba_VHII_efficient/best_model_celeba.pth "examples/img/100_000100_gt.png" "examples/mask/100_000100_mask.png" "examples/output" "100_000100_output"
```
```
@article{Campana2023_Inpainting,
  author={J.L.F. Campana and L.G.L. Decker and M.R. Souza and H.A. Maia and H. Pedrini},
  title={Variable-Hyperparameter Visual Transformer for Efficient Image Inpainting},
  journal={Computers \& Graphics},
  year={2023}
}
```
If you have any questions or suggestions about this paper, feel free to contact me (j209820@dac.unicamp.br).




