Variable Hyperparameter Efficient Visual Transformer for Image Inpainting

VHII

Installation

Clone this repo.

git clone https://github.com/jose13579/variable-hyperparameter-image-impainting.git
cd variable-hyperparameter-image-impainting/

Our project is built on Python and PyTorch. To train and test it, we suggest creating a Conda environment from the provided YAML, e.g.

conda env create -f environment.yml 
conda activate vhii

Or use the Dockerfile to install Conda and the prerequisites, e.g.

docker build -t vhii-mage .

If you have issues installing the environment, we suggest the following steps:

  • Remove cupy from environment.yml
  • Create the Conda environment
  • Manually install cupy with the correct version:
conda install -c conda-forge cupy==7.7.0

Dataset Preparation

We prepare both the image datasets and the mask datasets.

Preparing Places365 Dataset. The dataset can be downloaded from here. The training set has approximately 1.8 million images from 365 scene categories, with at most 5,000 images per category. We adopt the Places365-Standard dataset (small images, 256 × 256 version) to train and validate our proposed method. The dataset should be arranged in the same directory structure as

datasets
    |- places365
        |- data_256
            |- a
                |- airport_terminal
                    |- <image_id>.jpg
                    |- <image_id>.jpg
                |- airplane_cabin
                    |- <image_id>.jpg
                    |- <image_id>.jpg
                |- ...
            |- ...
        |- val_256
            |- <image_id>.jpg
            |- <image_id>.jpg
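As a quick sanity check after downloading, you can count the images under each split; this is a minimal sketch of our own (the helper name and the example paths are assumptions based on the layout above, not part of the repo):

```python
import os

def count_images(root, exts=(".jpg", ".jpeg", ".png")):
    """Recursively count image files under root (hypothetical helper)."""
    total = 0
    for _dirpath, _dirs, filenames in os.walk(root):
        total += sum(name.lower().endswith(exts) for name in filenames)
    return total

# Paths follow the layout above (adjust to your setup):
# print(count_images("datasets/places365/data_256"))  # ~1.8 M expected
# print(count_images("datasets/places365/val_256"))
```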

Preparing CelebA Dataset. The dataset can be downloaded from here. This dataset contains more than 200,000 large-scale celebrity face images. We adopt 162,770 images for training and 19,961 for testing. The dataset should be arranged in the same directory structure as

datasets
    |- celeba_dataset
        |- train
           |- <image_id>.jpg
           |- <image_id>.jpg
        |- val
           |- <image_id>.jpg
           |- <image_id>.jpg
        |- test
           |- <image_id>.jpg
           |- <image_id>.jpg

Preparing Paris Street View (PSV) Dataset. The dataset can be downloaded from here. The training and test sets include 14,900 and 100 images, respectively. This dataset was collected from street views of Paris and captures a large number of buildings and structural information, such as windows and doors. The dataset should be arranged in the same directory structure as

datasets
    |- psv_dataset
        |- train
           |- <image_id>.jpg
           |- <image_id>.jpg
        |- test
           |- <image_id>.jpg
           |- <image_id>.jpg

Preparing Mask Dataset. The dataset can be downloaded from here. This mask dataset contains 12,000 irregular masks grouped into six intervals according to the ratio of mask area to total image size, where each interval has 2,000 masks. We employ three intervals (20-30%, 30-40%, and 40-50%) for testing. The dataset should be arranged in the same directory structure as

datasets
    |- test_mask
        |- mask
            |- testing_mask_dataset
                |- 10-20
                    |- <mask_id>.png
                    |- <mask_id>.png                   
                |- 20-30
                    |- <mask_id>.png
                    |- <mask_id>.png
                |- ...

Or you can use the dataset grouped into directories per interval here.
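The interval names above correspond to the fraction of masked pixels in each mask; a small sketch of how a mask could be binned (our own illustration, not repository code):

```python
import numpy as np

def mask_interval(mask):
    """Map a binary mask (nonzero = masked pixel) to its area interval,
    e.g. a 25%-covered mask -> "20-30". Hypothetical helper."""
    ratio = np.count_nonzero(mask) / mask.size  # masked-area fraction
    lo = int(ratio * 100) // 10 * 10            # floor to the decade
    return f"{lo}-{lo + 10}"

# A 256x256 mask whose top quarter is masked covers 25% of the image
m = np.zeros((256, 256), dtype=np.uint8)
m[:64, :] = 255
print(mask_interval(m))  # -> 20-30
```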

Results and Models for VHII efficient

Places val

The trained model can be downloaded: places

| Model | Seed | Mask set | FID ↓ | LPIPS ↓ | PSNR ↑ | SSIM ↑ | Model Size (MB) | FLOPS (G) | # Params (M) | Config |
|-------|------|----------|-------|---------|--------|--------|-----------------|-----------|--------------|--------|
| VHII efficient | 0 | 20-30 | 1.1783 | 0.0649 | 26.4769 | 0.8922 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 0 | 30-40 | 2.3969 | 0.0995 | 24.1554 | 0.8368 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 0 | 40-50 | 4.6187 | 0.1404 | 22.3163 | 0.7751 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 42 | 20-30 | 1.1806 | 0.0650 | 26.4727 | 0.8922 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 42 | 30-40 | 2.4146 | 0.0996 | 24.1472 | 0.8366 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 42 | 40-50 | 4.6141 | 0.1405 | 22.3215 | 0.7750 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 123 | 20-30 | 1.1767 | 0.0650 | 26.4536 | 0.8920 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 123 | 30-40 | 2.4125 | 0.0998 | 24.1337 | 0.8366 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 123 | 40-50 | 4.6877 | 0.1407 | 22.3123 | 0.7749 | 71 | 150.272 | 17.552 | config |

Celeba test

The trained model can be downloaded: celeba.

| Model | Seed | Mask set | FID ↓ | LPIPS ↓ | PSNR ↑ | SSIM ↑ | Model Size (MB) | FLOPS (G) | # Params (M) | Config |
|-------|------|----------|-------|---------|--------|--------|-----------------|-----------|--------------|--------|
| VHII efficient | 0 | 20-30 | 0.7854 | 0.0330 | 31.3488 | 0.9415 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 0 | 30-40 | 1.3521 | 0.0490 | 28.7055 | 0.9096 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 0 | 40-50 | 2.2800 | 0.0686 | 26.5571 | 0.8727 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 42 | 20-30 | 0.7714 | 0.0329 | 31.3497 | 0.9416 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 42 | 30-40 | 1.3552 | 0.0491 | 28.6867 | 0.9096 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 42 | 40-50 | 2.2400 | 0.0684 | 26.5921 | 0.8729 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 123 | 20-30 | 0.7822 | 0.0330 | 31.3313 | 0.9415 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 123 | 30-40 | 1.3489 | 0.0491 | 28.7198 | 0.9097 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 123 | 40-50 | 2.2413 | 0.0685 | 26.5672 | 0.8728 | 71 | 150.272 | 17.552 | config |

PSV test

The trained model can be downloaded: psv.

| Model | Seed | Mask set | FID ↓ | LPIPS ↓ | PSNR ↑ | SSIM ↑ | Model Size (MB) | FLOPS (G) | # Params (M) | Config |
|-------|------|----------|-------|---------|--------|--------|-----------------|-----------|--------------|--------|
| VHII efficient | 0 | 20-30 | 24.9343 | 0.0535 | 29.9719 | 0.9146 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 0 | 30-40 | 35.9012 | 0.0787 | 27.6982 | 0.8719 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 0 | 40-50 | 46.5952 | 0.1118 | 25.7796 | 0.8209 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 42 | 20-30 | 26.6362 | 0.0542 | 29.8541 | 0.9137 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 42 | 30-40 | 35.4199 | 0.0802 | 27.6240 | 0.8699 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 42 | 40-50 | 47.6322 | 0.1138 | 25.7706 | 0.8187 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 123 | 20-30 | 26.0129 | 0.0568 | 29.4361 | 0.9110 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 123 | 30-40 | 35.1132 | 0.0794 | 27.7138 | 0.8741 | 71 | 150.272 | 17.552 | config |
| VHII efficient | 123 | 40-50 | 47.5585 | 0.1111 | 25.9231 | 0.8231 | 71 | 150.272 | 17.552 | config |

Results and Models for VHII efficient (varying channel configurations)

Celeba test

The trained models can be downloaded:

| Model | Seed | Mask set | FID ↓ | LPIPS ↓ | PSNR ↑ | SSIM ↑ | Model Size (MB) | FLOPS (G) | # Params (M) | Config |
|-------|------|----------|-------|---------|--------|--------|-----------------|-----------|--------------|--------|
| VHII efficient 256-128-64-32 | 0 | 20-30 | 0.9393 | 0.0359 | 31.0091 | 0.9391 | 69 | 145.54 | 16.975 | config |
| VHII efficient 256-128-64-32 | 0 | 30-40 | 1.6504 | 0.0532 | 28.4032 | 0.9066 | 69 | 145.54 | 16.975 | config |
| VHII efficient 256-128-64-32 | 0 | 40-50 | 2.7938 | 0.0742 | 26.2939 | 0.8694 | 69 | 145.54 | 16.975 | config |
| VHII efficient 128-64-32-16 | 0 | 20-30 | 1.3365 | 0.0418 | 30.2689 | 0.9323 | 21 | 46.412 | 4.868 | config |
| VHII efficient 128-64-32-16 | 0 | 30-40 | 2.0768 | 0.0620 | 27.7036 | 0.8971 | 21 | 46.412 | 4.868 | config |
| VHII efficient 128-64-32-16 | 0 | 40-50 | 4.1076 | 0.0861 | 25.6279 | 0.8572 | 21 | 46.412 | 4.868 | config |
| VHII efficient 64-32-16-8 | 0 | 20-30 | 2.0768 | 0.0509 | 29.4559 | 0.9208 | 7.4 | 20.12 | 1.655 | config |
| VHII efficient 64-32-16-8 | 0 | 30-40 | 3.837 | 0.0751 | 26.9152 | 0.8860 | 7.4 | 20.12 | 1.655 | config |
| VHII efficient 64-32-16-8 | 0 | 40-50 | 6.7535 | 0.1033 | 24.8644 | 0.8432 | 7.4 | 20.12 | 1.655 | config |

Training New Models

Once the dataset is prepared, new models can be trained with the following command:

bash run_train.sh <train_config_file>

For example:

bash run_train.sh configs/psv_proposal_efficient_128_64_32_16_channels.json
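The training hyperparameters live in these JSON config files; we have not documented their schema here, so the sketch below only loads one and lists its top-level keys (the helper name is ours; the filename is taken from the example above):

```python
import json

def load_train_config(path):
    """Read a training config JSON (schema defined by the repo)."""
    with open(path) as f:
        return json.load(f)

# cfg = load_train_config("configs/psv_proposal_efficient_128_64_32_16_channels.json")
# print(sorted(cfg))  # top-level hyperparameter names
```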

Testing

To test the models:

  1. Download the trained models and save them in trained_models/.

  2. Run the test bash file to evaluate/test the trained model.

bash run_test_dataset.sh <model_name> <model_path> <seed> <gt_dataset_path> <mask_dataset_path> <output_dataset_path>

For example:

bash run_test_dataset.sh "VHII_efficient" "trained_models/celeba/celeba_VHII_efficient/gen_00050.pth" 0 "/data/celeba/celeba_dataset/test/" "/data/pconv/test_mask/20-30/" "test_output_datasets/trained_celeba_VHII_efficient_seed_0/output_images"

The output inpainted images are saved in test_output_datasets/.
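Inpainting outputs are typically composited so that pixels outside the hole come straight from the input image; a minimal sketch of that blend (our own illustration, not necessarily how this repo composites its outputs):

```python
import numpy as np

def composite(original, generated, mask):
    """Blend: keep original pixels where mask == 0, use generated
    content where mask != 0. All arrays share the same shape; mask is
    binary {0, 1}. Illustrative only."""
    mask = mask.astype(original.dtype)
    return generated * mask + original * (1 - mask)

orig = np.full((4, 4), 100, dtype=np.float32)
gen  = np.full((4, 4), 200, dtype=np.float32)
m    = np.zeros((4, 4), dtype=np.float32)
m[:2, :] = 1                       # top half is the hole
out = composite(orig, gen, m)
print(out[0, 0], out[3, 3])        # -> 200.0 100.0
```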

Metrics

To measure the quantitative results:

cd metrics
bash run_metrics.sh --gt_dataset_path --output_dataset_path

For example:

bash run_metrics.sh "/data/celeba/celeba_dataset/test/" "/config/variable-hyperparameter-image-impainting/test_output_datasets/trained_celeba_VHII_efficient_seed_0"
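Of the four metrics, PSNR is the simplest to reproduce by hand, since it follows directly from the mean squared error; a minimal NumPy sketch (not the repository's metrics code):

```python
import numpy as np

def psnr(gt, pred, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two uint8-range images."""
    mse = np.mean((gt.astype(np.float64) - pred.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

a = np.full((8, 8), 120.0)
b = np.full((8, 8), 130.0)         # constant error of 10 -> MSE = 100
print(round(psnr(a, b), 2))        # -> 28.13
```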

Image Demo

To run inference on a single image:

bash run_test_image.sh <model_name> <model_path> <input_path> <mask_path> <output_path> <output_name>

For example:

bash run_test_image.sh "VHII_efficient" trained_models/celeba/celeba_VHII_efficient/gen_00050.pth "examples/img/100_000100_gt.png" "examples/mask/100_000100_mask.png" "examples/output" "100_000100_output"

Recommendation

If you have problems testing our model (e.g., on CelebA), we suggest the following steps:

  • Use the images from the examples/ directory, which contains examples you can use to test the proposed model
  • Download the CelebA model here and put it inside the directory "trained_models/celeba/celeba_VHII_efficient"
  • Build the Docker image
docker build -t vhii-mage .
  • Create a container
nvidia-docker run --userns=host -it --rm --name vhii-repository -v /work/data/:/data -v /work/code/:/code vhii-mage bash
  • Inside the Docker container, run the test command
bash run_test_image.sh "VHII_efficient" trained_models/celeba/celeba_VHII_efficient/best_model_celeba.pth "examples/img/100_000100_gt.png" "examples/mask/100_000100_mask.png" "examples/output" "100_000100_output"

Visualization

face1      Celeba dataset - example 1

face3      Celeba dataset - example 2

Citing VHII efficient

@article{Campana2023_Inpainting,
  author={J.L.F. Campana and L.G.L. Decker and M.R. Souza and H.A. Maia and H. Pedrini},
  title={Variable-Hyperparameter Visual Transformer for Efficient Image Inpainting},
  journal={Computers \& Graphics},
  year={2023}
}

Contact

If you have any questions or suggestions about this paper, feel free to contact me (j209820@dac.unicamp.br).
