- Datasets use OpenCV (`cv2`) to read and process images by default, but the main ones can also use Pillow (`PIL`) as an alternative. Some benchmarking comparisons between `cv2` and `PIL` can be found here.
- Datasets can read from image files OR from `.lmdb` databases for faster speed. Refer to IO-speed for some tips regarding data IO speed.
  - Note that when preparing the `.lmdb` database on Windows it is currently required to set `n_workers: 0` in the dataloader options, else there can be a `PermissionError` due to multiple processes accessing the image database.
- Images can be downsampled on the fly using a MATLAB-like `imresize` function. It adds a lot more variety to the training, but it is slower than other optimized downscaling algorithms like the `cv2` one. Implemented in `imresize.py`. For more information about why this is an important consideration, check here.
- It is also possible to add different kinds of augmentations to images on the fly during training. More information about the augmentations can be found here.
- `base_dataset.py`: implements a base class for datasets. It also includes common functions used by the other dataset files.
- `single_dataset`: includes a dataset class that can load a set of single images specified by the path `dataroot_*: /path/to/data`. It only reads single images (`LR`, `LQ`, `A`, etc.) in the test (inference) phase, where there is no `GT`/`B` image. It can be used for generating CycleGAN results for only one side of the cycle generators.
- `aligned_dataset`: a dataset class that can load image pairs from an image folder or lmdb files, with on-the-fly augmentation options. If only `HR`/`B` images are provided, or the specific configuration is set, it will generate the paired images on the fly. Used in the training and validation phases of paired-image cases (Super-Resolution, Super-Restoration, Pix2pix, etc.). It can work with either one path for each side of the pair (i.e., `dataroot_A: /path/to/dataA` and `dataroot_B: /path/to/dataB`) or a single image directory `dataroot_AB: /path/to/data`, which contains image pairs in the form {A,B}, like the original pix2pix datasets.
- `unaligned_dataset.py`: a dataset class that can load unaligned/unpaired datasets. It assumes two directories hosting training images, one from domain A (`dataroot_A: /path/to/dataA`) and one from domain B (`dataroot_B: /path/to/dataB`).
- `LRHR_seg_bg_dataset.py`: reads HR images and segmentations, and generates LR images and category. Used in the SFTGAN training and validation phases.
- `LRHRPBR_dataset.py`: experimental dataset for working with the PBR training model.
- `Vid_dataset.py`: experimental dataset for loading video datasets in the form of frames, in a directory containing one subdirectory per scene. Based on the structure of the REDS datasets.
- `DVD_dataset.py`: experimental dataset for loading video datasets, specifically for the interlaced video case. The interlaced frame is expected to be "combed" from the progressive pair. It will read triplets of interlaced and progressive frames.
- Prepare the images. You can find the links to download classical SR datasets (including BSD200, T91, General100; Set5, Set14, urban100, BSD100, manga109; historical) or the DIV2K dataset in datasets, or prepare your own dataset.
SFTGAN is trained on a part of the outdoor scenes.
- Download the OutdoorScene training dataset and OutdoorScene testing dataset from datasets. The training dataset is a little different from the one in the project page (e.g., image size and format).
- Generate the segmentation probability maps for the training and testing datasets using `codes/test_seg.py`.
- Put the images in a folder named `img` and put the segmentation .pth files in a folder named `bicseg`, as the following figure shows.
- Do the same for the validation (you can choose some images from the test folder) and test folders.
- Similar to the SR cases, you will find sample datasets for both paired and unpaired cases in datasets, or you can use your own datasets.
- In the case of Pix2pix training, the corresponding images in a pair {A,B} must be the same size and have the same filename, e.g., `/path/to/data/A/train/1.jpg` is considered to correspond to `/path/to/data/B/train/1.jpg`, and the size at which the network will use the images to train must be coordinated in the network configuration and the `load_size` option.
- For CycleGAN, you similarly need two directories that contain images from domains `A` and `B`. You should not expect the method to work on just any random combination of input and output datasets (e.g. `cats <-> keyboards`). From experiments, it works better if the two datasets share similar visual content. For example, `landscape painting <-> landscape photographs` works much better than `portrait painting <-> landscape photographs`. `zebras <-> horses` achieves compelling results while `cats <-> dogs` completely fails.
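As a hedged illustration (a hypothetical helper, not part of the codebase), a quick check that every image in `A/train` has a same-named counterpart in `B/train` could look like:

```python
import os

def find_unpaired(root_a, root_b):
    """Return filenames present in only one of the two directories.

    Hypothetical helper for Pix2pix-style data: every A image is
    expected to have a same-named B image.
    """
    names_a = set(os.listdir(root_a))
    names_b = set(os.listdir(root_b))
    return sorted(names_a ^ names_b)  # symmetric difference = unpaired
```

An empty result means the two folders are fully paired by filename; anything returned would need to be fixed before training.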
More details about the data configuration for image-to-image translation can be found here.
By default, random crop and random flip/rotation are used for data augmentation. However, multiple additional on-the-fly options are available. More information about dataset augmentation can be found here and here.
