🍎APPLE: Attribute-Preserving Pseudo-Labeling for Diffusion-Based Face Swapping

Jiwon Kang1 · Yeji Choi1 · JoungBin Lee1 · Wooseok Jang1 · Jinhyeok Choi1
Taekeun Kang2 · Yongjae Park2 · Myungin Kim2 · Seungryong Kim1

1KAIST AI   2SAMSUNG

Abstract

Face swapping aims to transfer the identity of a source face onto a target face while preserving target-specific attributes such as pose, expression, lighting, skin tone, and makeup. However, since real ground truth for face swapping is unavailable, achieving both accurate identity transfer and high-quality attribute preservation remains challenging. Recent diffusion-based approaches attempt to improve visual fidelity through conditional inpainting on masked target images, but the masked condition removes crucial appearance cues, resulting in plausible yet misaligned attributes due to the lack of explicit supervision. To address these limitations, we propose APPLE (Attribute-Preserving Pseudo-Labeling), a diffusion-based teacher–student framework that enhances attribute fidelity through attribute-aware pseudo-label supervision. We reformulate face swapping as a conditional deblurring task to more faithfully preserve target-specific attributes such as lighting, skin tone, and makeup. In addition, we introduce an attribute-aware inversion scheme to further improve detailed attribute preservation. Through an elaborate attribute-preserving design for teacher learning, APPLE produces high-quality pseudo triplets that explicitly provide the student with direct face-swapping supervision. Overall, APPLE achieves state-of-the-art performance in terms of attribute preservation and identity transfer, producing more photorealistic and target-faithful results.

For instructions in Korean, please refer to README_kor.md.

1. Project Overview

This document explains how to train and run inference with the Diffusion Model (Teacher Model).

2. Installation

2.1. Requirements

  • NVIDIA GPU
  • Anaconda (Conda)

2.2. Clone Repository

git clone https://github.com/your-repo/fluxswap.git
cd fluxswap

Note: <PROJECT_ROOT> refers to the absolute path of this fluxswap directory.

2.3. Conda Environment Setup

This project uses three Conda environments: 3DDFA_env, mediapipe, and faceswap_omini.

1. 3DDFA_env

  • Used for 3DMM landmark extraction (see 3.1.1).

2. mediapipe

  • Used for gaze landmark extraction (see 3.1.2).

conda env create --file preprocess/mediapipe.yaml

3. faceswap_omini

  • Used for final condition image generation, model training, and inference.
conda env create --file preprocess/faceswap_omini.yaml
conda activate faceswap_omini

# Install mmcv and mmsegmentation
pip install -e preprocess/mmcv
pip install -e preprocess/mmsegmentation

2.4. Dataset Preparation

  • VGGFace2-HQ: The main dataset used for training.
    • This document assumes the dataset is stored at a specific path (e.g., <VGGFACE2_HQ_PATH>).
  • FFHQ: Used for evaluation.
    • The FFHQ dataset consists of src and trg folders, each having a preprocessing structure similar to VGGFace2-HQ. (See 4.1. FFHQ Dataset Inference for detailed structure).

3. Training (Teacher Model)

3.0. VGGFace2-HQ Data Download

  • Download the dataset uploaded to HuggingFace.
  • Decompress the original folder before use.

3.1. Data Preprocessing

The VGGFace2-HQ dataset undergoes three preprocessing steps (3.1.1–3.1.3), followed by dataset filtering (3.1.4).

3.1.1. 3DMM Landmark Extraction

  • Conda Environment: 3DDFA_env (the environment provided by 3DDFA)
  • File to Modify: <PROJECT_ROOT>/preprocess/3DDFA-V3/demo_from_folder_jiwon_vgg.py
    • line 24: Modify to the VGGFace2-HQ dataset path (<VGGFACE2_HQ_PATH>).
  • Execution:
    • Single GPU:
      cd <PROJECT_ROOT>/preprocess/3DDFA-V3/
      ./run_vgg.sh
    • Multi-GPU:
      cd <PROJECT_ROOT>/preprocess/3DDFA-V3/
      ./run_vgg_multigpu.sh
  • Result: Saved in <VGGFACE2_HQ_PATH>/3dmm/ folder.

3.1.2. Gaze Landmark Extraction

  • Conda Environment: mediapipe
  • File to Modify: <PROJECT_ROOT>/preprocess/MediaPipe_Iris/inference.py
    • line 34, dataset_path: Modify to the VGGFace2-HQ dataset path (<VGGFACE2_HQ_PATH>).
  • Execution:
    • Single GPU:
      cd <PROJECT_ROOT>/preprocess/MediaPipe_Iris/
      ./inference.sh
    • Multi-GPU:
      cd <PROJECT_ROOT>/preprocess/MediaPipe_Iris/
      ./inference_torchrun.sh
  • Result: Saved in <VGGFACE2_HQ_PATH>/iris/ folder.

3.1.3. Final Condition Image Generation

  • Conda Environment: faceswap_omini
  • File to Modify: <PROJECT_ROOT>/preprocess/vgg_preprocess_seg_mask_gaze_multigpu_samsung.py
    • line 73, image_folder_path: Modify to the VGGFace2-HQ dataset path (<VGGFACE2_HQ_PATH>).
  • Execution:
    # Activate faceswap_omini environment
    conda activate faceswap_omini
    # Run script
    python <PROJECT_ROOT>/preprocess/vgg_preprocess_seg_mask_gaze_multigpu_samsung.py
  • Result: Saved in <VGGFACE2_HQ_PATH>/condition_blended_image_blurdownsample8_segGlass_landmark_iris folder.

3.1.4. Dataset Filtering

  • Compute LAION Aesthetics scores for the VGGFace2-HQ images in advance and use them for data filtering.
  • You can generate the score.json file with <PROJECT_ROOT>/preprocess/vgg_preprocess_score_multigpu.py.
  • An example score file is provided at <PROJECT_ROOT>/preprocess/score.json.

3.2. Model Training

  • Conda Environment: faceswap_omini
  • Config File: <PROJECT_ROOT>/train/config/baseline_vgg_0.35.yaml
    • netarc_path: Modify to the Arc2Face model path to be used.
    • dataset_path: Modify to the VGGFace2-HQ dataset path (<VGGFACE2_HQ_PATH>).
  • Execution:
    cd <PROJECT_ROOT>/train/script
    ./baseline_vgg.sh

4. Inference (Teacher Model)

  • Conda Environment: faceswap_omini
  • Checkpoint Used (Example): <PROJECT_ROOT>/checkpoints/teacher

4.1. FFHQ Dataset Inference

Example of inference on the FFHQ evaluation dataset.

  • base_path: Project root path (<PROJECT_ROOT>)
  • ffhq_base_path: Preprocessed FFHQ dataset path. Assumes the following structure:
    <FFHQ_BASE_PATH>/
    ├── src
    │   ├── 3dmm
    │   ├── condition_...
    │   ├── ...
    │   └── 000000.jpg
    └── trg
        ├── 3dmm
        ├── condition_...
        ├── ...
        └── 000000.jpg
    
  • id_guidance_scale: Higher values strengthen identity transfer from the source but may reduce attribute preservation. (Minimum value: 1.0)

Without Inversion

CUDA_VISIBLE_DEVICES=0,1,2 torchrun --standalone --nproc_per_node=3 pulid_omini_inference_ffhq_args_multigpu.py \
    --base_path <PROJECT_ROOT> \
    --ffhq_base_path <FFHQ_BASE_PATH> \
    --checkpoint_path <PROJECT_ROOT>/checkpoints/teacher \
    --guidance_scale 1.0 \
    --image_guidance_scale 1.0 \
    --id_guidance_scale 1.0 \
    --condition_type 'blur_landmark_iris'

With Inversion

CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --standalone --nproc_per_node=4 pulid_omini_inference_ffhq_inversion_args_multigpu.py \
    --base_path <PROJECT_ROOT> \
    --ffhq_base_path <FFHQ_BASE_PATH> \
    --checkpoint_path <PROJECT_ROOT>/checkpoints/teacher \
    --guidance_scale 1.0 \
    --image_guidance_scale 1.0 \
    --id_guidance_scale 1.0 \
    --condition_type 'blur_landmark_iris'

4.2. Pseudo Dataset Generation

Generate a pseudo dataset based on VGGFace2-HQ. The VGGFace2-HQ dataset must be preprocessed first (see 3.1).

  • Execution:
    • Run the <PROJECT_ROOT>/pulid_omini_dataset_gen_fluxpseudovgg_multigpu.sh shell script.
    • line 34, lora_file_path: Set the checkpoint path to use inside the script.

5. Inference (Student Model)

  • Conda Environment: faceswap_omini
  • Checkpoint Used (Example): <PROJECT_ROOT>/checkpoints/student
CUDA_VISIBLE_DEVICES=0,1,2 torchrun --standalone --nproc_per_node=3 pulid_omini_inference_ffhq_args_multigpu.py \
    --base_path <PROJECT_ROOT> \
    --ffhq_base_path <FFHQ_BASE_PATH> \
    --checkpoint_path <PROJECT_ROOT>/checkpoints/student \
    --guidance_scale 1.0 \
    --image_guidance_scale 1.0 \
    --id_guidance_scale 1.0

About

Official implementation of "🍎APPLE: Attribute-Preserving Pseudo-Labeling for Diffusion-Based Face Swapping"
