
High-fidelity and High-efficiency Talking Portrait Synthesis with Detail-aware Neural Radiance Fields

This repository provides a PyTorch implementation of the paper "High-fidelity and High-efficiency Talking Portrait Synthesis with Detail-aware Neural Radiance Fields" (IEEE TVCG 2024, DOI: 10.1109/TVCG.2024.3488960).

A self-driven generated video of our method: here

A cross-driven generated video of our method: here

Installation

Tested on Ubuntu 22.04, PyTorch 2.0.1, and CUDA 11.6.

git clone https://github.com/muyuWang/HHNeRF.git
cd HHNeRF

Install dependencies:

pip install -r requirements.txt

Data pre-processing

Our data preprocessing follows the previous works AD-NeRF, SSP-NeRF, and RAD-NeRF. We provide several HR videos at 900 × 900 resolution. During preprocessing, first downsample them to 450 × 450, then run the RAD-NeRF preprocessing pipeline on the downsampled frames (extract images, detect landmarks, face parsing, extract the background, estimate head poses, ...). With the extracted landmarks, crop patches from the eye regions and use a ResNet model to extract their features, as sketched below.
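The eye-feature step is not spelled out here, so the following is only a minimal sketch of how eye patches could be cropped from the 2D landmarks and encoded with a torchvision ResNet. The landmark indices, patch margin, resize resolution, and use of ResNet-18 are assumptions for illustration, not the exact pipeline used in the paper.

    # Hedged sketch: crop eye patches from 68-point landmarks and encode them
    # with an ImageNet-pretrained ResNet-18. The landmark indices (36-41 left
    # eye, 42-47 right eye), the margin, and the 64x64 resize are assumptions.
    import numpy as np
    import torch
    import torchvision
    from PIL import Image

    resnet = torchvision.models.resnet18(weights="IMAGENET1K_V1")
    resnet.fc = torch.nn.Identity()          # keep the 512-d pooled feature
    resnet.eval()

    preprocess = torchvision.transforms.Compose([
        torchvision.transforms.Resize((64, 64)),
        torchvision.transforms.ToTensor(),
        torchvision.transforms.Normalize([0.485, 0.456, 0.406],
                                         [0.229, 0.224, 0.225]),
    ])

    def crop_eye(img, lms, idx, margin=8):
        """Crop a square-ish patch around the landmarks selected by `idx`."""
        pts = lms[idx]
        x0, y0 = pts.min(0).astype(int) - margin
        x1, y1 = pts.max(0).astype(int) + margin
        return img.crop((x0, y0, x1, y1))

    @torch.no_grad()
    def extract_eye_feature(frame_path, lms_path):
        img = Image.open(frame_path).convert("RGB")
        lms = np.loadtxt(lms_path)                        # (68, 2) from *.lms
        left = crop_eye(img, lms, list(range(36, 42)))    # left-eye landmarks
        right = crop_eye(img, lms, list(range(42, 48)))   # right-eye landmarks
        batch = torch.stack([preprocess(left), preprocess(right)])
        feat = resnet(batch)                              # (2, 512) eye features
        return left, right, feat

The patches would be saved as 0_l.png / 0_r.png and the features as 0.pt to match the layout listed below.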

  • Finally, the file structure after finishing all steps should look like this:
    ./data/<ID>
    ├──<ID>.mp4 # original video
    ├──ori_imgs # original images from video
    │  ├──0.png
    │  ├──0.lms # 2D landmarks
    │  ├──...
    ├──hr_imgs # HR ground truth frames (static background)
    │  ├──0.jpg
    │  ├──...
    ├──eye_features # eye patches and features
    │  ├──0_l.png # left eye
    │  ├──0_r.png # right eye
    │  ├──0.pt # eye feature
    │  ├──...
    ├──gt_imgs # ground truth images (static background)
    │  ├──0.jpg
    │  ├──...
    ├──parsing # semantic segmentation
    │  ├──0.png
    │  ├──...
    ├──torso_imgs # inpainted torso images
    │  ├──0.png
    │  ├──...
    ├──aud.wav # original audio 
    ├──aud.npy # audio features (deepspeech)
    ├──bc.jpg # default background
    ├──track_params.pt # raw head tracking results
    ├──transforms_train.json # head poses (train split)
    ├──transforms_val.json # head poses (test split)
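As a quick sanity check of the layout above, a small snippet like the following could load the audio features, the training head poses, and one frame's eye feature. The subject ID ("Sunak"), the "frames" key, and the assumption that 0.pt stores a single tensor are illustrative, not guarantees.

    # Hedged sanity check of the processed data layout; the subject directory
    # and expected contents are assumptions based on the file tree above.
    import json
    import os
    import numpy as np
    import torch

    root = "data/Sunak"                                   # assumed subject directory

    aud = np.load(os.path.join(root, "aud.npy"))          # audio features (deepspeech)
    with open(os.path.join(root, "transforms_train.json")) as f:
        transforms = json.load(f)                         # head poses (train split)
    eye_feat = torch.load(os.path.join(root, "eye_features", "0.pt"))

    print("audio features:", aud.shape)
    print("train frames:", len(transforms["frames"]))     # assumes NeRF-style "frames" key
    print("frame 0 eye feature:", tuple(eye_feat.shape))  # assumes 0.pt is a tensor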

Some HR talking videos and the corresponding processed data can be downloaded from Baidu Netdisk.

Usage

The training scripts are provided in train.sh. Here are two examples.

Training DaNeRF module:

python main.py data/Sunak/ --workspace trial/Sunak/ -O --iters 70000 --data_range 0 -1 --dim_eye 6 --lr 0.005 --lr_net 0.0005 --num_rays 65536 --patch_size 32

Training DaNeRF and ECSR jointly:

python main_sr.py data/Sunak/ --workspace trial/Sunak/ -O --iters 150000 --data_range 0 -1 --dim_eye 6 --patch_size 32 --srtask --num_rays 16384 --lr 0.005 --lr_net 0.0005 --weight_pcp 0.05 --weight_style 0.01 --weight_gan 0.01 --test_tile 450

To use an existing checkpoint, add --ftsr_path 'trial/Sunak/modelsr_ckpt/sresrnet_17.pth'.

Acknowledgement

This project is built on RAD-NeRF by Tang et al. and 4K-NeRF by Wang et al. Thanks to the authors for these great works.
