This is the official code for DiffTED: One-shot Audio-driven TED Talk Video Generation with Diffusion-based Co-speech Gestures.
To install dependencies:
- Install packages:
  pip install -r requirements.txt
- Install Thin-Plate-Spline-Motion-Model:
  Follow the installation directions at https://github.com/yoyo-nb/Thin-Plate-Spline-Motion-Model/
  Place the repository at scripts/tps/
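Before training, it can help to confirm that the Thin-Plate-Spline-Motion-Model files are where the scripts expect them. The following is a minimal sketch, not part of this repository: `tps_ready` is a hypothetical helper, and it assumes that "place repository at scripts/tps/" means the contents of the upstream repository sit directly inside that directory. It only checks that the directory exists and is non-empty; the `git clone` line it prints is one possible way to fetch the code.

```shell
#!/bin/sh
# Sketch only: TPS_DIR mirrors the path named above; tps_ready is a hypothetical helper.
TPS_DIR="${TPS_DIR:-scripts/tps}"

# Succeeds (exit 0) if the given directory exists and contains at least one entry.
tps_ready() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

if tps_ready "$TPS_DIR"; then
    echo "Found Thin-Plate-Spline-Motion-Model files in $TPS_DIR"
else
    echo "Missing or empty: $TPS_DIR"
    echo "One way to fetch it (assumed layout):"
    echo "  git clone https://github.com/yoyo-nb/Thin-Plate-Spline-Motion-Model $TPS_DIR"
fi
```

Run it from the repository root so the relative path resolves, or set `TPS_DIR` to point elsewhere.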
Follow the instructions from MRAA to download the TED-talks dataset.
To train the model, run:
python scripts/train_tpsm.py --config config/pose_diffusion.yml
To generate videos, run:
python scripts/test_tpsm.py long <checkpoint_path> <test_data_path>
If you find our work useful, please cite it as:
@InProceedings{Hogue2024,
author = {Hogue, Steven and Zhang, Chenxu and Daruger, Hamza and Tian, Yapeng and Guo, Xiaohu},
title = {DiffTED: One-shot Audio-driven TED Talk Video Generation with Diffusion-based Co-speech Gestures},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2024},
pages = {1922-1931}
}
- The codebase is developed based on DiffGesture of Zhu et al.