roger-tseng/CodecDetect
Train Fake Speech Detectors on CodecFake

Paper, Dataset, Project Page

Interspeech 2024

TL;DR: We show that better detection of deepfake speech from codec-based TTS systems can be achieved by training models on speech re-synthesized with neural audio codecs. We also release the CodecFake dataset for this purpose.
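For context, the core idea is to pass genuine speech through a neural audio codec and use the decoded (re-synthesized) waveform as additional training material for the detector. Below is a minimal, illustrative sketch using the open-source EnCodec package; the codec choice, the 6 kbps bandwidth, and the file names are assumptions for illustration, not necessarily the exact pipeline or settings used to build CodecFake.

import torch
import torchaudio
from encodec import EncodecModel
from encodec.utils import convert_audio

# Load the 24 kHz EnCodec model and pick a target bitrate (6 kbps, as an example).
model = EncodecModel.encodec_model_24khz()
model.set_target_bandwidth(6.0)

# "real.wav" is a placeholder path to a genuine utterance.
wav, sr = torchaudio.load("real.wav")
wav = convert_audio(wav, sr, model.sample_rate, model.channels)

with torch.no_grad():
    # Encode to discrete codec tokens, then decode back to a waveform.
    frames = model.encode(wav.unsqueeze(0))
    resynth = model.decode(frames).squeeze(0)

# The decoded waveform serves as a re-synthesized ("fake") training sample.
torchaudio.save("resynth.wav", resynth.cpu(), model.sample_rate)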

Data Download

See [here]. If you prefer ZIP files, please use this commit (3abd4aa).

Environment

Install the dependencies in requirements.txt before running. We also list our experiment environment below for those who want to reproduce it as closely as possible.

pip install -r requirements.txt
  • Our environment (for GPU training)
    • Python 3.8.18
    • GCC 11.2.0
    • GPU: 1 NVIDIA Tesla V100 32GB
    • gpu-driver: 470.161.03

Running

  1. Training

Training AASIST with a batch size of 32 requires about 32 GB of GPU memory.
Available codecs are listed in the script.

bash train.sh <codec_name>
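
For example, assuming encodec is one of the codec names defined in train.sh (check the script for the exact names):

bash train.sh encodec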
  2. Evaluation

First, add the paths to your trained checkpoints in the script, then adjust which subsets to evaluate on.

bash eval_all.sh

Acknowledgements

This repository is built on top of several open source projects.
