Add LJU dataset, pretrained model and more #670
dathudeptrai merged 9 commits into TensorSpeech:master
- Remove Multi-Band PWGAN
+ Add Multi-Band MelGAN-HF
+ Add LJSpeech Ultimate processor
+ Add postnet extraction scripts for Tac2 and FS2.
+ Add configs.
@ZDisket Is it finished?
@dathudeptrai Almost done, except for two things: adding instructions to the Tacotron2 README on using MFA with mfa_extraction to generate durations, which are then turned into masks; and a script to phonemize filelists, as the LJSpeechUltimate processor takes in a file called […]. I was thinking I could either do the PR without the filelist phonemizing tool and link it in the Tumblr blog post instead, or include it here, but I'm not sure.
@ZDisket can you use
@dathudeptrai It's done. Now the processor will automatically phonemize strings when processing, and inference is as easy as this: I've tested everything I could think of so it should be ready to merge, although you are free to look at it if you've got the time. Four eyes better than two.
LGTM :D
commit 1368771
Merge: ab6efe4 07b49e9
Author: dathudeptrai <43868663+dathudeptrai@users.noreply.github.com>
Date: Thu Mar 10 15:34:25 2022 +0700

    Merge pull request TensorSpeech#748 from NeonBohdan/gt-repo

    Error german_transliterate only when using german_cleaners

commit 07b49e9
Author: NeonBohdan <bohdan@neon.ai>
Date: Mon Mar 7 13:49:21 2022 +0200

    Fix german_transliterate module error

commit ab6efe4
Merge: 05b059e d0e7d72
Author: dathudeptrai <43868663+dathudeptrai@users.noreply.github.com>
Date: Thu Feb 10 11:28:28 2022 +0700

    Merge pull request TensorSpeech#742 from hertz-pj/japenese

    Support Japenese TTS, and fix some bug.

commit d0e7d72
Merge: eb6db12 05b059e
Author: dathudeptrai <43868663+dathudeptrai@users.noreply.github.com>
Date: Tue Feb 8 19:05:27 2022 +0700

    Merge branch 'master' into japenese

commit eb6db12
Author: hertz-pj <peiji.yang@foxmail.com>
Date: Tue Feb 8 16:33:44 2022 +0800

    fix a japenese fastspeech bug

commit 3f921a5
Author: dathudeptrai <nguyenquananhminh@gmail.com>
Date: Mon Jan 24 20:34:29 2022 +0700

    😭 Upgrade to TF 2.7.0

commit c8f6b38
Author: hertz-pj <peiji.yang@foxmail.com>
Date: Fri Jan 28 09:17:53 2022 +0800

    fix bug of jsut dataset, add pyopenjtalk to setup.py

commit 691c76a
Author: hertz-pj <peiji.yang@foxmail.com>
Date: Tue Jan 25 15:56:15 2022 +0800

    resolve the conflicts

commit c6ce93c
Author: hertz-pj <peiji.yang@foxmail.com>
Date: Mon Jan 24 16:16:00 2022 +0800

    Support Japenese TTS

commit 05b059e
Merge: 070f9cd 9260b7f
Author: dathudeptrai <43868663+dathudeptrai@users.noreply.github.com>
Date: Sun Feb 6 12:21:26 2022 +0700

    Merge pull request TensorSpeech#738 from hertz-pj/japenese

    Add support for Japenese TTS with JSUT dataset

commit 9260b7f
Merge: 0d05c18 070f9cd
Author: hertz <peiji.yang@foxmail.com>
Date: Sat Jan 29 23:32:36 2022 +0800

    Merge branch 'TensorSpeech:master' into japenese

commit 0d05c18
Author: hertz-pj <peijiyang@foxmail.com>
Date: Fri Jan 28 09:17:53 2022 +0800

    fix bug of jsut dataset, add pyopenjtalk to setup.py

commit e771444
Author: hertz-pj <peijiyang@foxmail.com>
Date: Tue Jan 25 15:56:15 2022 +0800

    resolve the conflicts

commit 070f9cd
Author: dathudeptrai <nguyenquananhminh@gmail.com>
Date: Mon Jan 24 20:34:29 2022 +0700

    😭 Upgrade to TF 2.7.0

commit 4b3bc31
Author: hertz-pj <peijiyang@foxmail.com>
Date: Mon Jan 24 16:16:00 2022 +0800

    Support Japenese TTS

commit 34358d8
Merge: 8786f59 cd3a5e1
Author: dathudeptrai <nguyenquananhminh@gmail.com>
Date: Tue Oct 19 20:24:30 2021 +0700

    Merge branch 'master' of https://github.com/TensorSpeech/TensorFlowTTS

commit 8786f59
Author: dathudeptrai <nguyenquananhminh@gmail.com>
Date: Tue Oct 19 20:20:33 2021 +0700

    👜 Update README

commit cd3a5e1
Merge: b77dffe 59e27bd
Author: dathudeptrai <43868663+dathudeptrai@users.noreply.github.com>
Date: Tue Sep 21 08:59:31 2021 +0700

    Merge pull request TensorSpeech#670 from ZDisket/lju

    Add LJU dataset, pretrained model and more

commit 59e27bd
Author: ZDisket <30500847+ZDisket@users.noreply.github.com>
Date: Sun Sep 19 10:11:51 2021 -0300

    📈Reformat with black

commit 2883b6e
Author: ZDisket <30500847+ZDisket@users.noreply.github.com>
Date: Sun Sep 19 10:05:47 2021 -0300

    🔌LJU processor now takes in filelist.txt and automatically ARPAbetizes strings in processing

commit 4ce7de9
Author: ZDisket <30500847+ZDisket@users.noreply.github.com>
Date: Sat Sep 18 01:52:51 2021 -0300

    📑 Document FAL on Tacotron2

commit 221f1cd
Author: ZDisket <30500847+ZDisket@users.noreply.github.com>
Date: Thu Sep 16 00:23:11 2021 -0300

    🌱 Add duration to mask exporter, modify Tacotron2 and dataloader to accept

commit 5b15bb9
Author: ZDisket <30500847+ZDisket@users.noreply.github.com>
Date: Wed Sep 15 13:03:29 2021 -0300

    🚩 Adjust configs and readmes

commit 2493011
Merge: 7c1a0d9 b77dffe
Author: ZDisket <30500847+ZDisket@users.noreply.github.com>
Date: Wed Sep 15 12:13:53 2021 -0300

    Merge remote-tracking branch 'upstream/master' into lju

commit 7c1a0d9
Merge: a4b3d64 2959501
Author: ZDisket <30500847+ZDisket@users.noreply.github.com>
Date: Tue Aug 17 21:49:15 2021 -0300

    Merge remote-tracking branch 'upstream/master' into lju

commit a4b3d64
Author: ZDisket <30500847+ZDisket@users.noreply.github.com>
Date: Tue Aug 10 06:32:45 2021 -0300

    😏 Add LJU to AutoProcessor

commit 9f288f8
Author: ZDisket <30500847+ZDisket@users.noreply.github.com>
Date: Mon Aug 9 01:32:04 2021 -0300

    🌙 Add and remove many things along with LJU processor

    - Remove Multi-Band PWGAN
    + Add Multi-Band MelGAN-HF
    + Add LJSpeech Ultimate processor
    + Add postnet extraction scripts for Tac2 and FS2.
    + Add configs.
This PR includes:
1. LJSpeech Ultimate dataloader and pretrained Tacotron2 and vocoder (audio samples)
2. Forced Alignment Guided Attention Loss (FAL) from paper (WIP)
3. Multi-Band MelGAN-HF (MB-MelGAN generator + HiFi-GAN discriminator), with both train-from-scratch and proven fine-tuning configurations. Also removes multiband_pwgan.
The pretrained model was trained for 100k steps with regular training and then for 20k steps with the technique described in item 2. As far as I am aware, this would make TensorFlowTTS the first and only open-source TTS repo with a high sampling rate (44.1 kHz) pretrained model available.
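For readers unfamiliar with item 2, the idea of turning forced-alignment durations into masks can be illustrated with a small NumPy sketch. This is not the code from this PR or the exact loss from the paper: the function names `durations_to_mask` and `masked_attention_penalty` are hypothetical, and the penalty shown (mean attention weight falling outside the aligned region) is only one plausible way to use such a mask.

```python
import numpy as np


def durations_to_mask(durations):
    """Build a [num_tokens, num_frames] 0/1 mask from per-token durations.

    durations[i] is the number of mel frames aligned to token i
    (e.g. as produced by a forced aligner). Hypothetical helper.
    """
    durations = np.asarray(durations, dtype=np.int64)
    ends = np.cumsum(durations)      # exclusive end frame of each token
    starts = ends - durations        # inclusive start frame of each token
    num_frames = int(ends[-1]) if len(ends) else 0
    frames = np.arange(num_frames)
    # mask[i, t] is 1 when frame t lies inside token i's aligned span
    mask = (frames[None, :] >= starts[:, None]) & (frames[None, :] < ends[:, None])
    return mask.astype(np.float32)


def masked_attention_penalty(attention, mask):
    """Mean attention weight that falls outside the aligned region.

    attention has the same [num_tokens, num_frames] shape as mask.
    Illustrative only; the actual FAL formulation may differ.
    """
    return float(np.mean(attention * (1.0 - mask)))


mask = durations_to_mask([2, 1, 3])
print(mask)
print(masked_attention_penalty(mask, mask))
```

An attention matrix that exactly follows the forced alignment receives zero penalty, while attention mass placed on frames outside a token's aligned span increases it.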