ISCA Archive Blizzard 2009

The WISTON Text-to-Speech System for Blizzard Challenge 2009

Jianhua Tao, Ya Li, Shifeng Pan, Meng Zhang, Hongjun Sun, Zhengqi Wen

This paper describes the WISTON system, a large-corpus-based TTS system that was submitted to the Blizzard Challenge 2009. The text analysis part of the system comprises text preprocessing, word segmentation, POS tagging, phonetic transcription, and prosodic structure prediction, most of which are based on Maximum Entropy (ME) models. In the unit selection part, CART models predict the prosodic parameters (duration, F0, energy); a path search over target costs and concatenation costs then finds the most suitable units to concatenate. The acoustic processing part performs smoothing. The final system was entered in both the English and Mandarin tests of the Blizzard Challenge 2009.
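
The path search over target and concatenation costs can be illustrated with a generic Viterbi-style dynamic program. The following is a minimal sketch under stated assumptions, not the WISTON implementation: the function names (`select_units`, `target_cost`, `concat_cost`), the dictionary keys, and the unweighted absolute-difference costs are all hypothetical and chosen only for illustration. Here `targets` stands in for the predicted prosodic parameters (duration, F0, energy) and `candidates` for the corpus units available at each position.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Unit = Dict[str, float]  # hypothetical feature dictionary: "dur", "f0", "energy"


def target_cost(target: Unit, unit: Unit) -> float:
    """Illustrative target cost: unweighted absolute prosodic mismatch."""
    return sum(abs(target[k] - unit[k]) for k in ("dur", "f0", "energy"))


def concat_cost(prev: Unit, nxt: Unit) -> float:
    """Illustrative concatenation cost: F0 discontinuity at the join."""
    return abs(prev["f0"] - nxt["f0"])


def select_units(
    candidates: Sequence[Sequence[Unit]],
    targets: Sequence[Unit],
    tcost: Callable[[Unit, Unit], float] = target_cost,
    ccost: Callable[[Unit, Unit], float] = concat_cost,
) -> Tuple[List[int], float]:
    """Return (chosen candidate index per position, total path cost)."""
    if not candidates or len(candidates) != len(targets):
        raise ValueError("need one non-empty candidate list per target")
    if any(len(c) == 0 for c in candidates):
        raise ValueError("every position needs at least one candidate")

    # best[i][j]: minimal cumulative cost of any path ending at candidate j of position i
    best = [[tcost(targets[0], u) for u in candidates[0]]]
    back = [[-1] * len(candidates[0])]

    for i in range(1, len(candidates)):
        row, ptr = [], []
        for u in candidates[i]:
            # Cheapest predecessor, counting the join cost into this unit
            k, c = min(
                ((k, best[i - 1][k] + ccost(p, u))
                 for k, p in enumerate(candidates[i - 1])),
                key=lambda kc: kc[1],
            )
            row.append(c + tcost(targets[i], u))
            ptr.append(k)
        best.append(row)
        back.append(ptr)

    # Backtrack from the cheapest final candidate
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    total = best[-1][j]
    path = [j]
    for i in range(len(candidates) - 1, 0, -1):
        j = back[i][j]
        path.append(j)
    path.reverse()
    return path, total
```

In a practical system the individual cost terms would typically be weighted and the candidate lists pruned before the search; the sketch omits both for brevity.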