Acoustic normalization of children's speech

Stemmer, Georg; Hacker, Christian; Steidl, Stefan; Nöth, Elmar

doi:10.21437/Eurospeech.2003-415

Young speakers are not adequately represented in current speech recognizers. In this paper we focus on the problem of adapting the acoustic frontend of a speech recognizer that has been trained on adults' speech so that it performs better on speech from children. We introduce and evaluate a method that performs non-linear vocal tract length normalization (VTLN) through an unconstrained, data-driven optimization of the filterbank. A second approach normalizes the speaking rate of the young speakers with the PSOLA algorithm. Significant reductions in word error rate have been achieved.
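
For orientation, the idea of applying VTLN inside the filterbank can be illustrated with a minimal sketch. It is not the method of the paper, which optimizes the filterbank in a non-linear, data-driven way; instead it shows the conventional piecewise-linear warping that such methods generalize. The function names, the warping factor `alpha`, the cutoff ratio of 0.85, and the choice of 24 mel-spaced filters up to 8000 Hz are all assumptions made for illustration.

```python
import math


def hz_to_mel(f_hz):
    """Convert a frequency in Hz to the mel scale (common 2595*log10 formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_low, f_high):
    """Center frequencies (Hz) of n_filters filters spaced evenly on the mel scale."""
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]


def warp_frequency(f_hz, alpha, f_max, cut_ratio=0.85):
    """Piecewise-linear VTLN warp (illustrative assumption, not the paper's method).

    Below the cutoff the frequency is scaled by alpha. Above it, a second
    linear segment joins (f_cut, alpha * f_cut) to (f_max, f_max) so that the
    warped axis still ends at f_max. Requires alpha * cut_ratio < 1.
    """
    f_cut = cut_ratio * f_max
    if f_hz <= f_cut:
        return alpha * f_hz
    slope = (f_max - alpha * f_cut) / (f_max - f_cut)
    return alpha * f_cut + slope * (f_hz - f_cut)


def warped_centers(n_filters, f_low, f_high, alpha):
    """Mel filter centers after applying the piecewise-linear warp."""
    centers = mel_center_frequencies(n_filters, f_low, f_high)
    return [warp_frequency(f, alpha, f_high) for f in centers]


if __name__ == "__main__":
    original = mel_center_frequencies(24, 0.0, 8000.0)
    shifted = warped_centers(24, 0.0, 8000.0, alpha=1.1)
    for f_orig, f_warp in zip(original, shifted):
        print(f"{f_orig:8.1f} Hz -> {f_warp:8.1f} Hz")
```

With `alpha` greater than 1 the filter centers move upward, so each filter reads a higher band of the input spectrum. In a conventional setup a single `alpha` per speaker is typically chosen by a grid search that maximizes the likelihood of the adaptation data under the adult-trained models; a data-driven filterbank optimization would instead free each filter position rather than tie them all to one factor.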
