zhang16@interspeech_2016@ISCA

Objective Evaluation Methods for Chinese Text-To-Speech Systems

Authors: Teng Zhang ; Zhipeng Chen ; Ji Wu ; Sam Lai ; Wenhui Lei ; Carsten Isert

To objectively evaluate the performance of text-to-speech (TTS) systems, many studies take the straightforward approach of comparing synthesized speech with aligned natural speech. However, in most situations, no natural speech is available for comparison. In this paper, we focus on machine learning approaches to TTS evaluation. We exploit a subspace decomposition method to separate the different components of speech, which generates distinctive acoustic features automatically. Furthermore, a pairwise Support Vector Machine (SVM) model is used to evaluate TTS systems. With the original prosodic acoustic features and a Support Vector Regression model, we obtain a ranking relevance of 0.7709. With the proposed oblique matrix projection method and the pairwise SVM model, we achieve a much better result of 0.9115.
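
The pairwise SVM idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the feature vectors, listener scores, hyperparameters, and function names below are all hypothetical, and a plain hinge-loss subgradient trainer stands in for whatever SVM solver the paper used. The sketch only shows the general pairwise transform, in which each pair of differently scored items becomes one difference vector with a +1 or -1 label, and a linear model is trained to order those pairs correctly.

```python
import random


def pairwise_differences(features, scores):
    """Turn scored items into (difference vector, label) training pairs."""
    pairs = []
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            if scores[i] == scores[j]:
                continue  # ties carry no ordering information
            diff = [a - b for a, b in zip(features[i], features[j])]
            label = 1 if scores[i] > scores[j] else -1
            pairs.append((diff, label))
    return pairs


def train_pairwise_svm(pairs, dim, epochs=200, lr=0.05, reg=0.01, seed=0):
    """Linear SVM on difference vectors via hinge-loss subgradient descent."""
    rng = random.Random(seed)
    w = [0.0] * dim
    pairs = list(pairs)
    for _ in range(epochs):
        rng.shuffle(pairs)
        for diff, label in pairs:
            margin = label * sum(wi * di for wi, di in zip(w, diff))
            for k in range(dim):
                grad = reg * w[k]  # L2 regularization term
                if margin < 1:
                    grad -= label * diff[k]  # hinge-loss term, active inside the margin
                w[k] -= lr * grad
    return w


def rank_score(w, x):
    """Score one item; a higher score means a higher predicted rank."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Hypothetical toy data: 2-D acoustic feature vectors for four TTS systems
# and made-up subjective scores (not taken from the paper).
features = [[0.9, 0.1], [0.7, 0.3], [0.4, 0.5], [0.1, 0.9]]
scores = [4.5, 3.8, 2.9, 1.5]

pairs = pairwise_differences(features, scores)
w = train_pairwise_svm(pairs, dim=2)
predicted = [rank_score(w, x) for x in features]
predicted_order = sorted(range(len(features)), key=lambda i: -predicted[i])
```

Here `predicted_order` lists the system indices from best to worst according to the learned model. Training on pairs means the model only has to get relative orderings right, and does not need to predict absolute scores as a regression model would.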