norrenbrock12@interspeech_2012@ISCA

#1 Quality analysis of macroprosodic F0 dynamics in text-to-speech signals

Authors: Christoph R. Norrenbrock ; Florian Hinterleitner ; Ulrich Heute ; Sebastian Möller

We present a study on the relation between fundamental frequency (F0) and its perceptual effect in the context of text-to-speech (TTS) synthesis. Features that essentially capture the intonational (macroprosodic) properties of speech are introduced and analysed with regard to the following questions: (i) How does the prosodic variation of TTS signals differ from that of natural speech? (ii) Is there a functional relationship between the prosodic variation of TTS signals and their perceived quality? In answering these questions, we present novel approaches for the construction of non-intrusive quality estimators. The results reveal a substantial degree of systematic influence of prosodic variation on TTS quality.