malisz17@interspeech_2017@ISCA

#1 Controlling Prominence Realisation in Parametric DNN-Based Speech Synthesis

Authors: Zofia Malisz ; Harald Berthelsen ; Jonas Beskow ; Joakim Gustafson

This work aims to improve text-to-speech synthesis for Wikipedia by advancing and implementing models of prosodic prominence. We propose a new system architecture with explicit prominence modeling and test the first component of the architecture. We automatically extract a phonetic feature related to prominence from the speech signal in the ARCTIC corpus. We then modify the label files and train an experimental TTS system based on this feature using Merlin, a statistical-parametric DNN-based engine. Test sentences with contrastive prominence at the word level are synthesised, and two separate listening tests are conducted, evaluating a) the level of prominence control in the generated speech and b) naturalness. Our results show that the prominence feature-enhanced system successfully places prominence on the appropriate words and increases perceived naturalness relative to the baseline.
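
The label-modification step mentioned in the abstract could look roughly like the sketch below. It is only an illustration under stated assumptions: the abstract does not specify the feature, its quantisation, or the label layout, so the `/P:` field name, the thresholds, and the helper names `quantise` and `add_prominence` are all hypothetical. The sketch takes HTS-style full-context label lines (one per phone), a mapping from each phone to its word, and one prominence score per word, then appends a discrete prominence level to every line.

```python
from typing import List, Optional, Sequence


def quantise(value: float, thresholds: Sequence[float]) -> int:
    """Map a continuous score to a discrete level (0 = lowest).

    The level is the number of thresholds the score reaches.
    """
    return sum(value >= t for t in thresholds)


def add_prominence(
    label_lines: Sequence[str],
    word_index: Sequence[Optional[int]],
    scores: Sequence[float],
    thresholds: Sequence[float],
) -> List[str]:
    """Append a hypothetical '/P:<level>' field to each label line.

    word_index[i] is the word that phone i belongs to, or None for
    silence and pauses, which get the placeholder value 'x'.
    """
    out = []
    for line, word in zip(label_lines, word_index):
        if word is None:
            level = "x"
        else:
            level = str(quantise(scores[word], thresholds))
        out.append(f"{line.rstrip()}/P:{level}")
    return out


# Toy input: one silence phone and one phone from word 0.
labels = ["0 100 x^x-sil+dh=ax", "100 200 x^sil-dh+ax=k"]
modified = add_prominence(labels, [None, 0], [0.8], [0.3, 0.6])
print("\n".join(modified))
```

In a Merlin-style setup, a new field like this would also need matching entries in the question file so that it becomes part of the network's input features; that part is not shown here.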