hojo19@interspeech_2019@ISCA


#1 Evaluating Intention Communication by TTS Using Explicit Definitions of Illocutionary Act Performance

Authors: Nobukatsu Hojo ; Noboru Miyazaki

Text-to-speech (TTS) synthesis systems have been evaluated with respect to attributes such as quality, naturalness, and intelligibility. However, an evaluation protocol for the communication of intentions has not yet been established. Evaluating intention communication sometimes produces unreliable results because participants can misinterpret the definitions of intentions. This misinterpretation is caused by colloquial and implicit descriptions of the intentions. To address this problem, this work explicitly defines each intention following the theoretical definitions in speech-act theory, known as “felicity conditions”. We define the communication of each intention by one to four necessary and sufficient conditions that must be satisfied. In listening tests, participants rated whether or not each condition was satisfied. We compared the proposed protocol with the conventional baseline under four voice conditions: neutral TTS, conversational TTS with and without intention inputs, and recorded speech. Experimental results with 10 participants showed that the proposed protocol produced smaller within-group variation and larger between-group variation. These results indicate that the proposed protocol can evaluate intention communication with higher inter-rater reliability and sensitivity.
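The abstract does not state how the per-condition ratings were aggregated or which statistic measured the variation, so the following is only a minimal sketch under two assumptions: an intention counts as communicated when every one of its conditions is rated as satisfied, and variation is measured with a one-way ANOVA-style sum-of-squares decomposition across voice conditions. The function names `intention_communicated` and `variation`, and the toy scores, are hypothetical and not taken from the paper.

```python
from statistics import mean


def intention_communicated(condition_ratings):
    """Assumed aggregation rule: an intention is communicated only if
    every condition was rated as satisfied (True)."""
    return all(condition_ratings)


def variation(groups):
    """Return (within, between) sums of squares.

    `groups` maps a group name (e.g. a voice condition) to a list of
    numeric scores. This is the standard one-way ANOVA decomposition,
    used here as an assumed stand-in for the paper's measure.
    """
    all_scores = [s for scores in groups.values() for s in scores]
    grand_mean = mean(all_scores)
    group_means = {name: mean(scores) for name, scores in groups.items()}

    # Spread of scores around their own group's mean.
    within = sum(
        (s - group_means[name]) ** 2
        for name, scores in groups.items()
        for s in scores
    )
    # Spread of group means around the grand mean, weighted by group size.
    between = sum(
        len(scores) * (group_means[name] - grand_mean) ** 2
        for name, scores in groups.items()
    )
    return within, between


# Hypothetical toy scores, not data from the paper.
toy_scores = {"neutral_tts": [1, 3], "recorded_speech": [5, 7]}
within, between = variation(toy_scores)  # within = 4, between = 16
```

Under this reading, a protocol on which raters agree more closely yields a smaller `within` value, and one that separates the voice conditions more clearly yields a larger `between` value.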