pandey23@interspeech_2023@ISCA

#1 Listener sensitivity to deviating obstruents in WaveNet

Authors: Ayushi Pandey ; Jens Edlund ; Sébastien Le Maguer ; Naomi Harte

This paper investigates the perceptual significance of the deviation in obstruents previously observed in WaveNet vocoders. Stimuli of varying lengths were presented to 128 participants, who judged in a two-alternative forced-choice (2AFC) task whether each stimulus was produced by a human or a machine. Stimulus length did not reliably affect participants' accuracy, but the concentration of obstruents had a significant effect: participants were consistently more accurate in identifying WaveNet stimuli as machine-produced when the phrases were obstruent-rich. These findings show that the deviation in obstruents reported in WaveNet voices is perceivable by human listeners. The test protocol may be of wider utility in text-to-speech (TTS).