Inventory-based audio-visual speech enhancement

Authors: Dorothea Kolossa, Robert Nickel, Steffen Zeiler, Rainer Martin

In this paper we propose to combine audio-visual speech recognition with inventory-based speech synthesis for speech enhancement. Unlike traditional filtering-based speech enhancement, inventory-based speech synthesis avoids the usual trade-off between noise reduction and consequential speech distortion. For this purpose, the processed speech signal is composed from a given speech inventory which contains snippets of speech from a targeted speaker. However, the combination of speech recognition and synthesis is susceptible to noise as recognition errors can lead to a suboptimal selection of speech segments. The search for fitting clean speech segments can be significantly improved when audio-visual information is utilized by means of a coupled HMM recognizer and an uncertainty decoding framework. First results using this novel system are reported in terms of several instrumental measures for three types of noise.
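
The segment search described in the abstract can be illustrated with a small, generic unit-selection sketch. This is not the authors' system: it does not implement the coupled HMM recognizer or the uncertainty decoding framework. It assumes a recognizer has already produced one label per frame, and that the inventory maps each label to candidate clean feature vectors from the target speaker. All names (`select_segments`, `inventory`, `join_weight`) and the squared-distance costs are hypothetical choices made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_segments(
    labels: Sequence[str],
    observations: Sequence[Vector],
    inventory: Dict[str, List[Vector]],
    join_weight: float = 1.0,
) -> List[Vector]:
    """Pick one clean inventory entry per decoded label by dynamic programming.

    Assumed costs (illustrative only):
      target cost: distance between a candidate and the noisy observation
      join cost:   distance between consecutive selected candidates
    """
    if not labels:
        return []

    prev_costs: List[float] = []
    backptrs: List[List[int]] = []
    for t, (label, obs) in enumerate(zip(labels, observations)):
        costs: List[float] = []
        ptrs: List[int] = []
        for cand in inventory[label]:
            target = sq_dist(cand, obs)
            if t == 0:
                costs.append(target)
                ptrs.append(-1)
                continue
            prev_cands = inventory[labels[t - 1]]
            # Cheapest predecessor for this candidate.
            best_j = min(
                range(len(prev_cands)),
                key=lambda j: prev_costs[j] + join_weight * sq_dist(prev_cands[j], cand),
            )
            join = join_weight * sq_dist(prev_cands[best_j], cand)
            costs.append(prev_costs[best_j] + join + target)
            ptrs.append(best_j)
        prev_costs = costs
        backptrs.append(ptrs)

    # Backtrack from the cheapest final candidate.
    idx = min(range(len(prev_costs)), key=prev_costs.__getitem__)
    path: List[int] = []
    for t in range(len(labels) - 1, -1, -1):
        path.append(idx)
        idx = backptrs[t][idx]
    path.reverse()
    return [inventory[label][i] for label, i in zip(labels, path)]


# Toy usage: two labels, two clean candidates each, one-dimensional features.
toy_inventory = {"a": [(0.0,), (1.0,)], "b": [(0.0,), (5.0,)]}
toy_labels = ["a", "b"]
toy_obs = [(0.9,), (4.0,)]
closest = select_segments(toy_labels, toy_obs, toy_inventory, join_weight=0.0)
smooth = select_segments(toy_labels, toy_obs, toy_inventory, join_weight=100.0)
```

In this toy setup, a wrong label in `toy_labels` would restrict the search to the wrong candidates, which mirrors the abstract's point that recognition errors lead to a suboptimal selection of speech segments.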

Subject: INTERSPEECH.2012 - Speech Processing

kolossa12@interspeech_2012@ISCA
