ewert24@interspeech_2024@ISCA

Does the Lombard Effect Matter in Speech Separation? Introducing the Lombard-GRID-2mix Dataset

Authors: Iva Ewert ; Marvin Borsdorf ; Haizhou Li ; Tanja Schultz

Inspired by the human ability of selective listening, speech separation aims to equip machines with the capability to disentangle cocktail-party soundscapes into their individual sound sources. Recently, neural-network-based algorithms have been shown to work reliably under various conditions. However, to the best of our knowledge, the impact of a change in speaking style has not yet been studied. The Lombard effect, a reflexive change in speaking style triggered by noisy environments, is a typical behavior in everyday conversational situations. In this work, we introduce a new, first-of-its-kind dataset called Lombard-GRID-2mix to study speech separation for two-speaker mixtures on normal speech and Lombard speech. In a comprehensive study, we show that speech separation systems can be equipped to work for both normal speech and Lombard speech. We apply a carefully designed fine-tuning method that enables the system to work even when noise is present in the Lombard speech at different signal-to-noise ratios (SNRs).