Graph Attention Based Multi-Channel U-Net for Speech Dereverberation With Ad-Hoc Microphone Arrays

Authors: Hongmei Guo, Yijiang Chen, Xiao-Lei Zhang, Xuelong Li

Speech dereverberation with ad-hoc microphone arrays has not been studied sufficiently, particularly in scenarios where the reverberation time is long. In this paper, we propose a novel multi-channel U-Net model for speech dereverberation with ad-hoc microphone arrays, in which an attention module is integrated into the model to perform channel selection and fusion. Training proceeds in three steps. First, we train a single-channel U-Net model. Then, we replicate the U-Net model for each channel. Finally, we train the attention module to aggregate information across the channels, keeping the parameters of the U-Net model fixed at this stage. To our knowledge, this is the first work in which U-Net is used for dereverberation with ad-hoc microphone arrays. We study two attention mechanisms, self-attention and graph attention; moreover, we integrate the attention module into either the bottleneck layer or the output layer of the multi-channel U-Net, which results in four implementations. Experimental results demonstrate that the proposed method achieves state-of-the-art performance, and that the attention module plays a key role in channel selection and fusion, improving performance under long reverberation times.
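The channel fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each microphone channel has already been reduced to a fixed-length feature vector (for example, by a frozen single-channel U-Net), it uses plain scaled dot-product self-attention with no learned projections, and it averages the attended vectors to obtain one fused representation. The function names `softmax` and `self_attention_fuse` are hypothetical, and the graph-attention variant is not covered.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_fuse(channel_feats):
    """Fuse per-channel feature vectors with scaled dot-product self-attention.

    channel_feats: list of C vectors, each of length D (one per microphone).
    Assumption: queries, keys, and values are the raw features themselves;
    a trained module would apply learned projections first.
    Returns (fused, weights), where fused has length D and weights is a
    C x C matrix whose rows each sum to 1.
    """
    dim = len(channel_feats[0])
    scale = math.sqrt(dim)

    # Attention weights: how strongly each channel attends to every channel.
    weights = []
    for query in channel_feats:
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale
            for key in channel_feats
        ]
        weights.append(softmax(scores))

    # Attended features: a weighted sum of all channels for each channel.
    attended = [
        [sum(w * feat[d] for w, feat in zip(row, channel_feats)) for d in range(dim)]
        for row in weights
    ]

    # Collapse the channel axis by averaging the attended vectors.
    num_channels = len(channel_feats)
    fused = [sum(vec[d] for vec in attended) / num_channels for d in range(dim)]
    return fused, weights


# Three hypothetical channels with 2-dimensional features.
feats = [[1.0, 0.0], [0.8, 0.2], [0.1, 0.9]]
fused, weights = self_attention_fuse(feats)
```

Because the weights are computed from the features themselves, the function accepts any number of channels, which matches the ad-hoc array setting where the number of microphones is not fixed in advance.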

Subject: INTERSPEECH.2024 - Speech Processing

guo24@interspeech_2024@ISCA
