kinoshita17@interspeech_2017@ISCA

#1 Neural Network-Based Spectrum Estimation for Online WPE Dereverberation

Authors: Keisuke Kinoshita ; Marc Delcroix ; Haeyong Kwon ; Takuma Mori ; Tomohiro Nakatani

In this paper, we propose a novel speech dereverberation framework that uses deep neural network (DNN)-based spectrum estimation to construct linear inverse filters. The proposed framework is based on the state-of-the-art inverse filter estimation algorithm called the weighted prediction error (WPE) algorithm, which is known to effectively reduce reverberation and greatly improve automatic speech recognition (ASR) performance in various conditions. In WPE, the accuracy of the inverse filter estimation, and thus the dereverberation performance, depends largely on the estimate of the power spectral density (PSD) of the target signal. Therefore, conventional WPE iterates over inverse filter estimation, dereverberation, and PSD estimation to gradually improve the PSD estimate. However, while such an iterative procedure works well when sufficiently long, acoustically stationary observed signals are available, WPE’s performance degrades when the observed data is short, which is typically the case for real-time applications that use online block-batch processing with small batches. To solve this problem, we incorporate a DNN-based spectrum estimator into the WPE framework, because a DNN can estimate the PSD robustly even from very short observed data. We experimentally show that the proposed framework outperforms conventional WPE and improves ASR performance in real noisy reverberant environments in both single-channel and multichannel cases.
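
To make the role of the PSD estimate concrete, the following is a minimal sketch of WPE-style filtering for a single STFT frequency bin and a single channel, processed in batch mode. It is not the authors' implementation: the function names, the default `delay`, `taps`, and `n_iter` values, and the PSD flooring are assumptions chosen for illustration. The `psd_estimator` argument is a hypothetical stand-in for the DNN-based spectrum estimator; when it is omitted, the PSD is refined iteratively from the current dereverberated output, as in the conventional procedure described above.

```python
import numpy as np

def delayed_frames(y, delay, taps):
    """Row t holds y[t-delay], ..., y[t-delay-taps+1], zero-padded at the start."""
    T = len(y)
    Y = np.zeros((T, taps), dtype=complex)
    for k in range(taps):
        shift = delay + k
        if shift < T:
            Y[shift:, k] = y[:T - shift]
    return Y

def wpe_filter_step(y, psd, delay=3, taps=10):
    """One PSD-weighted linear-prediction step for one frequency bin."""
    Y = delayed_frames(y, delay, taps)
    # Floor the PSD so near-silent frames do not dominate (an assumed heuristic).
    floor = 1e-3 * psd.mean() + 1e-12
    w = 1.0 / np.maximum(psd, floor)
    R = (Y.T * w) @ Y.conj()   # sum_t ybar_t ybar_t^H / psd_t
    p = (Y.T * w) @ y.conj()   # sum_t ybar_t conj(y_t) / psd_t
    g = np.linalg.solve(R + 1e-10 * np.eye(taps), p)
    # Subtract the predicted late reverberation: x_t = y_t - g^H ybar_t.
    return y - Y @ g.conj()

def dereverberate(y, psd_estimator=None, delay=3, taps=10, n_iter=3):
    """Dereverberate one bin, with an external PSD estimate or by iteration.

    psd_estimator is a hypothetical callable mapping the observed bin
    signal to a per-frame PSD estimate (the role played by the DNN).
    """
    if psd_estimator is not None:
        # External PSD estimate: a single filter estimation pass.
        return wpe_filter_step(y, psd_estimator(y), delay, taps)
    # Conventional procedure: start from the observed power, then alternate
    # filter estimation, dereverberation, and PSD re-estimation.
    psd = np.abs(y) ** 2
    x = y
    for _ in range(n_iter):
        x = wpe_filter_step(y, psd, delay, taps)
        psd = np.abs(x) ** 2
    return x
```

In this sketch the only difference between the two modes is where the per-frame PSD comes from, which mirrors the abstract's point that the quality of the PSD estimate drives the quality of the inverse filter. An online block-batch variant would apply the same step to short blocks, where the iterative refinement has little data to work with.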