wang18g@interspeech_2018@ISCA

Robust TDOA Estimation Based on Time-Frequency Masking and Deep Neural Networks

Authors: Zhong-Qiu Wang ; Xueliang Zhang ; DeLiang Wang

Deep learning based time-frequency (T-F) masking has dramatically advanced monaural speech separation and enhancement. This study investigates its potential for robust time difference of arrival (TDOA) estimation in noisy and reverberant environments. Three novel algorithms are proposed to improve the robustness of conventional cross-correlation-, beamforming- and subspace-based algorithms for speaker localization. The key idea is to leverage the power of deep neural networks (DNNs) to accurately identify T-F units that are relatively clean for TDOA estimation. All of the proposed algorithms exhibit strong robustness for TDOA estimation in environments with low input SNR, high reverberation and low direct-to-reverberant energy ratio.
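
To make the key idea concrete, the following is a minimal sketch of mask-weighted cross-correlation for a two-microphone pair. It is not the authors' implementation: the function name `masked_gcc_phat_tdoa`, its parameters, and the grid search over candidate delays are assumptions made for illustration, and the T-F mask is simply passed in as an array, whereas in the paper it would be estimated by a DNN. The sketch applies PHAT normalization to the cross-spectrum, weights each T-F unit by the mask so that units dominated by noise or reverberation contribute little, and picks the candidate delay with the highest summed score.

```python
import numpy as np

def masked_gcc_phat_tdoa(X1, X2, mask, fs, n_fft, max_delay_s, n_candidates=201):
    """Estimate the delay of microphone 2 relative to microphone 1, in seconds.

    X1, X2 : complex STFTs, shape (frames, n_fft // 2 + 1)
    mask   : real weights in [0, 1], same shape (assumed to come from a DNN)
    """
    # Centre frequency of each STFT bin in Hz.
    freqs = np.arange(X1.shape[1]) * fs / n_fft

    # PHAT-normalised cross-spectrum: keep only the inter-channel phase.
    cross = X1 * np.conj(X2)
    phat = cross / (np.abs(cross) + 1e-12)

    # Pool over frames, weighting each T-F unit by its mask value.
    weighted = (mask * phat).sum(axis=0)  # shape (freq,)

    # Score each candidate delay by how well its phase ramp matches.
    taus = np.linspace(-max_delay_s, max_delay_s, n_candidates)
    steer = np.exp(-2j * np.pi * freqs[None, :] * taus[:, None])  # (cand, freq)
    scores = np.real(steer @ weighted)

    return taus[np.argmax(scores)], scores
```

With an all-ones mask this reduces to ordinary GCC-PHAT pooled over frames, so the mask is the only place where the clean-unit selection enters.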