Robust speech recognition with speech enhanced deep neural networks

Authors: Jun Du, Qing Wang, Tian Gao, Yong Xu, Li-Rong Dai, Chin-Hui Lee

We propose a signal pre-processing front-end to enhance speech based on deep neural networks (DNNs) and use the enhanced speech features directly to train hidden Markov models (HMMs) for robust speech recognition. As a comprehensive study, we examine its effectiveness for different acoustic features, acoustic models, and training-testing combinations. In tests on the Aurora4 task, the experimental results indicate that our proposed framework consistently outperforms state-of-the-art speech recognition systems in all evaluation conditions. To the best of our knowledge, this is the first demonstration on the Aurora4 task of performance gains over the best DNN-HMM system obtained with only an enhancement pre-processor, without any adaptation or compensation post-processing. The word error rate reduction from the baseline system is up to 50% for clean-condition training and 15% for multi-condition training. We believe the system performance could be improved further by incorporating post-processing techniques that work coherently with the proposed enhancement pre-processing scheme.

Subject: INTERSPEECH.2014 - Speech Recognition

du14@interspeech_2014@ISCA
