kumar15b@interspeech_2015@ISCA


#1 Confidence-features and confidence-scores for ASR applications in arbitration and DNN speaker adaptation

Authors: Kshitiz Kumar ; Ziad Al Bawab ; Yong Zhao ; Chaojun Liu ; Benoit Dumoulin ; Yifan Gong

Speech recognition confidence-scores quantitatively represent the correctness of decoded utterances in a [0,1] range. Confidences have primarily been used to filter out recognitions with scores below a threshold. They have also been used in other speech applications, such as arbitration, ROVER, and high-quality data selection for model training. Confidence-scores are computed from a rich set of confidence-features in the speech recognition engine. While many speech applications consume confidence-scores, we have not seen adequate focus on directly consuming confidence-features in applications. In this work we argue that additionally consuming confidence-features can provide large gains across confidence-related tasks. We demonstrate this for the arbitration application, where we obtain a 31% relative reduction in the arbitration metric. We additionally demonstrate a novel application of confidence-scores in deep-neural-network (DNN) adaptation, where we strongly improve the relative reduction in word-error-rate (WER) for speaker adaptation on limited data.
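
To make the two score-based uses mentioned in the abstract concrete, here is a minimal sketch of threshold filtering and of a naive score-only arbitration between two recognizers. It is not the method of the paper, which argues for also consuming confidence-features. The `Hypothesis` class, the recognizer labels, the 0.5 threshold, and the example scores are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    source: str        # hypothetical recognizer label
    text: str          # decoded utterance
    confidence: float  # confidence-score in [0, 1]


def accept(hyp: Hypothesis, threshold: float = 0.5) -> bool:
    """Keep a recognition only if its score reaches the threshold (assumed value)."""
    return hyp.confidence >= threshold


def arbitrate(a: Hypothesis, b: Hypothesis) -> Hypothesis:
    """Naive score-only arbitration: pick the hypothesis with the higher score."""
    return a if a.confidence >= b.confidence else b


# Illustrative, made-up hypotheses from two hypothetical recognizers.
first = Hypothesis("recognizer_a", "call mom", 0.62)
second = Hypothesis("recognizer_b", "call tom", 0.81)

chosen = arbitrate(first, second)
accepted = accept(chosen)
```

A score-only rule like this sees a single number per hypothesis; the abstract's claim is that exposing the underlying confidence-features to the arbitration step provides additional gains.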