mdhaffar19@interspeech_2019@ISCA


#1 Qualitative Evaluation of ASR Adaptation in a Lecture Context: Application to the PASTEL Corpus

Authors: Salima Mdhaffar ; Yannick Estève ; Nicolas Hernandez ; Antoine Laurent ; Richard Dufour ; Solen Quiniou

Lectures are highly specialised in that they deal with multiple, domain-specific topics. This context is challenging for Automatic Speech Recognition (ASR) systems, which are sensitive to topic variability. Language Model (LM) adaptation is a commonly used technique for addressing the mismatch between training and test data. In this paper, we carry out a qualitative analysis in order to compare the results of LM adaptation in a meaningful way. While word error rate (WER) is the most common metric for evaluating ASR systems, we consider that this metric alone cannot provide accurate information. Consequently, we explore other metrics based on individual word error rate, indexability, and the capability of building relevant requests for information retrieval from the ASR outputs. Experiments are carried out on the PASTEL corpus, a new French-language dataset composed of lecture recordings, manual chaptering, manual transcriptions, and slides. While an adapted LM reduces the global, classical word error rate by 15.62% relative, we show that this reduction reaches 44.2% when computed on relevant words only. These observations are confirmed by the high LM adaptation gains obtained with the indexability and information retrieval metrics.
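
For readers unfamiliar with the terms, the following is a minimal sketch of how a classical word error rate and a relative reduction between two systems are typically computed. It is not the authors' evaluation code: the function names `word_error_rate` and `relative_reduction`, the whitespace tokenisation, and the toy sentences are all assumptions made for illustration, and the abstract does not specify how the "relevant words" variant is defined, so it is not reproduced here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Classical WER: word-level edit distance divided by reference length.

    Assumption: words are separated by whitespace and compared exactly.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference word
                curr[j - 1] + 1,         # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # substitution (or match, cost 0)
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, adapted: float) -> float:
    """Relative reduction of an error rate, as a fraction of the baseline."""
    if baseline <= 0:
        raise ValueError("baseline error rate must be positive")
    return (baseline - adapted) / baseline


if __name__ == "__main__":
    # Toy data, not taken from the PASTEL corpus.
    ref = "the cat sat on the mat"
    baseline_hyp = "the cat sat on mat"      # one deletion
    adapted_hyp = "the cat sat on the mat"   # no errors
    print(word_error_rate(ref, baseline_hyp))
    print(word_error_rate(ref, adapted_hyp))
```

A relative figure such as the 15.62% quoted above is a fraction of the baseline error rate, not a difference in absolute WER points; `relative_reduction` makes that distinction explicit.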