Fusion of contrastive acoustic models for parallel phonotactic spoken language identification

This paper investigates combining contrastive acoustic models for parallel phonotactic language identification systems. PRLM (phone recognition followed by language modelling), a typical phonotactic system, uses a phone recogniser to extract phonotactic information from speech data. Combining multiple PRLM systems forms a Parallel PRLM (PPRLM) system. A standard PPRLM system uses multiple phone recognisers trained on different languages and phone sets to provide diversity. In this paper, a new approach to PPRLM is proposed in which the parallel systems use phone recognisers with different acoustic models. The semi-tied covariance (STC) and subspace for precision and mean (SPAM) precision matrix modelling schemes, as well as the maximum mutual information (MMI) training criterion, are used to produce contrastive acoustic models. Preliminary experimental results are reported on the NIST language recognition evaluation sets. With only two training corpora, a 12-way PPRLM system using different acoustic modelling schemes outperformed the standard 2-way PPRLM system by 2.0-5.0% absolute equal error rate (EER).
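The PPRLM idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the bigram order, the add-alpha smoothing, the length normalisation, and the fusion by simple score averaging are all assumptions made for illustration, and the function names (`train_bigram_lm`, `prlm_score`, `pprlm_identify`) and the toy phone labels are hypothetical. Each parallel system stands in for one phone recogniser (for example, one acoustic modelling scheme) and owns one phonotactic language model per target language.

```python
import math
from collections import Counter


def train_bigram_lm(phone_seqs, vocab, alpha=1.0):
    """Add-alpha smoothed bigram model over phone labels: maps (a, b) to log P(b | a)."""
    pair_counts = Counter()
    ctx_counts = Counter()
    for seq in phone_seqs:
        for a, b in zip(seq, seq[1:]):
            pair_counts[(a, b)] += 1
            ctx_counts[a] += 1
    v = len(vocab)
    return {
        (a, b): math.log((pair_counts[(a, b)] + alpha) / (ctx_counts[a] + alpha * v))
        for a in vocab
        for b in vocab
    }


def prlm_score(lm, seq):
    """Length-normalised log-likelihood of one decoded phone sequence (a single PRLM score)."""
    pairs = list(zip(seq, seq[1:]))
    return sum(lm[p] for p in pairs) / max(len(pairs), 1)


def pprlm_identify(decodings, lms_per_system):
    """Fuse parallel PRLM systems by averaging their per-language scores (assumed fusion rule).

    decodings[k]      : phone sequence produced by recogniser k for the test utterance
    lms_per_system[k] : dict mapping language name to the bigram LM trained on recogniser k's output
    """
    langs = list(lms_per_system[0])
    fused = {
        lang: sum(prlm_score(lms[lang], seq) for seq, lms in zip(decodings, lms_per_system))
        / len(decodings)
        for lang in langs
    }
    return max(fused, key=fused.get), fused


# Toy data: two parallel systems sharing a two-phone inventory (hypothetical labels).
vocab = ["a", "b"]
train = {
    "X": [["a", "b", "a", "b", "a", "b"]],  # language X alternates phones
    "Y": [["a", "a", "a", "b", "b", "b"]],  # language Y repeats phones
}
lms_per_system = [
    {lang: train_bigram_lm(seqs, vocab) for lang, seqs in train.items()}
    for _ in range(2)
]
decodings = [["a", "b", "a", "b"], ["a", "b", "a", "b"]]
best, fused = pprlm_identify(decodings, lms_per_system)
print(best, fused)
```

In a real system the parallel recognisers would produce different decodings of the same utterance, which is where the diversity comes from; here the two systems are identical only to keep the toy example short.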

Subject: INTERSPEECH.2007 - Language and Multimodal

sim07@interspeech_2007@ISCA