Unsupervised Spanish dialect classification


Authors: Rongqing Huang, John H. L. Hansen

Automatic dialect classification has gained interest in the field of speech research because it is important for characterizing speaker traits and for estimating knowledge that could improve integrated speech technology (e.g., speech recognition, speaker recognition). This study addresses novel advances in unsupervised classification of spontaneous Latin American Spanish dialects. The problem considers the case where no transcripts are available for the training and test data, and speakers are talking spontaneously. A technique is proposed that aims to find the dialect dependence in the untranscribed audio by selecting the most discriminative Gaussian mixtures and the most discriminative frames of speech. The Gaussian Mixture Model (GMM) based classifier is retrained after the dialect dependence information is identified. Both the MS-GMM (GMM trained with Mixture Selection) and FS-GMM (GMM trained with Frame Selection) classifiers improve dialect classification performance significantly. Using 122 speakers across three dialects of Spanish with 3.3 hours of speech, the relative error reductions are 30.4% and 26.1%, respectively.
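The abstract does not state the criterion used to decide which frames are "most discriminative", so the following is only a minimal sketch of the general idea of frame selection, not the authors' method. It assumes that per-frame log-likelihoods under each dialect's GMM have already been computed, and it ranks frames by the gap between the best and second-best dialect score. The function name `select_discriminative_frames`, the `keep_fraction` parameter, and the margin criterion itself are all hypothetical choices made for illustration.

```python
def select_discriminative_frames(frame_loglikes, keep_fraction=0.5):
    """Return sorted indices of the frames with the largest top-two score gap.

    frame_loglikes: list of per-frame lists, one log-likelihood per dialect
        model (at least two dialects are required).
    keep_fraction: hypothetical tuning parameter, the fraction of frames kept.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")

    margins = []
    for index, scores in enumerate(frame_loglikes):
        ranked = sorted(scores, reverse=True)
        # Assumed criterion: a large gap between the best and second-best
        # dialect score marks a frame as dialect-discriminative.
        margins.append((ranked[0] - ranked[1], index))

    n_keep = max(1, int(len(margins) * keep_fraction))
    # Largest margins first; ties are broken by the earlier frame index.
    top = sorted(margins, key=lambda m: (-m[0], m[1]))[:n_keep]
    return sorted(index for _, index in top)


# Toy scores for four frames under three dialect models (made-up numbers).
toy_scores = [
    [-10.0, -10.1, -10.2],  # nearly tied: weakly discriminative
    [-8.0, -12.0, -11.0],   # clear winner
    [-9.5, -9.4, -9.6],     # nearly tied
    [-15.0, -11.0, -13.5],  # clear winner
]
selected = select_discriminative_frames(toy_scores, keep_fraction=0.5)
print(selected)
```

In a pipeline of the kind the abstract describes, the retained frames would then be used to retrain the dialect GMMs; that retraining step is not shown here.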

Subject: INTERSPEECH.2006 - Language and Multimodal

huang06@interspeech_2006@ISCA
