vaiciunas05@interspeech_2005@ISCA

#1 Review of statistical modeling of highly inflected Lithuanian using very large vocabulary

Authors: Airenas Vaiciunas ; Gailius Raskinis

This paper presents state-of-the-art language modeling (LM) of Lithuanian, which is a highly inflected language with free word order. Perplexities and word error rates (WER) of standard n-gram, class-based, cache-based, topic mixture and morphological LMs were estimated and compared for a vocabulary of more than 1 million words. WER estimates were obtained by solving a speaker-dependent ASR task in which the LMs were used to rescore acoustic hypotheses. LM perplexity appeared to be uncorrelated with WER. Cache-based language models yielded the greatest perplexity improvement, while class-based language models achieved the greatest, though statistically insignificant, WER improvement over the baseline 3-gram.
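
For readers unfamiliar with the two metrics compared above, the following is a minimal sketch of how perplexity and WER are conventionally computed. It is not the authors' evaluation code: the function names `perplexity` and `word_error_rate` are hypothetical, the log-probabilities are assumed to be natural logarithms supplied by some language model, and the WER is assumed to use a plain word-level Levenshtein alignment with unit costs.

```python
import math


def perplexity(log_probs):
    """Perplexity of a word sequence from per-word natural-log probabilities.

    PP = exp(-(1/N) * sum_i ln p(w_i | history_i))
    """
    return math.exp(-sum(log_probs) / len(log_probs))


def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length.

    Computed as the word-level Levenshtein distance between the two
    whitespace-tokenized strings, divided by the number of reference words.
    """
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)
```

Because perplexity depends only on the probabilities the LM assigns to the reference text, while WER depends on which hypotheses the rescoring step ends up ranking first, the two measures need not move together, which is consistent with the lack of correlation reported in the abstract.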