allauzen13@interspeech_2013@ISCA

Total: 1

#1 Pre-initialized composition for large-vocabulary speech recognition [PDF] [Copy] [Kimi1]

Authors: Cyril Allauzen ; Michael Riley

This paper describes a modified composition algorithm that is used for combining two finite-state transducers, representing the context-dependent lexicon and the language model respectively, in large vocabulary speech recognition. This algorithm is a hybrid between the static and dynamic expansion of the resultant transducer, which maps from context-dependent phones to words and is searched during decoding. The approach is to pre-compute part of the recognition transducer and leave the balance to be expanded during decoding. This method allows for a fine-grained trade-off between space and time in recognition. For example, the time overhead of purely dynamic expansion can be reduced by over six-fold with only a 20% increase in memory in a collection of large-vocabulary recognition tasks available on the Google Android platform.