INTERSPEECH.2005 - Keynote

Total: 3

#1 The multiple-channel cochlear implant: interfacing electronic technology to human consciousness

Author: Graeme M. Clark

Fundamental research on electrical stimulation of the auditory pathways resulted in the Multiple Channel Cochlear Implant, a device that provides understanding of speech to severely-to-profoundly deaf people. The device, a miniaturized receiver-stimulator with multiple electrodes fed with power and speech data through two separate aerials, was first implanted in a patient in 1978 as a prototype and has been commercially produced by Cochlear Limited, Australia, since 1982. Speech processing is based on the discovery that the sensation at each electrode is "vowel-like". Initially, the second formant was coded as a place of stimulation, the sound pressure as a current level, and the voicing frequency as a pulse rate. Further research showed progressively better open-set word and sentence scores with the extraction of the first formant in addition to the second formant (the F0/F1/F2 processor), then with the addition of high fixed filter outputs (MULTIPEAK), and finally with 6 to 8 maximal filter outputs at low rates (SPEAK) and high rates (ACE). All the frequencies were coded on a place basis. World trials completed for the US FDA on late-deafened adults in 1985 and on children aged two to 17 years in 1990 proved that a 22-channel cochlear implant was safe and effective in enabling them to understand speech both with and without lip-reading.
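
The "maximal filter outputs" step mentioned for SPEAK and ACE can be illustrated with a minimal sketch. This is not the processor's actual implementation: the function name `select_maxima`, the 8-band example frame, and the energy values are all hypothetical, and the abstract does not specify the filter bank, the stimulation rates, or the mapping from energy to current level. The sketch only shows the idea of keeping the largest band outputs in a frame and reporting them in channel (place) order.

```python
def select_maxima(band_energies, n_maxima=6):
    """Keep the n largest filter-bank outputs of one frame.

    Returns (channel_index, energy) pairs sorted by channel index, so each
    selected band stays tied to its place along the electrode array.
    Illustrative only; names and defaults are assumptions.
    """
    if not 0 < n_maxima <= len(band_energies):
        raise ValueError("n_maxima must be between 1 and the number of bands")
    # Rank channel indices by energy, largest first.
    ranked = sorted(range(len(band_energies)),
                    key=lambda i: band_energies[i], reverse=True)
    # Keep the top n and restore channel order.
    chosen = sorted(ranked[:n_maxima])
    return [(i, band_energies[i]) for i in chosen]


# Hypothetical frame of 8 band energies (arbitrary units).
frame = [0.1, 0.9, 0.3, 0.7, 0.05, 0.6, 0.2, 0.8]
selected = select_maxima(frame, n_maxima=3)
print(selected)
```

In this toy frame the three strongest bands are channels 1, 3, and 7; a real processor would repeat the selection for every analysis frame.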

#2 Linear models for structure prediction

Author: Fernando C. N. Pereira

Over the last few years, several groups have been developing models and algorithms for learning to predict the structure of complex data, sequences in particular, that extend well-known linear classification models and algorithms, such as logistic regression, the perceptron algorithm, and support vector machines. These methods combine the advantages of discriminative learning with those of probabilistic generative models like HMMs and probabilistic context-free grammars. I will introduce linear models for structure prediction and their simplest learning algorithms, and exemplify their benefits with applications to text and speech processing, including information extraction, parsing, and language modeling.
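
As one concrete instance of a linear model for sequence prediction, the sketch below trains a structured perceptron tagger with Viterbi decoding. It is an illustration under stated assumptions, not the method presented in the talk: the feature templates (word/tag emissions and tag/tag transitions), the function names, the tag set, and the two-sentence toy corpus are all invented for the example.

```python
from collections import defaultdict


def features(words, tags):
    """Count emission (word, tag) and transition (prev_tag, tag) features."""
    counts = defaultdict(int)
    prev = "<s>"
    for word, tag in zip(words, tags):
        counts[("emit", word, tag)] += 1
        counts[("trans", prev, tag)] += 1
        prev = tag
    return counts


def viterbi(words, tagset, weights):
    """Return the highest-scoring tag sequence under the linear model."""
    # best[t][tag] = (score of best path ending in tag at t, previous tag)
    best = [{}]
    for tag in tagset:
        score = (weights.get(("emit", words[0], tag), 0.0)
                 + weights.get(("trans", "<s>", tag), 0.0))
        best[0][tag] = (score, None)
    for t in range(1, len(words)):
        best.append({})
        for tag in tagset:
            emit = weights.get(("emit", words[t], tag), 0.0)
            best[t][tag] = max(
                (best[t - 1][p][0]
                 + weights.get(("trans", p, tag), 0.0) + emit, p)
                for p in tagset
            )
    # Follow back-pointers from the best final tag.
    tag = max(tagset, key=lambda g: best[-1][g][0])
    path = [tag]
    for t in range(len(words) - 1, 0, -1):
        tag = best[t][tag][1]
        path.append(tag)
    return path[::-1]


def train(data, tagset, epochs=5):
    """Perceptron updates: add gold features, subtract predicted features."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for words, gold in data:
            pred = viterbi(words, tagset, weights)
            if pred != gold:
                for feat, count in features(words, gold).items():
                    weights[feat] += count
                for feat, count in features(words, pred).items():
                    weights[feat] -= count
    return weights


# Hypothetical toy corpus: determiner (D), noun (N), verb (V).
tagset = ["D", "N", "V"]
data = [
    (["the", "dog", "barks"], ["D", "N", "V"]),
    (["a", "cat", "sleeps"], ["D", "N", "V"]),
]
weights = train(data, tagset)
print(viterbi(["the", "cat", "barks"], tagset, weights))
```

The score of a tag sequence is a dot product between a weight vector and a feature count vector, as in a linear classifier; the structured part is that decoding searches over whole sequences rather than scoring each position independently.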

#3 Spontaneous speech: how people really talk and why engineers should care

Author: Elizabeth Shriberg

Spontaneous conversation is optimized for human-human communication, but differs in some important ways from the types of speech for which human language technology is often developed. This overview describes four fundamental properties of spontaneous speech that present challenges for spoken language applications because they violate assumptions often applied in automatic processing technology.