fotedar17@interspeech_2017@ISCA

#1 An Information Theoretic Analysis of the Temporal Synchrony Between Head Gestures and Prosodic Patterns in Spontaneous Speech

Authors: Gaurav Fotedar ; Prasanta Kumar Ghosh

We analyze the temporal coordination between head gestures and prosodic patterns in spontaneous speech in a data-driven manner. For this study, we consider head motion and speech data from 24 subjects, each telling a fixed set of five stories. The head motion, captured using a motion capture system, is converted to Euler angles and translations along the X, Y and Z directions to represent head gestures. Pitch and short-time energy in voiced segments are used to represent the prosodic patterns. To capture the statistical relationship between head gestures and prosodic patterns, mutual information (MI) is computed at various delays between the two, using data from all 24 subjects, who together cover six native languages. The estimated MI, averaged across all subjects, is maximum when the head gestures lag the prosodic patterns by 30 ms. This holds when subjects tell the stories in English as well as in their native language. We observe a similar pattern in the root mean squared error of predicting head gestures from prosodic patterns using a Gaussian mixture model. These results indicate that there could be an asynchrony between head gestures and prosody during spontaneous speech, with head gestures following the corresponding prosodic patterns.
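
The delay analysis described in the abstract can be illustrated with a minimal sketch. The abstract does not say which MI estimator was used, so the histogram (plug-in) estimator, the bin count, the function names `mutual_information` and `mi_vs_delay`, and the synthetic signals below are all assumptions made for illustration, not the authors' method. The sketch shifts a head-gesture feature against a prosodic feature, estimates MI at each shift, and reports the shift with the highest MI.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram (plug-in) MI estimate in nats; the estimator choice is an assumption."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                 # joint probability table
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x (column vector)
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y (row vector)
    nz = pxy > 0                              # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def mi_vs_delay(prosody, head, max_lag):
    """MI between prosody[t] and head[t + lag] for lag in [-max_lag, max_lag].

    A positive lag means the head feature lags the prosodic feature.
    """
    lags = np.arange(-max_lag, max_lag + 1)
    mi = []
    for lag in lags:
        if lag >= 0:
            p, h = prosody[:len(prosody) - lag], head[lag:]
        else:
            p, h = prosody[-lag:], head[:len(head) + lag]
        mi.append(mutual_information(p, h))
    return lags, np.array(mi)

# Synthetic stand-ins (hypothetical data): the "head" signal is the "prosody"
# signal delayed by 3 frames, plus a small amount of noise.
rng = np.random.default_rng(0)
n, true_lag = 5000, 3
prosody = rng.standard_normal(n)
head = np.roll(prosody, true_lag) + 0.1 * rng.standard_normal(n)

lags, mi = mi_vs_delay(prosody, head, max_lag=10)
best_lag = int(lags[np.argmax(mi)])
print("lag with maximum MI (frames):", best_lag)
```

The lag here is in frames; converting it to milliseconds requires the feature frame rate, which the abstract does not state. With real data, the MI curve would be computed per subject and then averaged before locating the peak, as the abstract describes.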