kawahara08c@interspeech_2008@ISCA

#1 Multi-modal recording, analysis and indexing of poster sessions

Authors: Tatsuya Kawahara ; Hisao Setoguchi ; Katsuya Takanashi ; Kentaro Ishizuka ; Shoko Araki

A new project on multi-modal analysis of poster sessions is introduced. We have designed an environment dedicated to recording poster conversations with multiple sensors and collected a number of sessions, which are annotated with a variety of multi-modal information, including utterance units for individual speakers, backchannels, nodding, gazing, and pointing. Automatic speaker diarization, which combines speech activity detection and speaker identification, is conducted using a set of distant microphones, and reasonable performance is obtained. We then investigate automatic classification of conversation segments into two modes: presentation mode and question-answer mode. Preliminary experiments show that multi-modal features of nonverbal behaviors play a significant role in indexing this kind of conversation.
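
The abstract describes diarization as speech activity detection combined with speaker identification, but does not say how either step is implemented. The toy sketch below only illustrates that two-step structure. It assumes, purely for illustration, that each distant microphone sits nearest to one speaker, that an energy threshold is enough to detect speech, and that the loudest channel identifies the speaker. The names `frame_energy` and `diarize`, the frame length, and the threshold are hypothetical and are not taken from the paper.

```python
from typing import List, Optional


def frame_energy(samples: List[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def diarize(
    channels: List[List[float]],
    frame_len: int = 4,
    threshold: float = 0.01,
) -> List[Optional[int]]:
    """Label each frame with a speaker index, or None for non-speech.

    Assumption (not from the paper): channel k is the microphone
    nearest to speaker k, so the loudest channel names the speaker.
    """
    energies = [frame_energy(ch, frame_len) for ch in channels]
    n_frames = min(len(e) for e in energies)
    labels: List[Optional[int]] = []
    for t in range(n_frames):
        frame = [e[t] for e in energies]
        # Speaker identification stand-in: pick the loudest channel.
        best = max(range(len(frame)), key=lambda k: frame[k])
        # Speech activity detection stand-in: energy threshold.
        labels.append(best if frame[best] >= threshold else None)
    return labels
```

A real system would replace both stand-ins with trained models and would have to handle overlapping speech, which this sketch ignores.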