Total: 1
In this paper, a method of structuring the multi-media recording of a small-sized meeting based on various information such as sound source localization, multiple-talk detection, and the detection of non-speech sound events, is proposed. The information from these detectors is fused by a Bayesian network to estimate the state of the meeting. Based on the estimated state, the recording of the meeting is structured using a XML-based description language and is visualized by a browser.