chen17i@interspeech_2017@ISCA

#1 Unmixing Convolutive Mixtures by Exploiting Amplitude Co-Modulation: Methods and Evaluation on Mandarin Speech Recordings

Authors: Bo-Rui Chen ; Huang-Yi Lee ; Yi-Wen Liu

This paper presents and evaluates two frequency-domain methods for multi-channel sound source separation. The sources are assumed to couple to the microphones through unknown room responses. Independent component analysis (ICA) is applied in the frequency domain to obtain maximally independent amplitude envelopes (AEs) at every frequency. Because ICA recovers the components at each frequency in an arbitrary order, the AEs across frequencies need to be de-permuted. To this end, we seek to assign AEs to the same source based solely on the correlation of their magnitude variation over time. The resulting time-varying spectra are inverse Fourier transformed to synthesize the separated signals. Objective evaluation showed that both methods achieve a signal-to-interference ratio (SIR) comparable to that of Mazur et al. (2013). In addition, we created spoken Mandarin materials and recruited age-matched subjects to perform word-by-word transcription. Results showed that, first, speech intelligibility improved significantly after unmixing. Second, while both methods achieved similar SIR, the subjects preferred to listen to the results that were post-processed to ensure a speech-like spectral shape; the mean opinion scores for the two methods were 2.9 vs. 4.3 (out of 5). The present results may provide suggestions regarding the deployment of correlation-based source separation algorithms on devices with limited computational resources.
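
The de-permutation step described above can be illustrated with a minimal sketch. It is not the authors' implementation, since the abstract does not specify either of the two methods; it only shows the general idea of grouping per-frequency amplitude envelopes by their correlation over time. The function names `align_permutations` and `_normalize`, the `(F, N, T)` array layout, and the greedy running-centroid strategy are all assumptions made for illustration.

```python
import itertools

import numpy as np


def _normalize(x):
    """Make each envelope zero-mean and unit-norm along the time axis."""
    x = x - x.mean(axis=-1, keepdims=True)
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, 1e-12)


def align_permutations(envelopes):
    """Greedy correlation-based de-permutation (illustrative assumption).

    envelopes: array of shape (F, N, T) holding the amplitude envelope of
        each of N separated components in each of F frequency bins over
        T time frames, with an arbitrary component order per bin.
    Returns (perms, aligned) where perms has shape (F, N) and
        aligned[f] == envelopes[f, perms[f]] uses one consistent order.
    """
    n_freq, n_src, _ = envelopes.shape
    z = _normalize(np.asarray(envelopes, dtype=float))
    candidates = [list(p) for p in itertools.permutations(range(n_src))]

    perms = np.zeros((n_freq, n_src), dtype=int)
    perms[0] = np.arange(n_src)   # bin 0 defines the reference order
    reference = z[0].copy()       # running sum of aligned envelopes

    for f in range(1, n_freq):
        # Score each candidate order by its total inner product with the
        # reference envelopes, then keep the best-scoring order.
        scores = [np.sum(reference * z[f, p]) for p in candidates]
        best = candidates[int(np.argmax(scores))]
        perms[f] = best
        reference += z[f, best]

    aligned = envelopes[np.arange(n_freq)[:, None], perms]
    return perms, aligned
```

The exhaustive search over all orderings is only practical for a small number of sources, because the number of candidates grows factorially with N. Using a running sum as the reference, rather than comparing adjacent bins only, is one common way to limit the propagation of a single misassignment across frequencies.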