nakashika17b@interspeech_2017@ISCA

#1 Complex-Valued Restricted Boltzmann Machine for Direct Learning of Frequency Spectra

Authors: Toru Nakashika ; Shinji Takaki ; Junichi Yamagishi

In this paper, we propose a new energy-based probabilistic model in which a restricted Boltzmann machine (RBM) is extended to deal with complex-valued visible units. The RBM, which automatically learns the relationships between visible units and hidden units (with no connections within the visible layer or within the hidden layer), has been widely used as a feature extractor, a generator, a classifier, a pre-training method for deep neural networks, etc. However, all conventional RBMs assume the visible units to be either binary-valued or real-valued, and therefore complex-valued data cannot be fed to them. Complex-valued data is nevertheless frequently used in various applications; examples include complex spectra of speech, fMRI images, wireless signals, and acoustic intensity. For the direct learning of such complex-valued data, we define a new model called “complex-valued RBM (CRBM)”, in which the conditional probability of the complex-valued visible units given the hidden units forms a complex-Gaussian distribution. Another important characteristic of the CRBM is that, unlike the conventional real-valued RBM, it has connections between the real and imaginary parts of each visible unit. Our experiments demonstrated that the proposed CRBM can directly encode complex spectra of speech signals without decoupling the imaginary part or the phase from the complex-valued data.
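
To make the idea of a complex-Gaussian conditional with coupled real and imaginary parts more concrete, the following is a minimal sketch rather than the authors' formulation. It assumes each visible unit z = x + iy is described by a variance `gamma` = E[|z - mu|^2] and a pseudo-variance `delta` = E[(z - mu)^2]; a nonzero `delta` is what correlates the real and imaginary parts. The function names, the affine form of `conditional_mean`, and the parameter names are hypothetical and chosen only for illustration; the abstract does not specify the energy function or its parametrization.

```python
import math
import random


def real_imag_covariance(gamma, delta):
    """Map (variance, pseudo-variance) of z = x + iy to Var(x), Var(y), Cov(x, y)."""
    if abs(delta) >= gamma:
        raise ValueError("need |delta| < gamma for a valid distribution")
    var_x = (gamma + delta.real) / 2.0
    var_y = (gamma - delta.real) / 2.0
    cov_xy = delta.imag / 2.0  # nonzero when real and imaginary parts are coupled
    return var_x, var_y, cov_xy


def sample_complex_gaussian(mu, gamma, delta, rng):
    """Draw one sample of a scalar complex Gaussian with mean mu."""
    var_x, var_y, cov_xy = real_imag_covariance(gamma, delta)
    # Cholesky factor of the 2x2 covariance matrix of (x, y).
    a = math.sqrt(var_x)
    b = cov_xy / a
    c = math.sqrt(max(var_y - b * b, 0.0))
    u1, u2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return mu + complex(a * u1, b * u1 + c * u2)


def conditional_mean(weights, bias, hidden):
    """Hypothetical affine mean of each visible unit given binary hidden units."""
    return [
        b_i + sum(w_ij * h_j for w_ij, h_j in zip(row, hidden))
        for row, b_i in zip(weights, bias)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    # Illustrative complex weights and biases: 2 visible units, 2 hidden units.
    weights = [[1 + 1j, 2j], [0.5 - 0.5j, 1 + 0j]]
    bias = [0.5 + 0j, -1j]
    hidden = [1, 0]
    for mu in conditional_mean(weights, bias, hidden):
        print(sample_complex_gaussian(mu, 2.0, complex(0.6, 0.8), rng))
```

Setting `delta` to zero in this sketch gives independent real and imaginary parts with equal variance, which is the case a pair of uncoupled real-valued units would cover; the coupling described in the abstract corresponds to allowing `delta` to be nonzero.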