saeki20@interspeech_2020@ISCA


Real-Time, Full-Band, Online DNN-Based Voice Conversion System Using a Single CPU

Authors: Takaaki Saeki ; Yuki Saito ; Shinnosuke Takamichi ; Hiroshi Saruwatari

We present a real-time, full-band, online voice conversion (VC) system that runs on a single CPU. For practical applications, VC must produce high-quality speech and perform real-time, online conversion with limited computational resources. Our system achieves this by combining non-linear conversion using a deep neural network with short-tap, sub-band filtering. Our evaluation shows that the system 1) has an estimated complexity of around 2.5 GFLOPS and a measured real-time factor (RTF) of around 0.5 on a single CPU, and 2) produces converted speech with a naturalness mean opinion score (MOS) of 3.4 out of 5.0.
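The real-time factor quoted above is conventionally the processing time divided by the duration of the audio processed, so a value below 1.0 means conversion keeps up with the incoming speech. The following is a minimal sketch of that arithmetic, not the authors' measurement code: the function names `real_time_factor` and `measure_rtf`, the stand-in `convert` callable, and the 48 kHz sampling rate in the usage comment are all assumptions made for illustration.

```python
import time
from typing import Callable, Sequence


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration (hypothetical helper)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def measure_rtf(
    convert: Callable[[Sequence[float]], object],
    samples: Sequence[float],
    sample_rate: int,
) -> float:
    """Time one call to `convert` and relate it to the duration of `samples`.

    `convert` is a placeholder for any conversion routine; it is not the
    system described in the abstract.
    """
    start = time.perf_counter()
    convert(samples)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, len(samples) / sample_rate)


# Usage: 0.5 s of processing for 1.0 s of audio gives an RTF of 0.5.
rtf = real_time_factor(0.5, 1.0)

# Usage with a trivial stand-in converter on 1 s of audio at an assumed 48 kHz.
rtf_identity = measure_rtf(lambda x: x, [0.0] * 48000, 48000)
```

Under this definition, an RTF of around 0.5 means roughly half a second of computation per second of input speech.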