An efficient segment-based speech compression technique for hand-held TTS systems

Authors: Chang-Heon Lee, Sung-Kyo Jung, Thomas Eriksson, Won-Suk Jun, Hong-Goo Kang

This paper proposes a novel segment-based speech coding algorithm that efficiently compresses the database of a concatenative text-to-speech (TTS) system. To achieve a high compression ratio while meeting the fundamental requirements of concatenative TTS synthesizers, namely partial segment decoding and random access capability, we adopt a modified analysis-by-synthesis scheme. The spectral coefficients are quantized by a length-based interpolation method, and the excitation signals are modeled with both non-predictive and predictive approaches. Because the pitch-pulse waveforms of a given speaker show low intra-variation, the conventional adaptive codebook for pitch prediction is replaced by a speaker-dependent pitch-pulse codebook. By applying the proposed algorithm to a hand-held Korean TTS system, we verify that the proposed coder provides a compression ratio of about 1/13, a low complexity of around 1.2 WMOPS, and random access capability.
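The random access requirement mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's coder: `zlib` stands in for the speech codec purely as an assumption, and the names `pack_segments` and `decode_segment` are hypothetical. The idea shown is only that each segment is coded independently and located through an offset table, so a synthesizer can decode one unit without touching the rest of the database.

```python
import zlib


def pack_segments(segments):
    """Compress each segment on its own and record where it lands in the blob."""
    blob = bytearray()
    index = []  # one (byte offset, coded length) pair per segment
    for pcm in segments:
        coded = zlib.compress(pcm)  # assumption: stand-in for the speech coder
        index.append((len(blob), len(coded)))
        blob.extend(coded)
    return bytes(blob), index


def decode_segment(blob, index, i):
    """Decode segment i alone, using only its own bytes from the blob."""
    offset, length = index[i]
    return zlib.decompress(blob[offset:offset + length])


# Five dummy "speech segments" of 320 bytes each (placeholder data).
segments = [bytes([n]) * 320 for n in range(5)]
blob, index = pack_segments(segments)
unit = decode_segment(blob, index, 3)  # fetch one unit directly
```

Because no segment depends on the decoder state left by an earlier one, the lookup cost is independent of the segment's position in the database. A predictive coder would need extra care at segment boundaries to preserve this property.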

Subject: INTERSPEECH.2006 - Speech Processing

lee06@interspeech_2006@ISCA
