2507.01356

Voice Conversion for Likability Control via Automated Rating of Speech Synthesis Corpora

Authors: Hitoshi Suda, Shinnosuke Takamichi, Satoru Fukayama

Perceived voice likability plays a crucial role in various social interactions, such as partner selection and advertising. A system that provides reference likable voice samples tailored to target audiences would enable users to adjust their speaking style and voice quality, facilitating smoother communication. To this end, we propose a voice conversion method that controls the likability of input speech while preserving both speaker identity and linguistic content. To improve training data scalability, we train a likability predictor on an existing voice likability dataset and employ it to automatically annotate a large speech synthesis corpus with likability ratings. Experimental evaluations reveal a significant correlation between the predictor's outputs and human-provided likability ratings. Subjective and objective evaluations further demonstrate that the proposed approach effectively controls voice likability while preserving both speaker identity and linguistic content.
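The annotation step described in the abstract (train a predictor on a small rated dataset, use it to label a larger corpus, then check agreement with human ratings) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the single scalar feature, the linear model, the toy numbers, and the names `fit_linear`, `predict`, and `pearson` are hypothetical stand-ins, not the authors' predictor, features, or data.

```python
from statistics import mean

def fit_linear(xs, ys):
    """Least-squares fit of y ~ a*x + b for one scalar feature (toy stand-in for a predictor)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def pearson(us, vs):
    """Pearson correlation coefficient between two equal-length sequences."""
    mu, mv = mean(us), mean(vs)
    cov = sum((u - mu) * (v - mv) for u, v in zip(us, vs))
    su = sum((u - mu) ** 2 for u in us) ** 0.5
    sv = sum((v - mv) ** 2 for v in vs) ** 0.5
    return cov / (su * sv)

# Hypothetical small dataset with human likability ratings.
labeled_features = [0.1, 0.4, 0.5, 0.9]
labeled_ratings = [1.0, 2.0, 2.5, 4.0]
a, b = fit_linear(labeled_features, labeled_ratings)

def predict(x):
    """Predicted likability for one utterance feature (hypothetical model)."""
    return a * x + b

# Hypothetical large corpus without ratings: annotate it automatically.
unlabeled_corpus = {"utt_001": 0.2, "utt_002": 0.7, "utt_003": 0.95}
pseudo_labels = {utt: predict(feat) for utt, feat in unlabeled_corpus.items()}

# Hypothetical held-out human ratings, used to check predictor agreement.
held_out_features = [0.3, 0.6, 0.8]
held_out_human = [1.8, 2.6, 3.9]
r = pearson([predict(x) for x in held_out_features], held_out_human)
print(pseudo_labels, round(r, 3))
```

In this sketch the pseudo-labels would then serve as conditioning targets for a conversion model; a real system would replace the scalar feature and linear fit with learned speech representations.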

Subjects: Audio and Speech Processing, Sound

Published: 2025-07-02 04:47:43 UTC