SOVA-Bench: Benchmarking the Speech Conversation Ability for LLM-based Voice Assistant

Authors: Yixuan Hou, Heyang Liu, Yuhao Wang, Ziyang Cheng, Ronghua Wu, Qunshan Gu, Yanfeng Wang, Yu Wang

Thanks to the steady progress of large language models (LLMs), speech encoding algorithms and vocoder structures, recent advancements have enabled generating speech responses directly from user instructions. However, benchmarking the quality of the generated speech has been a neglected but critical issue, considering the shift from the pursuit of semantic accuracy to vivid and spontaneous speech flow. Previous evaluations focused on speech understanding ability and lacked a quantification of acoustic quality. In this paper, we propose the Speech cOnversational Voice Assistant Benchmark (SOVA-Bench), which provides a comprehensive comparison of general knowledge, speech recognition and understanding, and both semantic and acoustic generative ability across available speech LLMs. To the best of our knowledge, SOVA-Bench is one of the most systematic evaluation frameworks for speech LLMs, inspiring the direction of voice interaction systems.

Subjects: Sound, Audio and Speech Processing

Publish: 2025-06-03 05:21:51 UTC

2506.02457
