yong25b@interspeech_2025@ISCA

#1 HK-GenSpeech: A Generative AI Scene Creation Framework for Speech Based Cognitive Assessment

Authors: Vi Jun Sean Yong, Serkan Kumyol, Pau Le Lisa Low, Suk Wai Winnie Leung, Tristan Braud

Current methods of automated speech-based cognitive assessment often rely on fixed-picture descriptions in major languages, limiting repeatability, engagement, and locality. This paper introduces HK-GenSpeech (HKGS), a framework using generative AI to create pictures that present similar features to those used in cognitive assessment, augmented with descriptors reflecting the local context. We demonstrate HKGS through a dataset of 423 Cantonese speech samples collected in Hong Kong from 141 participants, with HK-MoCA scores ranging from 11 to 30. Each participant described the cookie theft picture, an HKGS fixed image, and an HKGS dynamic image. Regression experiments show comparable accuracy for all image types, indicating HKGS' adequacy in generating relevant assessment images. Lexical analysis further suggests that HKGS images elicit richer speech. By mitigating learning effects and improving engagement, HKGS supports broader data collection, particularly in low-resource settings.

Subject: INTERSPEECH.2025 - Analysis and Assessment