choi25h@interspeech_2025@ISCA

#1 Comparative Evaluation of Acoustic Feature Extraction Tools for Clinical Speech Analysis

Authors: Anna Seo Gyeong Choi, Alexander Richardson, Ryan Partlan, Sunny X. Tang, Sunghye Cho

This study compares three acoustic feature extraction toolkits—OpenSMILE, Praat, and Librosa—applied to clinical speech data from individuals with schizophrenia spectrum disorders (SSD) and healthy controls (HC). By standardizing extraction parameters across the toolkits, we analyzed speech samples from 77 SSD and 87 HC participants and found significant toolkit-dependent variations. While F0 percentiles showed high cross-toolkit correlation (r=0.962–0.999), measures like F0 standard deviation and formant values often had poor, even negative, agreement. Additionally, correlation patterns differed between SSD and HC groups. Classification analysis identified F0 mean, HNR, and MFCC1 (AUC > 0.70) as promising discriminators. These findings underscore reproducibility concerns and advocate for standardized protocols, multi-toolkit cross-validation, and transparent reporting.
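The two statistics named in the abstract, cross-toolkit correlation and single-feature AUC, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `pearson_r` and `auc` are hypothetical, the numbers are made up for illustration, and the sketch assumes the same recordings have already been processed by two toolkits so that values are paired by recording.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two paired, equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def auc(pos, neg):
    """AUC of one feature: the probability that a randomly chosen value
    from `pos` exceeds a randomly chosen value from `neg` (ties count 0.5)."""
    if not pos or not neg:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up values: the same five recordings measured by two toolkits (Hz).
f0_median_toolkit_a = [110.0, 182.5, 95.2, 201.3, 150.8]
f0_median_toolkit_b = [111.2, 181.0, 96.0, 203.1, 149.5]
r = pearson_r(f0_median_toolkit_a, f0_median_toolkit_b)

# Made-up values: one feature in two groups, labeled here as SSD and HC.
feature_ssd = [0.82, 0.67, 0.91, 0.58]
feature_hc = [0.49, 0.71, 0.40, 0.55]
feature_auc = auc(feature_ssd, feature_hc)

print(f"cross-toolkit r = {r:.3f}, single-feature AUC = {feature_auc:.3f}")
```

Computing `pearson_r` separately within each group is one way to check whether agreement between toolkits differs between SSD and HC. An AUC below 0.5 means the feature separates the groups in the opposite direction, so values near 0 are as discriminative as values near 1.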

Subject: INTERSPEECH.2025 - Speech Detection