loweimi25@interspeech_2025@ISCA

#1 Zero-Shot Speech-Based Depression and Anxiety Assessment with LLMs

Authors: Erfan Loweimi, Sofia de la Fuente Garcia, Saturnino Luz

The use of Large Language Models (LLMs) for psychological state assessment from speech has gained significant interest, particularly for analysing and predicting mental health. In this paper, we explore the potential of eight instruct-tuned LLMs (Llama-3.1-8B, Ministral, Gemma-2-9B, Phi-4, Mistral, DeepSeek-Qwen, QwQ-Preview and Llama-3.3-70B) in a zero-shot setting to predict Hospital Anxiety and Depression Scale (HADS) depression and anxiety scores from one-to-two-minute spontaneous speech recordings from the PsyVoiD database. We evaluate how transcript quality affects LLM responses by comparing performance using ground-truth transcriptions versus transcripts generated by Whisper models of different sizes. Spearman correlation coefficients and statistical analysis demonstrate that the LLMs have significant potential to predict psychological states in a zero-shot setting.
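The abstract names the Spearman correlation coefficient as its evaluation measure. As a rough illustration only, and not the authors' evaluation code, the sketch below computes it in plain Python as the Pearson correlation of average ranks. The function names and the example scores are hypothetical; the scores are made up and merely fall within the 0 to 21 range of a HADS subscale.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of items tied with the item at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up example: reference scores vs. scores parsed from model responses.
reference_scores = [3, 7, 12, 5, 16, 9]
predicted_scores = [4, 6, 10, 8, 15, 7]
print(round(spearman(reference_scores, predicted_scores), 3))
```

Because the measure depends only on rank order, a model that consistently over- or under-estimates scores can still correlate strongly with the reference, which is one reason a rank-based measure suits zero-shot predictions that are not calibrated to the scale.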

Subject: INTERSPEECH.2025 - Speech Detection