xiang25b@interspeech_2025@ISCA

Modeling Formant Dynamics in Mandarin /ai/: Effects of Speech Style and Speech Rate

Authors: Yunzhuo Xiang, Jingyi Sun

This study examines the relationship between speech clarity and speech rate by modeling F1/F2 variation in the Mandarin diphthong /ai/ across different durations and speech styles. The corpus includes 20 hours of conversational speech and 6 hours of read speech. Vowel durations were manually verified, and formant values were extracted using autocorrelation. Generalized additive mixed models (GAMMs) were used to examine the interaction between duration and formants. Results show that both read speech and slow speech exhibit a higher F1 onset and a higher F2 offset for /ai/. However, read speech has a larger formant frequency range, and even at the shortest (20 ms) or longest (200 ms) durations, diphthongs in read speech remain more clearly articulated than those in conversational speech of the same length. This suggests that speech clarity and speech rate may be distinct dimensions rather than interchangeable factors.
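
The measures named in the abstract (F1 onset, F2 offset, formant frequency range) can be made concrete with a minimal sketch. It assumes that onset and offset are the first and last sampled points of a formant track and that range is the maximum minus the minimum over the track. The abstract does not give these definitions, so they may differ from the authors' own. The function name `summarize_track`, the class `FormantSummary`, and all numeric values are hypothetical and invented for illustration; they are not data from the study.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class FormantSummary:
    onset_hz: float   # first sampled value of the track (assumed definition)
    offset_hz: float  # last sampled value of the track (assumed definition)
    range_hz: float   # maximum minus minimum over the track (assumed definition)


def summarize_track(track_hz: Sequence[float]) -> FormantSummary:
    """Summarize one formant track sampled at successive points in the vowel."""
    if len(track_hz) < 2:
        raise ValueError("need at least two samples to describe a trajectory")
    return FormantSummary(
        onset_hz=float(track_hz[0]),
        offset_hz=float(track_hz[-1]),
        range_hz=float(max(track_hz) - min(track_hz)),
    )


# Hypothetical /ai/ tokens: values are invented, not taken from the corpus.
read_f1 = [820.0, 780.0, 690.0, 560.0, 450.0]
read_f2 = [1350.0, 1480.0, 1700.0, 1950.0, 2100.0]
conv_f1 = [740.0, 720.0, 660.0, 590.0, 520.0]
conv_f2 = [1450.0, 1520.0, 1650.0, 1780.0, 1850.0]

read_f1_summary = summarize_track(read_f1)
read_f2_summary = summarize_track(read_f2)
conv_f1_summary = summarize_track(conv_f1)
conv_f2_summary = summarize_track(conv_f2)

print("read F1:", read_f1_summary)
print("read F2:", read_f2_summary)
print("conv F1:", conv_f1_summary)
print("conv F2:", conv_f2_summary)
```

In this invented example the read-speech token has a higher F1 onset, a higher F2 offset, and a wider range in both formants than the conversational token, which is the pattern the abstract describes. The study itself models full trajectories with GAMMs rather than comparing summary points like these.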

Subject: INTERSPEECH.2025 - Others