2025.findings-emnlp.561@ACL

#1 Evaluating Automatic Speech Recognition Systems for Korean Meteorological Experts

Authors: ChaeHun Park, Hojun Cho, Jaegul Choo

Automatic speech recognition systems often fail on specialized vocabulary in tasks such as weather forecasting. To address this, we introduce an evaluation dataset of Korean weather queries. The dataset was recorded by diverse native speakers following pronunciation guidelines from domain experts and underwent rigorous verification. Benchmarking both open-source models and a commercial API reveals high error rates on meteorological terms. We also explore a lightweight text-to-speech-based data augmentation strategy, yielding substantial error reduction for domain-specific vocabulary and notable improvement in overall recognition accuracy. Our dataset is available at https://huggingface.co/datasets/ddehun/korean-weather-asr.
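The abstract reports error rates without naming the metric here. As a minimal sketch, assuming a character error rate (CER) computed with spaces removed, which is a common choice for Korean ASR but not confirmed by the abstract, the error for a single utterance could be computed as follows. The function names `edit_distance` and `cer` and the example sentence are hypothetical illustrations, not taken from the paper or its dataset.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitution, insertion, deletion all cost 1)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate; spaces are stripped first (an assumption, not the paper's stated protocol)."""
    ref = reference.replace(" ", "")
    hyp = hypothesis.replace(" ", "")
    if not ref:
        raise ValueError("reference must contain at least one non-space character")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Hypothetical example: the term "적란운" (cumulonimbus) misrecognized as "정란운".
    print(cer("오늘 적란운 발달", "오늘 정란운 발달"))
```

In this example a single substituted syllable in a seven-syllable reference gives a CER of 1/7. Restricting the same computation to spans containing meteorological terms would give a term-level error rate, though the paper may define its domain-specific metric differently.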

Subject: EMNLP.2025 - Findings