Improving Pretrained YAMNet for Enhanced Speech Command Detection via Transfer Learning

Authors: Sidahmed Lachenani, Hamza Kheddar, Mohamed Ouldzmirli

This work addresses the need for enhanced accuracy and efficiency in speech command recognition systems, a critical component for improving user interaction in various smart applications. Leveraging the robust pretrained YAMNet model and transfer learning, this study develops a method that significantly improves speech command recognition. We adapt and train a YAMNet deep learning model to effectively detect and interpret speech commands from audio signals. Using the extensively annotated Speech Commands dataset (speech_commands_v0.01), our approach demonstrates the practical application of transfer learning to accurately recognize a predefined set of speech commands. The dataset is meticulously augmented, and features are strategically extracted to boost model performance. As a result, the final model achieved a recognition accuracy of 95.28%, underscoring the impact of advanced machine learning techniques on speech command recognition. This achievement marks substantial progress in audio processing technologies and establishes a new benchmark for future research in the field.
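The transfer-learning idea in the abstract can be sketched as follows: keep a pretrained feature extractor fixed and train only a small classification head on its embeddings. This is a minimal sketch under stated assumptions, not the authors' pipeline. Synthetic vectors stand in for YAMNet's 1024-dimensional frame embeddings so the example runs without downloading the model or the dataset. The number of commands, the mean pooling, the linear softmax head, and all function names here are illustrative assumptions.

```python
import numpy as np

EMBED_DIM = 1024    # size of YAMNet's per-frame embedding
NUM_COMMANDS = 10   # hypothetical number of target commands


def pool_frames(frame_embeddings):
    """Average per-frame embeddings (frames, EMBED_DIM) into one clip vector."""
    return frame_embeddings.mean(axis=0)


def make_synthetic_clips(rng, clips_per_class=20, frames_per_clip=5):
    """Stand-in for running audio through a frozen extractor (assumption)."""
    centers = rng.normal(size=(NUM_COMMANDS, EMBED_DIM))
    features, labels = [], []
    for label in range(NUM_COMMANDS):
        for _ in range(clips_per_class):
            noise = rng.normal(size=(frames_per_clip, EMBED_DIM))
            features.append(pool_frames(centers[label] + noise))
            labels.append(label)
    return np.array(features), np.array(labels)


def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def train_head(features, labels, num_classes, lr=0.01, epochs=100):
    """Fit a linear softmax head on fixed features with batch gradient descent."""
    n, d = features.shape
    weights = np.zeros((d, num_classes))
    bias = np.zeros(num_classes)
    one_hot = np.eye(num_classes)[labels]
    losses = []
    for _ in range(epochs):
        probs = softmax(features @ weights + bias)
        # Mean cross-entropy of the true class, recorded before the update.
        losses.append(-np.mean(np.log(probs[np.arange(n), labels] + 1e-12)))
        grad = (probs - one_hot) / n
        weights -= lr * (features.T @ grad)
        bias -= lr * grad.sum(axis=0)
    return weights, bias, losses


def predict(features, weights, bias):
    """Return the most likely command index for each clip vector."""
    return np.argmax(features @ weights + bias, axis=1)
```

In a real setup, the synthetic generator would be replaced by embeddings computed from 16 kHz mono clips with the pretrained model, and only the head's parameters would be updated, which is what keeps training cheap relative to learning the whole network from scratch.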

Subjects: Sound, Artificial Intelligence, Audio and Speech Processing

Published: 2025-04-26 21:57:11 UTC

arXiv: 2504.19030
