chen24m@interspeech_2024@ISCA

#1 Parameter-Efficient Adapter Based on Pre-trained Models for Speech Translation

Authors: Nan Chen; Yonghe Wang; Feilong Bao

The multi-task learning (MTL) approach leverages pre-trained speech and machine translation models and has significantly advanced speech-to-text translation. However, it introduces a considerable number of parameters, leading to increased training costs. Most parameter-efficient fine-tuning (PEFT) methods train only additional modules, effectively reducing the number of trainable parameters. Nevertheless, the increase in trainable parameters caused by PEFT methods remains non-negligible in multilingual speech translation settings. In this paper, we first propose a parameter-sharing adapter, which reduces the adapter parameters by 7/8 compared with regular adapters at a cost of only about 0.7% in performance. To balance model parameter count and performance, we then present a neural architecture search (NAS) based model. Experimental results show that the adapter comes closest to full fine-tuning in performance, while LoRA performs the worst.
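To illustrate why sharing an adapter cuts the parameter budget, below is a minimal PyTorch sketch, assuming a single bottleneck adapter reused across eight Transformer layers. The abstract does not specify the actual sharing scheme or dimensions, so the class name BottleneckAdapter and the sizes d_model = 768 and bottleneck = 64 are hypothetical choices for illustration only.

# Hypothetical sketch: per-layer adapters vs. one adapter shared across layers.
# Assumption (not stated in the abstract): the same adapter instance is reused
# at every one of 8 Transformer layers, so the adapter parameter count drops
# to 1/8 of the regular per-layer configuration.
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Standard adapter: down-projection, non-linearity, up-projection, residual add."""
    def __init__(self, d_model: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))

num_layers = 8
d_model = 768

# Regular setup: one adapter per layer.
regular = nn.ModuleList([BottleneckAdapter(d_model) for _ in range(num_layers)])

# Parameter-sharing setup (assumed): a single adapter applied at every layer.
shared = BottleneckAdapter(d_model)

def count_params(m: nn.Module) -> int:
    return sum(p.numel() for p in m.parameters())

print(count_params(regular))  # 8x the single-adapter size
print(count_params(shared))   # 1/8 of the regular setup

With these sizes the per-layer setup holds eight copies of the adapter weights, so the shared variant removes 7/8 of them, mirroring the reduction reported in the abstract.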