2024.iwslt-1.8@ACL

HW-TSC’s Speech to Text Translation System for IWSLT 2024 in Indic track

Authors: Bin Wei ; Zongyao Li ; Jiaxin Guo ; Daimeng Wei ; Zhanglin Wu ; Xiaoyu Chen ; Zhiqiang Rao ; Shaojun Li ; Yuanchang Luo ; Hengchao Shang ; Hao Yang ; Yanfei Jiang

This paper describes HW-TSC's submission to the IWSLT 2024 Indic track speech-to-text translation task and reports its results. We built a cascade system in which an ASR model transcribes the source speech and a machine translation (MT) model translates the transcript into the target language. For ASR, we use Whisper large-v3 directly. Our main effort goes into optimizing the MT models for English to Tamil, Hindi, and Bengali (en2ta, en2hi, en2bn). We first train a baseline model on bilingual corpora. We then use monolingual data to construct pseudo-parallel data that further strengthens the baseline. Next, we filter the parallel corpus with LaBSE and fine-tune the model again, which further improves the BLEU score. Finally, we select in-domain data from the bilingual corpus and fine-tune the resulting model on it to achieve our best results.
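The abstract does not spell out how the LaBSE filtering step works. A common reading is that each sentence pair is scored by the cosine similarity of its source and target embeddings, and pairs below a cutoff are dropped. The sketch below illustrates that idea only. The names `cosine`, `filter_parallel`, and `toy_embed`, the toy vectors, and the threshold of 0.7 are all assumptions for illustration, not details taken from the paper. In a real pipeline, `embed` would be backed by a LaBSE encoder rather than a lookup table.

```python
import math
from typing import Callable, Iterable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def filter_parallel(
    pairs: Iterable[Tuple[str, str]],
    embed: Callable[[str], Vector],
    threshold: float = 0.7,  # assumed cutoff; the paper does not state one
) -> List[Tuple[str, str]]:
    """Keep (source, target) pairs whose embeddings are similar enough."""
    kept = []
    for src, tgt in pairs:
        if cosine(embed(src), embed(tgt)) >= threshold:
            kept.append((src, tgt))
    return kept


# Hypothetical stand-in for a multilingual sentence encoder such as LaBSE:
# a fixed lookup table of 2-dimensional vectors.
toy_vectors = {
    "hello": (1.0, 0.0),
    "नमस्ते": (0.9, 0.1),   # close to "hello": a plausible translation
    "goodbye": (0.0, 1.0),  # orthogonal to "hello": a misaligned pair
}


def toy_embed(sentence: str) -> Vector:
    return toy_vectors[sentence]


pairs = [("hello", "नमस्ते"), ("hello", "goodbye")]
clean = filter_parallel(pairs, toy_embed, threshold=0.7)
print(clean)
```

With the toy vectors, only the first pair survives the filter; the misaligned second pair is discarded. The surviving pairs would then serve as the fine-tuning set for the translation model.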