2023.iwslt-1.15@ACL

#1 Submission of USTC’s System for the IWSLT 2023 - Offline Speech Translation Track

Authors: Xinyuan Zhou ; Jianwei Cui ; Zhongyi Ye ; Yichi Wang ; Luzhen Xu ; Hanyi Zhang ; Weitai Zhang ; Lirong Dai

This paper describes the submissions of the research group USTC-NELSLIP to the IWSLT 2023 Offline Speech Translation competition, which involves translating spoken English into written Chinese. We use both cascaded and end-to-end models for this task. To improve the cascaded models, we introduce Whisper to reduce errors in the intermediate source-language text, achieving a significant improvement in ASR performance. For the end-to-end models, we propose the Stacked Acoustic-and-Textual Encoding extension (SATE-ex), which feeds the output of the acoustic decoder into the textual decoder for information fusion and to prevent error propagation. Additionally, we improve the speech translation performance of the end-to-end system by ensembling the SATE-ex model with the encoder-decoder model.
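
The abstract does not say how the SATE-ex and encoder-decoder models are ensembled. One common way to ensemble two translation models at decoding time is to average their next-token probabilities at each step. The following is a minimal sketch of that idea only, not the authors' method: the function names `ensemble_log_probs` and `pick_next_token`, the dictionary-based vocabulary, and the example probabilities are all hypothetical.

```python
import math


def ensemble_log_probs(per_model_log_probs, weights=None):
    """Combine next-token log-probabilities from several models.

    per_model_log_probs: list of dicts (token -> log-probability), one per
    model, all defined over the same vocabulary (an assumption of this sketch).
    weights: optional positive weights summing to 1; defaults to uniform.
    """
    n = len(per_model_log_probs)
    if weights is None:
        weights = [1.0 / n] * n
    combined = {}
    for tok in per_model_log_probs[0]:
        # Log of the weighted arithmetic mean of probabilities,
        # computed with log-sum-exp for numerical stability.
        terms = [math.log(w) + lp[tok]
                 for w, lp in zip(weights, per_model_log_probs)]
        m = max(terms)
        combined[tok] = m + math.log(sum(math.exp(t - m) for t in terms))
    return combined


def pick_next_token(per_model_log_probs, weights=None):
    """Greedy choice of the next token under the ensembled distribution."""
    combined = ensemble_log_probs(per_model_log_probs, weights)
    return max(combined, key=combined.get)


# Hypothetical next-token distributions from two models.
model_a = {"你": math.log(0.6), "好": math.log(0.4)}
model_b = {"你": math.log(0.1), "好": math.log(0.9)}
next_token = pick_next_token([model_a, model_b])
```

In a real system the same averaging would be applied inside beam search at every decoding step, and it requires the models to share a target vocabulary.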