2024.iwslt-1.11@ACL

#1 SimulSeamless: FBK at IWSLT 2024 Simultaneous Speech Translation

Authors: Sara Papi ; Marco Gaido ; Matteo Negri ; Luisa Bentivogli

This paper describes FBK’s participation in the Simultaneous Translation Evaluation Campaign at IWSLT 2024. For this year’s submission in the speech-to-text translation (ST) sub-track, we propose SimulSeamless, which is realized by combining AlignAtt and SeamlessM4T in its medium configuration. The SeamlessM4T model is used ‘off-the-shelf’, and its simultaneous inference is enabled through the adoption of AlignAtt, a SimulST policy based on cross-attention that can be applied without any retraining or adaptation of the underlying model for the simultaneous task. We participated in all the Shared Task language directions (English to German, Japanese, and Chinese, and Czech to English), achieving acceptable or even better results compared to last year’s submissions. SimulSeamless, covering more than 143 source languages and 200 target languages, is released at: https://github.com/hlt-mt/FBK-fairseq/.
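The abstract does not spell out the AlignAtt decision rule. The sketch below illustrates one common reading of it, taken from the original AlignAtt description rather than from this abstract: each candidate token is aligned to the audio frame receiving its highest cross-attention weight, and emission stops as soon as a token aligns to one of the last f frames, since that audio is considered too recent to be reliable. The function name `alignatt_emit_count`, the parameter `num_last_frames`, and the toy attention matrix are hypothetical and are not part of the released code.

```python
from typing import Sequence


def alignatt_emit_count(
    cross_attention: Sequence[Sequence[float]],
    num_last_frames: int,
) -> int:
    """Return how many candidate tokens may be emitted (illustrative only).

    cross_attention[i][j] is the attention weight of candidate token i
    on audio frame j. Assumption: a token is held back if its
    most-attended frame falls within the last `num_last_frames` frames.
    """
    if not cross_attention:
        return 0
    num_frames = len(cross_attention[0])
    # Frames at or beyond this index count as "too recent".
    threshold = num_frames - num_last_frames
    emitted = 0
    for weights in cross_attention:
        # Index of the frame this token attends to most.
        aligned_frame = max(range(num_frames), key=lambda j: weights[j])
        if aligned_frame >= threshold:
            break  # wait for more audio before emitting this token
        emitted += 1
    return emitted


# Toy example: 4 candidate tokens over 4 audio frames.
toy_attention = [
    [0.7, 0.2, 0.1, 0.0],
    [0.1, 0.6, 0.2, 0.1],
    [0.0, 0.1, 0.2, 0.7],
    [0.9, 0.1, 0.0, 0.0],
]
emit_f1 = alignatt_emit_count(toy_attention, num_last_frames=1)
emit_f3 = alignatt_emit_count(toy_attention, num_last_frames=3)
```

Under this reading, a larger `num_last_frames` makes the policy more conservative (fewer tokens emitted per step, higher latency), while a smaller value emits more eagerly. Because the rule only inspects attention weights produced at inference time, it is consistent with the abstract's claim that no retraining of the underlying model is needed.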