2021.iwslt-1.5@ACL

#1 Without Further Ado: Direct and Simultaneous Speech Translation by AppTek in 2021

Authors: Parnia Bahar ; Patrick Wilken ; Mattia A. Di Gangi ; Evgeny Matusov

This paper describes the offline and simultaneous speech translation systems developed at AppTek for IWSLT 2021. Our offline ST submission includes the direct end-to-end system and the so-called posterior tight integrated model, which is akin to the cascade system but is trained in an end-to-end fashion, where all the cascaded modules are end-to-end models themselves. For simultaneous ST, we combine hybrid automatic speech recognition with a machine translation approach whose translation policy decisions are learned from statistical word alignments. Compared to last year, we improve general quality and provide a wider range of quality/latency trade-offs, both due to a data augmentation method making the MT model robust to varying chunk sizes. Finally, we present a method for ASR output segmentation into sentences that introduces a minimal additional delay.
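As a rough illustration of how translation policy decisions could be derived from word alignments, the sketch below finds the points in a sentence pair where a source prefix and a target prefix are aligned only to each other. Such points are candidate places to stop reading and start writing. This is not the authors' implementation: the function name `chunk_boundaries`, the link format, and the example sentence pair are assumptions made for illustration only.

```python
def chunk_boundaries(src_len, tgt_len, links):
    """Return (i, j) pairs such that the first i source words and the
    first j target words are aligned only among themselves.

    links: iterable of (source_index, target_index) pairs, 0-based.
    Hypothetical helper for illustration, not taken from the paper.
    """
    links = list(links)
    boundaries = []
    prev_j = 0
    for i in range(1, src_len + 1):
        # Target positions linked to the source prefix and to the remainder.
        tgt_in = [t for s, t in links if s < i]
        tgt_out = [t for s, t in links if s >= i]
        if i == src_len:
            # The full source always closes the full target,
            # including trailing unaligned target words.
            j = tgt_len
        else:
            j = max(tgt_in) + 1 if tgt_in else 0
        # The prefix is closed if no target word before j depends on
        # source words that have not been read yet.
        closed = all(t >= j for t in tgt_out)
        # Skip boundaries that would emit no new target words.
        if closed and j > prev_j:
            boundaries.append((i, j))
            prev_j = j
    return boundaries


# Assumed example: "ich habe das Buch gelesen" -> "I have read the book"
example_links = [(0, 0), (1, 1), (2, 3), (3, 4), (4, 2)]
example_boundaries = chunk_boundaries(5, 5, example_links)
```

In this example the verb-final German clause forces the policy to wait: after the first two words, no further target words can be emitted until the whole source has been read. A model trained on chunks of this kind would see chunk sizes vary widely, which is the situation the robustness to varying chunk sizes mentioned in the abstract is meant to address.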