2018.iwslt-1.14@ACL

Prompsit’s Submission to the IWSLT 2018 Low Resource Machine Translation Task

Author: Víctor M. Sánchez-Cartagena

This paper presents Prompsit Language Engineering’s submission to the IWSLT 2018 Low Resource Machine Translation task. Our submission is based on cross-lingual learning: a multilingual neural machine translation system was created with the sole purpose of improving translation quality on the Basque-to-English language pair. The multilingual system was trained on a combination of in-domain data, pseudo in-domain data obtained via cross-entropy data selection, and backtranslated data. We morphologically segmented Basque text with a novel approach that requires only a dictionary such as those used by spell checkers. We showed that this segmentation approach outperforms the widespread byte pair encoding strategy for this task.
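
To illustrate what cross-entropy data selection can look like, the following is a minimal sketch that ranks sentences from a general pool by cross-entropy difference (in-domain cross-entropy minus general-domain cross-entropy) and keeps the lowest-scoring ones as pseudo in-domain data. It is an assumption-laden illustration, not the submission's actual pipeline: the abstract does not specify the scoring variant or the language models used, so the add-one-smoothed unigram models and the names `train_unigram`, `cross_entropy`, and `select_pseudo_in_domain` are hypothetical stand-ins.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Build an add-one-smoothed unigram model; returns a token -> log-prob function.

    Assumption: a unigram model stands in for whatever language model
    a real system would use.
    """
    counts = Counter(tok for s in sentences for tok in s.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserves mass for unseen tokens

    def logprob(tok):
        return math.log((counts[tok] + 1) / (total + vocab))

    return logprob


def cross_entropy(logprob, tokens):
    """Per-token cross-entropy (in nats) of a non-empty token list under a model."""
    return -sum(logprob(t) for t in tokens) / len(tokens)


def select_pseudo_in_domain(in_domain, general, pool, keep):
    """Return the `keep` pool sentences that look most in-domain.

    Score = H_in(s) - H_general(s); lower means the sentence is better
    explained by the in-domain model than by the general one.
    """
    lp_in = train_unigram(in_domain)
    lp_gen = train_unigram(general)

    def score(sentence):
        tokens = sentence.split()
        if not tokens:
            return float("inf")  # empty lines are ranked last
        return cross_entropy(lp_in, tokens) - cross_entropy(lp_gen, tokens)

    return sorted(pool, key=score)[:keep]


if __name__ == "__main__":
    # Toy corpora, invented purely for illustration.
    in_domain = ["the talk was about science", "this talk is about ideas"]
    general = ["the parliament adopted the resolution",
               "the council voted on the budget"]
    pool = ["a talk about science and ideas",
            "the council adopted the budget"]
    print(select_pseudo_in_domain(in_domain, general, pool, keep=1))
```

In this sketch the threshold is a fixed count (`keep`); a real setup might instead tune the cut-off on held-out in-domain data.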