2008.iwslt-evaluation.2@ACL

#1 The CMU syntax-augmented machine translation system: SAMT on Hadoop with n-best alignments.

Authors: Andreas Zollmann ; Ashish Venugopal ; Stephan Vogel

We present the CMU Syntax Augmented Machine Translation System that was used in the IWSLT-08 evaluation campaign. We participated in the Full-BTEC data track for Chinese-English translation, focusing on transcript translation. For this year’s evaluation, we ported the Syntax Augmented MT toolkit [1] to the Hadoop MapReduce [2] parallel processing architecture, allowing us to efficiently run experiments evaluating a novel “wider pipelines” approach that integrates evidence from N-best alignments into our translation models. We describe each step of the MapReduce pipeline as it is implemented in the open-source SAMT toolkit, and show improvements in translation quality from using N-best alignments in both hierarchical and syntax-augmented translation systems.