2408.11926

Defining Boundaries: The Impact of Domain Specification on Cross-Language and Cross-Domain Transfer in Machine Translation

Authors: Lia Shahnazaryan, Meriem Beloucif

Recent advancements in neural machine translation (NMT) have revolutionized the field, yet the dependency on extensive parallel corpora limits progress for low-resource languages and domains. Cross-lingual transfer learning offers a promising solution by utilizing data from high-resource languages but often struggles with in-domain NMT. This paper investigates zero-shot cross-lingual domain adaptation for NMT, focusing on the impact of domain specification and linguistic factors on transfer effectiveness. Using English as the source language and Spanish for fine-tuning, we evaluate multiple target languages, including Portuguese, Italian, French, Czech, Polish, and Greek. We demonstrate that both language-specific and domain-specific factors influence transfer effectiveness, with domain characteristics playing a crucial role in determining cross-domain transfer potential. We also explore the feasibility of zero-shot cross-lingual cross-domain transfer, providing insights into which domains are more responsive to transfer and why. Our results show the importance of well-defined domain boundaries and transparency in experimental setups for in-domain transfer learning.

Subject: Computation and Language

Published: 2024-08-21 18:28:48 UTC