Total: 1
The Middle East is characterized by remarkable linguistic diversity, with over 400 million inhabitants speaking more than 60 languages across multiple language families. This study presents a pioneering work in developing the first parallel corpora for eight severely under-resourced varieties in the region–PARME, addressing fundamental challenges in low-resource scenarios including non-standardized writing and dialectal complexity. Through an extensive community-driven initiative, volunteers contributed to the creation of over 36,000 translated sentences, marking a significant milestone in resource development. We evaluate machine translation capabilities through zero-shot approaches and fine-tuning experiments with pretrained machine translation models and provide a comprehensive analysis of limitations. Our findings reveal significant gaps in existing technologies for processing the selected languages, highlighting critical areas for improvement in language technology for Middle Eastern languages.