2024.naacl-srw.24@ACL


#1 Multi-Source Text Classification for Multilingual Sentence Encoder with Machine Translation

Authors: Reon Kajikawa ; Keiichiro Yamada ; Tomoyuki Kajiwara ; Takashi Ninomiya

Pre-trained multilingual sentence encoders are promising for developers of natural language processing applications because they reduce the cost of training a separate model for each language. However, because the training corpora for these encoders contain only a small amount of text in languages other than English, their performance degrades on non-English languages. To improve the performance of pre-trained multilingual sentence encoders on non-English languages, we propose machine translating the source sentence into English and then inputting the translation together with the source sentence in a multi-source manner. Experimental results on sentiment analysis and topic classification tasks in Japanese demonstrate the effectiveness of the proposed method.
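
The multi-source idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the two inputs are combined, so this example assumes simple concatenation of the two sentence vectors; the paper's actual architecture may differ. `toy_encode` and `toy_translate` are hypothetical stand-ins for a pre-trained multilingual sentence encoder and a machine translation system, not real library calls.

```python
import hashlib
from typing import Callable, List

Vector = List[float]


def toy_encode(sentence: str, dim: int = 8) -> Vector:
    """Hypothetical stand-in for a multilingual sentence encoder.

    Hashes the sentence into a deterministic fixed-size vector so the
    sketch runs without any model; a real system would call an encoder.
    """
    digest = hashlib.sha256(sentence.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dim]]


def toy_translate(sentence: str) -> str:
    """Hypothetical stand-in for a Japanese-to-English MT system."""
    lookup = {"この映画は最高だ": "This movie is great"}
    # Unknown sentences are returned unchanged in this toy version.
    return lookup.get(sentence, sentence)


def multi_source_features(
    sentence: str,
    translate: Callable[[str], str],
    encode: Callable[[str], Vector],
) -> Vector:
    """Encode the source sentence and its English translation, then
    concatenate the two vectors (an assumed combination strategy)."""
    english = translate(sentence)
    return encode(sentence) + encode(english)


source = "この映画は最高だ"
features = multi_source_features(source, toy_translate, toy_encode)
```

The resulting `features` vector would then be passed to a classifier for a task such as sentiment analysis or topic classification. Concatenation is only one option; feeding both sentences to the encoder as a single paired input is another plausible reading of "multi-source".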