Total: 1
Despite the strong research interest in document-level Machine Translation (MT), the test-sets dedicated to this task are still scarce. The existing test-sets mainly cover topics from the general domain and fall short on specialised domains, such as legal and financial. Also, despite their document-level aspect, they still follow a sentence-level logic that doesn’t allow for including certain linguistic phenomena such as information reorganisation. In this work, we aim to fill this gap by proposing a novel test-set : DOLFIN. The dataset is built from specialised financial documents and it makes a step towards true document-level MT by abandoning the paradigm of perfectly aligned sentences, presenting data in units of sections rather than sentences. The test-set consists of an average of 1950 aligned sections for five language pairs. We present the detailed data collection pipeline that can serve as inspiration for aligning new document-level datasets. We demonstrate the usefulness and the quality of this test-set with the evaluation of a series of models. Our results show that the test-set is able to discriminate between context-sensitive and context-agnostic models and shows the weaknesses when models fail to accurately translate financial texts. The test-set will be made public for the community.