Translating from Morphologically Complex Languages: A Paraphrase-Based Approach
Abstract
We propose a novel approach to translating from a morphologically complex language. Unlike previous research, which has targeted word inflections and concatenations, we focus on the pairwise relationship between morphologically related words, which we treat as potential paraphrases and handle using paraphrasing techniques at the word, phrase, and sentence level. An important advantage of this framework is that it can cope with derivational morphology, which has so far remained largely beyond the capabilities of statistical machine translation systems. Our experiments translating from Malay, whose morphology is mostly derivational, into English show significant improvements over rival approaches according to five automatic evaluation measures (for 320,000 sentence pairs; 9.5 million English word tokens).
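To make the central idea concrete, the following is a minimal sketch of how morphologically related source words might be paired up as paraphrase candidates. It is an illustration only and not the authors' method: the affix lists, the `MIN_STEM` threshold, and the functions `crude_stem` and `paraphrase_candidates` are all hypothetical, and nasal-assimilating prefixes such as meN- are deliberately left out because they would need real morphological analysis.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical, deliberately tiny affix lists (assumptions, not from the paper).
PREFIXES = ("ber", "ter", "di")
SUFFIXES = ("kan", "nya", "an", "i")
MIN_STEM = 4  # assumed minimum stem length, to avoid over-stripping short words


def crude_stem(word):
    """Strip at most one prefix and one suffix, keeping at least MIN_STEM letters."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM:
            word = word[:-len(suffix)]
            break
    return word


def paraphrase_candidates(vocab):
    """Group words by crude stem and return every within-group pair, sorted."""
    groups = defaultdict(set)
    for word in vocab:
        groups[crude_stem(word)].add(word)
    pairs = set()
    for words in groups.values():
        pairs.update(combinations(sorted(words), 2))
    return sorted(pairs)


if __name__ == "__main__":
    vocab = ["makan", "makanan", "dimakan", "minum", "minuman", "buku"]
    for pair in paraphrase_candidates(vocab):
        print(pair)
```

In this sketch, derivationally related forms such as makan ("eat") and makanan ("food") end up in the same group, so each pair could then be scored and filtered like any other paraphrase candidate. A real system would need a far more careful notion of relatedness than this affix stripping.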
- Publication:
- arXiv e-prints
- Pub Date:
- September 2021
- DOI:
- 10.48550/arXiv.2109.13724
- arXiv:
- arXiv:2109.13724
- Bibcode:
- 2021arXiv210913724N
- Keywords:
- Computer Science - Computation and Language;
- Computer Science - Artificial Intelligence;
- Computer Science - Machine Learning;
- 68T50;
- F.2.2;
- I.2.7
- E-Print:
- machine translation, morphologically complex languages, paraphrases (word, phrase, and sentence level), inflectional morphology, derivational morphology, Malay, Indonesian