Modeling Homophone Noise for Robust Neural Machine Translation
Abstract
In this paper, we propose a robust neural machine translation (NMT) framework. The framework consists of a homophone noise detector and a syllable-aware NMT model to homophone errors. The detector identifies potential homophone errors in a textual sentence and converts them into syllables to form a mixed sequence that is then fed into the syllable-aware NMT. Extensive experiments on Chinese->English translation demonstrate that our proposed method not only significantly outperforms baselines on noisy test sets with homophone noise, but also achieves a substantial improvement on clean text.
- Publication:
-
arXiv e-prints
- Pub Date:
- December 2020
- DOI:
- 10.48550/arXiv.2012.08396
- arXiv:
- arXiv:2012.08396
- Bibcode:
- 2020arXiv201208396Q
- Keywords:
-
- Computer Science - Computation and Language