Refining the state-of-the-art in Machine Translation, optimizing NMT for the JA <-> EN language pair by leveraging personal domain expertise
Abstract
This paper documents the construction of an NMT (Neural Machine Translation) system for En/Ja based on the Transformer architecture, built with the OpenNMT framework. Corpus pre-processing, hyperparameter tuning, and model architecture are explored systematically to obtain the best performance. The system is evaluated using standard automatic evaluation metrics such as BLEU, together with my subjective opinion as a Japanese linguist.
- Publication: arXiv e-prints
- Pub Date: February 2022
- DOI: 10.48550/arXiv.2202.11669
- arXiv: arXiv:2202.11669
- Bibcode: 2022arXiv220211669B
- Keywords: Computer Science - Computation and Language
- E-Print: 11 pages, 13 figures