Phrase Based Language Model for Statistical Machine Translation: Empirical Study
Abstract
Reordering is a challenge for machine translation (MT) systems. In MT, the widely used approach is to apply a word-based language model (LM), which treats words as the constituent units of a sentence. In speech recognition (SR), some phrase-based LMs have been proposed. However, those LMs are not necessarily suitable or optimal for reordering. We propose two phrase-based LMs that treat phrases as the constituent units of a sentence. Experiments show that our phrase-based LMs outperform the word-based LM with respect to perplexity and n-best list re-ranking.
- Publication:
- arXiv e-prints
- Pub Date:
- January 2015
- DOI:
- 10.48550/arXiv.1501.05203
- arXiv:
- arXiv:1501.05203
- Bibcode:
- 2015arXiv150105203C
- Keywords:
- Computer Science - Computation and Language
- E-Print:
- Supplementary material of http://arxiv.org/abs/1501.04324. This version is identical to the Bachelor thesis of Geliang Chen, archived on 20 June 2013 at Peking University. Thesis advisor: Professor Jia Xu.