Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Recurrent Neural Network
Abstract
Bidirectional Long Short-Term Memory Recurrent Neural Networks (BLSTM-RNNs) have been shown to be very effective for tagging sequential data, e.g. speech utterances or handwritten documents, while word embeddings have been demonstrated to be a powerful representation for characterizing the statistical properties of natural language. In this study, we propose to use a BLSTM-RNN with word embeddings for the part-of-speech (POS) tagging task. When tested on the Penn Treebank WSJ test set, a state-of-the-art tagging accuracy of 97.40% is achieved. Even without morphological features, this approach achieves performance comparable to that of the Stanford POS tagger.
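To make the data flow of such a tagger concrete, the following is a minimal pure-Python sketch of a bidirectional LSTM over word embeddings with a per-token softmax over tags. It is not the authors' implementation: the class names (`LSTMCell`, `BLSTMTagger`), the layer sizes, the unknown-word handling, and the random untrained weights are all assumptions chosen for illustration, so the predicted tags are meaningless until the parameters are trained.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply matrix W (list of rows) by vector v.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class LSTMCell:
    """Standard LSTM cell (hypothetical helper, untrained random weights)."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One weight matrix per gate over [x; h_prev]:
        # i = input gate, f = forget gate, o = output gate, c = candidate.
        self.W = {g: rand_matrix(hidden_size, input_size + hidden_size, rng)
                  for g in "ifoc"}
        self.b = {g: [0.0] * hidden_size for g in "ifoc"}

    def step(self, x, h, c):
        z = x + h  # list concatenation of input and previous hidden state
        pre = {g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])]
               for g in "ifoc"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        cand = [math.tanh(v) for v in pre["c"]]
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, cand)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new

    def run(self, xs):
        # Return the hidden state at every time step of the sequence xs.
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        outputs = []
        for x in xs:
            h, c = self.step(x, h, c)
            outputs.append(h)
        return outputs


class BLSTMTagger:
    """Embedding lookup -> forward and backward LSTM -> softmax over tags."""

    def __init__(self, vocab, tags, embed_size=8, hidden_size=6, seed=0):
        rng = random.Random(seed)
        self.vocab = {w: k for k, w in enumerate(vocab)}
        self.tags = list(tags)
        # The extra last row is an assumed embedding for unknown words.
        self.embed = rand_matrix(len(self.vocab) + 1, embed_size, rng)
        self.fwd = LSTMCell(embed_size, hidden_size, rng)
        self.bwd = LSTMCell(embed_size, hidden_size, rng)
        self.W_out = rand_matrix(len(self.tags), 2 * hidden_size, rng)

    def tag_scores(self, words):
        unk = len(self.vocab)
        xs = [self.embed[self.vocab.get(w, unk)] for w in words]
        h_fwd = self.fwd.run(xs)
        # Run the second LSTM right to left, then restore the original order.
        h_bwd = self.bwd.run(xs[::-1])[::-1]
        scores = []
        for hf, hb in zip(h_fwd, h_bwd):
            logits = matvec(self.W_out, hf + hb)
            m = max(logits)
            exps = [math.exp(v - m) for v in logits]
            total = sum(exps)
            scores.append([e / total for e in exps])
        return scores

    def tag(self, words):
        # Pick the highest-probability tag independently for each token.
        return [self.tags[max(range(len(p)), key=p.__getitem__)]
                for p in self.tag_scores(words)]


tagger = BLSTMTagger(vocab=["the", "dog", "barks"], tags=["DT", "NN", "VBZ"])
probs = tagger.tag_scores(["the", "dog", "barks"])
```

Each token's tag distribution depends on both the left context (forward pass) and the right context (backward pass), which is the property that makes the bidirectional architecture suitable for sequence tagging. A real system would train these weights, for example by minimizing cross-entropy against gold tags, and would typically rely on a tensor library rather than nested lists.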
- Publication: arXiv e-prints
- Pub Date: October 2015
- DOI: 10.48550/arXiv.1510.06168
- arXiv: arXiv:1510.06168
- Bibcode: 2015arXiv151006168W
- Keywords: Computer Science - Computation and Language
- E-Print: rejected by ACL 2015 short, score: 4,3,2 (full is 5)