Chinese NER Using Lattice LSTM

doi:10.48550/arXiv.1805.02023

We investigate a lattice-structured LSTM model for Chinese NER, which encodes a sequence of input characters as well as all potential words that match a lexicon. Compared with character-based methods, our model explicitly leverages word and word sequence information. Compared with word-based methods, lattice LSTM does not suffer from segmentation errors. Gated recurrent cells allow our model to choose the most relevant characters and words from a sentence for better NER results. Experiments on various datasets show that lattice LSTM outperforms both word-based and character-based LSTM baselines, achieving the best results.
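The lexicon-matching step mentioned in the abstract (finding "all potential words that match a lexicon" in a character sequence) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `lattice_word_spans`, the `max_len` cap, and the toy lexicon are assumptions made for illustration, and the sketch only enumerates candidate word spans; it does not model the LSTM cells or gating.

```python
def lattice_word_spans(chars, lexicon, max_len=4):
    """Return (start, end, word) for each lexicon word found in chars.

    `end` is exclusive. `max_len` is a hypothetical cap on word length,
    used here only to bound the search.
    """
    spans = []
    n = len(chars)
    for start in range(n):
        # Single characters are already covered by the character sequence,
        # so only multi-character candidates are checked.
        for end in range(start + 2, min(n, start + max_len) + 1):
            word = "".join(chars[start:end])
            if word in lexicon:
                spans.append((start, end, word))
    return spans


# Toy input and lexicon, assumed for illustration only.
sentence = list("南京市长江大桥")
toy_lexicon = {"南京", "南京市", "市长", "长江", "大桥", "长江大桥"}
spans = lattice_word_spans(sentence, toy_lexicon)

for start, end, word in spans:
    print(start, end, word)
```

Each span would correspond to one extra path in the lattice, connecting the character at `start` to the character at `end - 1`. Overlapping candidates such as 南京市 and 市长 are all kept, since the abstract describes the model, not a segmenter, as choosing the most relevant words.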

Publication: arXiv e-prints

Pub Date: May 2018

DOI: 10.48550/arXiv.1805.02023

arXiv: arXiv:1805.02023

Bibcode: 2018arXiv180502023Z

Keywords: Computer Science - Computation and Language

E-Print: Accepted at ACL 2018 as Long paper
