Learning Part-of-Speech Guessing Rules from Lexicon: Extension to Non-Concatenative Operations

doi:10.48550/arXiv.cmp-lg/9604025

Mikheev, Andrei

One of the problems in part-of-speech tagging of real-world texts is that of words unknown to the lexicon. In Mikheev (ACL-96, cmp-lg/9604022), a technique was proposed for fully unsupervised statistical acquisition of rules which guess possible parts of speech for unknown words. One of the over-simplifications assumed by this learning technique was that it acquired only morphological rules which obey simple concatenative regularities of the main word with an affix. In this paper we extend this technique to the non-concatenative cases of suffixation and assess the gain in performance.

Publication:

arXiv e-prints

Pub Date:

April 1996

DOI:

10.48550/arXiv.cmp-lg/9604025

arXiv:

arXiv:cmp-lg/9604025

Bibcode:

1996cmp.lg....4025M

Keywords:

Computer Science - Computation and Language

E-Print:

6 pages, LaTeX (colap.sty for COLING-96)
