Similarity-Based Methods For Word Sense Disambiguation

doi:10.48550/arXiv.cmp-lg/9708010

We compare four similarity-based estimation methods against back-off and maximum-likelihood estimation methods on a pseudo-word sense disambiguation task in which we controlled for both unigram and bigram frequency. The similarity-based methods perform up to 40% better on this particular task. We also conclude that events that occur only once in the training set have a major impact on similarity-based estimates.

Publication:

arXiv e-prints

Pub Date:

August 1997

DOI:

10.48550/arXiv.cmp-lg/9708010

arXiv:

arXiv:cmp-lg/9708010

Bibcode:

1997cmp.lg....8010D

Keywords:

Computer Science - Computation and Language

E-Print:

7 pages, uses psfig.tex and aclap.sty
