Integrating Multiple Knowledge Sources to Disambiguate Word Sense: An Exemplar-Based Approach
Abstract
In this paper, we present a new approach for word sense disambiguation (WSD) using an exemplar-based learning algorithm. This approach integrates a diverse set of knowledge sources to disambiguate word sense, including the part of speech of neighboring words, morphological form, the unordered set of surrounding words, local collocations, and verb-object syntactic relations. We tested our WSD program, named LEXAS, on both a common data set used in previous work and a large sense-tagged corpus that we separately constructed. LEXAS achieves higher accuracy than previous work on the common data set, and performs better than the most frequent sense heuristic on the highly ambiguous words in the large corpus tagged with the refined senses of WordNet.
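To make the exemplar-based idea concrete, the following is a minimal sketch, not the LEXAS implementation. It assumes each training occurrence of an ambiguous word is stored as a dictionary of symbolic features loosely mirroring the knowledge sources listed above, and that a test occurrence takes the sense of its nearest stored exemplar. The feature names (`pos_prev`, `pos_next`, `morph`, `prev_word`, `verb`), the sense labels, the toy exemplars, and the simple overlap distance are all illustrative assumptions; the abstract does not specify the distance metric.

```python
from collections import Counter


def overlap_distance(a, b):
    """Count the features on which two exemplars disagree (assumed metric)."""
    keys = set(a) | set(b)
    return sum(1 for key in keys if a.get(key) != b.get(key))


def disambiguate(test_features, exemplars, k=1):
    """Return the majority sense among the k nearest training exemplars."""
    ranked = sorted(
        exemplars,
        key=lambda exemplar: overlap_distance(test_features, exemplar[0]),
    )
    votes = Counter(sense for _, sense in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical sense-tagged exemplars for the noun "interest".
TRAIN = [
    ({"pos_prev": "VB", "pos_next": "IN", "morph": "sg",
      "prev_word": "pay", "verb": "pay"}, "finance"),
    ({"pos_prev": "JJ", "pos_next": "IN", "morph": "sg",
      "prev_word": "great", "verb": "show"}, "curiosity"),
    ({"pos_prev": "VB", "pos_next": "IN", "morph": "sg",
      "prev_word": "earn", "verb": "earn"}, "finance"),
]

# A new occurrence, e.g. "... pay interests on ..." (hypothetical).
query = {"pos_prev": "VB", "pos_next": "IN", "morph": "pl",
         "prev_word": "pay", "verb": "pay"}
predicted = disambiguate(query, TRAIN)
```

Because every knowledge source is just another key in the feature dictionary, adding a source only changes how exemplars are built, not how they are compared. That is the sense in which an exemplar-based learner can integrate heterogeneous evidence.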
- Publication: arXiv e-prints
- Pub Date: June 1996
- DOI: 10.48550/arXiv.cmp-lg/9606032
- arXiv: arXiv:cmp-lg/9606032
- Bibcode: 1996cmp.lg....6032T
- Keywords: Computer Science - Computation and Language
- E-Print: In Proceedings of ACL96, 8 pages