In this paper, we propose a method for adapting a general parser (Link Parser) to sublanguages, focusing on the parsing of biology texts. Our main proposal is to use terminology (the identification and analysis of terms) to reduce the complexity of the text to be parsed. We also explore several other strategies and finally combine them: text normalization, extensions to the lexicon and to the morpho-guessing module, and adaptation of the grammar rules. We compare the parsing results before and after these adaptations.
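As a rough illustration of how terminology might reduce the complexity of the input, the sketch below replaces each known multi-word term with a single word before the sentence is handed to a parser. It is a minimal sketch under stated assumptions, not the implementation evaluated in the paper: the term list, the choice of the head word as the replacement, and the function name `simplify` are all hypothetical.

```python
import re

# Hypothetical term list mapping each multi-word term to an assumed head word.
# A real system would obtain the terms from a terminology extraction step.
TERMS = {
    "spore coat protein": "protein",
    "sigma K RNA polymerase": "polymerase",
}


def simplify(sentence, terms):
    """Replace each known multi-word term with its head word.

    Longer terms are tried first so that a term nested inside a longer one
    does not pre-empt the longer match. Returns the simplified sentence and
    the list of terms that were replaced, in the order they were tried.
    """
    replaced = []
    for term in sorted(terms, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(term) + r"\b")
        if pattern.search(sentence):
            sentence = pattern.sub(terms[term], sentence)
            replaced.append(term)
    return sentence, replaced


if __name__ == "__main__":
    text = "The spore coat protein is transcribed by sigma K RNA polymerase."
    shorter, found = simplify(text, TERMS)
    print(shorter)
    print(found)
```

The simplified sentence has fewer tokens and fewer noun-noun attachments for the parser to resolve, while the returned list lets the original terms be restored in the parse afterwards.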
- Pub Date: June 2006
- Computer Science - Computation and Language;
- Computer Science - Information Retrieval;
- Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP'05) (2005) 89-93