Using Contextual Information for Sentence-level Morpheme Segmentation
Abstract
Recent work on morpheme segmentation focuses primarily on word-level segmentation and often neglects the context a word appears in within its sentence. In this study, we redefine morpheme segmentation as a sequence-to-sequence problem that takes the entire sentence as input rather than isolated words. Our findings show that the multilingual model consistently outperforms its monolingual counterparts. Although our model did not surpass the current state of the art, it performed comparably on high-resource languages while revealing limitations in low-resource scenarios.
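To make the difference between the two framings concrete, the following is a minimal sketch of how training pairs might be built in each case. It is an illustration only, not the authors' pipeline: the boundary marker `MORPH_SEP`, the helper names, and the toy English sentence are all assumptions, since the abstract does not specify a data format.

```python
from typing import List, Tuple

# Hypothetical morpheme boundary marker; the abstract does not name one.
MORPH_SEP = " @@"


def word_level_pairs(
    words: List[str], segmentations: List[List[str]]
) -> List[Tuple[str, str]]:
    """Word-level framing: one (source, target) pair per word, no context."""
    return [(w, MORPH_SEP.join(m)) for w, m in zip(words, segmentations)]


def sentence_level_pair(
    words: List[str], segmentations: List[List[str]]
) -> Tuple[str, str]:
    """Sentence-level framing: a single pair covering the whole sentence."""
    source = " ".join(words)
    target = " ".join(MORPH_SEP.join(m) for m in segmentations)
    return source, target


# Toy example (invented for illustration, not taken from the paper's data).
words = ["the", "dogs", "barked"]
segmentations = [["the"], ["dog", "s"], ["bark", "ed"]]

print(word_level_pairs(words, segmentations))
print(sentence_level_pair(words, segmentations))
```

In the sentence-level framing, a sequence-to-sequence model sees the neighbouring words when segmenting each token, which is the contextual information the word-level framing discards.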
- Publication: arXiv e-prints
- Pub Date: March 2024
- arXiv: arXiv:2403.15436
- Bibcode: 2024arXiv240315436B
- Keywords: Computer Science - Computation and Language
- E-Print: 6 pages, 3 tables