WikiUMLS: Aligning UMLS to Wikipedia via Cross-lingual Neural Ranking
Abstract
We present our work on aligning the Unified Medical Language System (UMLS) to Wikipedia, with the aim of making manual alignment of the two resources easier. We propose a cross-lingual neural reranking model that matches a UMLS concept with a Wikipedia page. It achieves a recall@1 of 72%, a substantial improvement of 20% over word- and character-level BM25, so manual alignment requires minimal effort. We release our resources, including ranked Wikipedia pages for 700k UMLS concepts, and WikiUMLS, a dataset for training and evaluating alignment models between UMLS and Wikipedia. This will provide easier access to Wikipedia for health professionals, patients, and NLP systems, including in multilingual settings.
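The recall@1 figure quoted in the abstract can be made concrete with a small sketch. The code below is not the authors' evaluation script; it assumes that each concept has exactly one gold Wikipedia page and that a ranker returns an ordered list of candidate pages per concept. The function name `recall_at_k` and all concept and page identifiers are hypothetical placeholders.

```python
from typing import Dict, List


def recall_at_k(ranked: Dict[str, List[str]], gold: Dict[str, str], k: int = 1) -> float:
    """Fraction of concepts whose gold page appears among the top-k ranked pages.

    Assumption (not from the paper): one gold page per concept.
    """
    if not gold:
        raise ValueError("gold must contain at least one concept")
    hits = sum(
        1
        for concept, page in gold.items()
        if page in ranked.get(concept, [])[:k]
    )
    return hits / len(gold)


# Toy, made-up data: identifiers are placeholders, not real UMLS concepts.
ranked_pages = {
    "concept_a": ["Page 1", "Page 2", "Page 3"],
    "concept_b": ["Page 5", "Page 4"],
    "concept_c": ["Page 7"],
}
gold_pages = {
    "concept_a": "Page 1",  # gold page ranked first
    "concept_b": "Page 4",  # gold page ranked second
    "concept_c": "Page 9",  # gold page not retrieved
}

r1 = recall_at_k(ranked_pages, gold_pages, k=1)
r2 = recall_at_k(ranked_pages, gold_pages, k=2)
print(f"recall@1 = {r1:.2f}, recall@2 = {r2:.2f}")
```

Under this definition, a recall@1 of 72% would mean that for 72% of evaluated concepts the top-ranked page is the correct one, so an annotator only needs to confirm the first suggestion in most cases.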
- Publication: arXiv e-prints
- Pub Date: May 2020
- DOI: 10.48550/arXiv.2005.01281
- arXiv: arXiv:2005.01281
- Bibcode: 2020arXiv200501281R
- Keywords: Computer Science - Computation and Language