Massively Multi-Lingual Event Understanding: Extraction, Visualization, and Search
Abstract
In this paper, we present ISI-Clear, a state-of-the-art, cross-lingual, zero-shot event extraction system and accompanying user interface for event visualization & search. Using only English training data, ISI-Clear makes global events available on-demand, processing user-supplied text in 100 languages ranging from Afrikaans to Yiddish. We provide multiple event-centric views of extracted events, including both a graphical representation and a document-level summary. We also integrate existing cross-lingual search algorithms with event extraction capabilities to provide cross-lingual event-centric search, allowing English-speaking users to search over events automatically extracted from a corpus of non-English documents, using either English natural language queries (e.g. cholera outbreaks in Iran) or structured queries (e.g. find all events of type Disease-Outbreak with agent cholera and location Iran).
- Publication:
-
arXiv e-prints
- Pub Date:
- May 2023
- DOI:
- 10.48550/arXiv.2305.10561
- arXiv:
- arXiv:2305.10561
- Bibcode:
- 2023arXiv230510561J
- Keywords:
-
- Computer Science - Computation and Language
- E-Print:
- Accepted for ACL 2023